mitre-attack / tram

Threat Report ATT&CK™ Mapping (TRAM) is a tool to aid analysts in mapping finished reports to ATT&CK.
Apache License 2.0

Highlighting does not appear #71

Closed ghost closed 2 years ago

ghost commented 3 years ago

I tried multiple Chrome versions and other browsers, but whenever I click on a sentence that was not previously highlighted and add a new technique to it, the highlighting does not appear.

jecarr commented 3 years ago

Some minor issues that are also related:

  1. Sentences with the same first three words as another are not selectable
  2. Some individual sentences are not selectable; they are grouped with other sentences

At time of writing, I'm using this url as an example.

Issue 1: The article has these four sentences:

  A: "According to Microsoft, Fancy Bear has been ramping..."
  B: "According to Microsoft, the new round of..."
  C: ""The activity we... anticipated," Microsoft's blog post reads."
  D: "Microsoft's blog post also details politically... "

Sentences B and D could not be highlighted; trying to select them highlights A and C instead.
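The thread does not show the matching code, so the following is only a hypothetical illustration of how this symptom could arise: if the clicked sentence is located by searching the page for its first three words, the search stops at the first sentence containing those words. The function names and the placeholder sentences are assumptions made for this sketch, not TRAM's code or the article's text.

```python
def first_three_words(sentence: str) -> str:
    """Return the first three whitespace-separated words of a sentence."""
    return " ".join(sentence.split()[:3])


def find_by_first_words(sentences, clicked):
    """Hypothetical lookup: return the FIRST sentence containing the key."""
    key = first_three_words(clicked)
    for candidate in sentences:
        if key in candidate:
            return candidate
    return None


# Placeholder sentences that only mimic the openings of A-D above.
A = "According to Microsoft, placeholder sentence one."
B = "According to Microsoft, placeholder sentence two."
C = '"Placeholder quote," Microsoft\'s blog post reads.'
D = "Microsoft's blog post also has a placeholder ending."
sentences = [A, B, C, D]

# Selecting B resolves to A, and selecting D resolves to C.
print(find_by_first_words(sentences, B) == A)
print(find_by_first_words(sentences, D) == C)
```

Under this assumption, matching on the full sentence text or on a unique per-sentence id would avoid the collision.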

Issue 2: There are sentences that end in a quote, e.g.

They note that the hackers have attempted to target the Biden campaign—apparently without success—as well as "one individual formerly associated with the Trump administration." APT31 has also hit more run-of-the-mill espionage targets, including academics at 15 universities and staff accounts at 18 think tanks, including the Atlantic Council and the Stimson Center.

Sentences are split on ". " (a period followed by a space), so any sentence whose final period is followed by something else, such as the closing quote above, cannot be selected individually.
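A minimal sketch of that behaviour, assuming the split is equivalent to Python's `str.split(". ")` (the helper name `naive_split` is made up for this example, and the text is a shortened piece of the passage quoted above):

```python
def naive_split(text: str):
    """Split on a period followed by a space, as described in the issue."""
    return text.split(". ")


# Shortened from the quoted passage: the first sentence ends in ." not ". "
text = (
    'as well as "one individual formerly associated with the Trump '
    'administration." APT31 has also hit more run-of-the-mill espionage targets.'
)

parts = naive_split(text)
print(len(parts))  # the two sentences stay glued together as one chunk

# For contrast, plain sentences are separated (the delimiter is consumed).
print(naive_split("One. Two."))
```

Because the quote sits between the period and the space, the delimiter never matches and both sentences come back as a single chunk.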

The original highlighting issue and both of these minor issues are fixed in the linked PR.


Edit - Edge cases found for the above minor issues. For example, this url has some closing speech marks (different from ", ') which are not detected in sentence splitting.

(This can be tweaked in my old approach in jecarr@5666cea where OPTIONAL_SENTENCE_DELIMITERS just needs to be expanded to include these. However this approach was dropped due to the next issue.)
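As a rough sketch of that idea (not the code in the referenced commit): a sentence boundary can be treated as end punctuation, optionally followed by one closing quote character, then whitespace. The contents given to `OPTIONAL_SENTENCE_DELIMITERS` here, and the helper `split_sentences`, are assumptions for illustration.

```python
import re

# Assumed contents: straight quotes plus the typographic closing quotes.
OPTIONAL_SENTENCE_DELIMITERS = "\"'\u201d\u2019"

_quote_class = "[" + re.escape(OPTIONAL_SENTENCE_DELIMITERS) + "]"

# Split on whitespace that follows either end punctuation, or end
# punctuation plus one optional closing delimiter. Lookbehinds keep the
# punctuation and the quote attached to the sentence they close.
_BOUNDARY = re.compile(
    r"(?<=[.!?])\s+|(?<=[.!?]" + _quote_class + r")\s+"
)


def split_sentences(text: str):
    """Split text into sentences, tolerating a closing quote after the period."""
    return [s for s in _BOUNDARY.split(text) if s]


print(split_sentences('He said "stop." Then he left. Done.'))
print(split_sentences("She said \u201cgo.\u201d He went."))
```

Supporting another closing mark only requires adding it to the delimiter string. A splitter like this still breaks after any period, which is the abbreviation problem described next.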

Sentences like "MR. HENRY ..." are being split at the 'Mr.' part. Abbreviations were added in jecarr@df378cc to overcome this, which fixes some sentences. Where the issue still persists, it is because the tokenizer weighs many other rules when splitting sentences. This can therefore be fixed by further training the tokenizer (see the Punkt Sentence Tokenizer documentation).
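To illustrate only the abbreviation part, here is a simplified stand-in that is not Punkt and not the code in the commit: a token ending in a period is not treated as a sentence end when the word before the period is in a known abbreviation list. The `ABBREVIATIONS` set and the function name are hypothetical.

```python
# Hypothetical abbreviation list, compared case-insensitively without the dot.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr"}


def split_with_abbreviations(text: str):
    """Split on ., ! or ? at the end of a token, skipping known abbreviations."""
    sentences = []
    current = []
    for token in text.split():
        current.append(token)
        if token.endswith((".", "!", "?")):
            word = token[:-1].lower()
            if word in ABBREVIATIONS:
                continue  # "MR." is an abbreviation, not a sentence end
            sentences.append(" ".join(current))
            current = []
    if current:  # keep trailing text that has no end punctuation
        sentences.append(" ".join(current))
    return sentences


print(split_with_abbreviations("MR. HENRY spoke first. Then he left."))
```

A fixed list only covers the abbreviations someone thought to add. Punkt instead learns abbreviations and other cues from text, which is why further training is suggested for the remaining cases.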

MarkDavidson commented 2 years ago

Hello @timoliciouz and thank you for the bug report. TRAM has moved to https://github.com/center-for-threat-informed-defense/tram and this issue is no longer present in that repository so I am closing this issue. Thank you!