CybercentreCanada / assemblyline

AssemblyLine 4: File triage and malware analysis
https://cybercentrecanada.github.io/assemblyline4_docs/
MIT License

[UI] Sort by domain and grouping domain and URI tags #261

Open kam193 opened 2 months ago

kam193 commented 2 months ago

Is your feature request related to a problem? Please describe.

A submission may contain a lot of URIs and domains, sometimes even emails. In addition, many new gTLDs make file names and similar strings look like domains. This leads to situations where a submission (or a file) has hundreds of extracted tags. In practice, they often form natural groups, such as subdomains or URLs of a single domain.

Examples:

Screenshot from 2024-09-12 11-28-51 Screenshot from 2024-09-12 11-30-56 Screenshot from 2024-09-12 12-17-48

Domains and URIs seem to be sorted alphabetically (was it always so? I feel like it's a new thing, but maybe I just missed it!), which is not exactly natural for domains. Even when the number of items isn't huge, it's difficult to spot related domains and subdomains:

Screenshot from 2024-09-12 12-24-03

Describe the solution you'd like

I suggest two improvements in the UI:

  1. sorting domains and URIs by reversed domain:
    • e.g. for one.two.example.com, the sorting key would be com.example.two.one - this would keep domains and subdomains together. I think it's much more natural when looking at domains
    • the same would be good for URIs, although:
      • the IP addresses should not be reversed;
      • the path should not be reversed
      • I have no idea what to do about the scheme - maybe sort by it after the domain? So https://example.com would be right after http://example.com?
  2. Once the number of domain/URI/etc. tags exceeds a given threshold (like 10? 20?), start grouping them and give the user a button showing the number of items in the group, with an option to show/hide the group.
    • For domains, we could group them by TLD or first-level domain.
    • For URIs, I'd suggest only grouping by the full domain (all paths and schemes would be in a group).
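
As a rough illustration of the sorting idea in point 1, here is a minimal Python sketch. It is not taken from the AssemblyLine code base, and the function names (`is_ip`, `reversed_host`, `domain_sort_key`, `uri_sort_key`) are made up for this example. It assumes the key is a tuple of labels rather than a joined string, so that example-foo.com does not end up between example.com and its subdomains, and it places the scheme right after the host, which is only one possible answer to the open question above.

```python
import ipaddress
from urllib.parse import urlsplit


def is_ip(host: str) -> bool:
    """Return True if host is an IPv4 or IPv6 literal."""
    try:
        ipaddress.ip_address(host.strip("[]"))
        return True
    except ValueError:
        return False


def reversed_host(host: str) -> tuple:
    """Split a host into labels and reverse them; IP addresses stay as-is."""
    host = host.lower().rstrip(".")
    if is_ip(host):
        return (host,)
    return tuple(reversed(host.split(".")))


def domain_sort_key(domain: str) -> tuple:
    """Sort key for a domain tag, e.g. ('com', 'example', 'two', 'one')."""
    return reversed_host(domain)


def uri_sort_key(uri: str) -> tuple:
    """Sort key for a URI tag: reversed host, then scheme, then path and query.

    Only the host is reversed; the path is compared unchanged.
    """
    parts = urlsplit(uri)
    host = parts.hostname or ""
    return (reversed_host(host), parts.scheme, parts.path, parts.query)


domains = [
    "one.two.example.com",
    "example.org",
    "example.com",
    "mail.example.org",
    "two.example.com",
    "example-foo.com",
]
sorted_domains = sorted(domains, key=domain_sort_key)

uris = [
    "https://example.com/b",
    "http://10.0.0.1/x",
    "http://example.com/a",
    "https://example.com/a",
    "http://sub.example.com/",
]
sorted_uris = sorted(uris, key=uri_sort_key)
```

With this key, example.com is followed directly by two.example.com and one.two.example.com, and the http and https variants of the same URL end up next to each other.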

In any case, I would keep the current separation between malicious / suspicious / informational / safelisted tags, and sort and group each of them separately.
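
The grouping, applied separately per verdict, could be sketched like this in Python. Everything here is hypothetical: the names, the `GROUP_THRESHOLD` value, and especially `naive_base_domain`, which simply takes the last two labels and is therefore wrong for suffixes such as co.uk. A real implementation would probably need a public suffix list.

```python
from collections import defaultdict
from urllib.parse import urlsplit

GROUP_THRESHOLD = 10  # hypothetical default; the proposal floats 10 or 20


def naive_base_domain(domain: str) -> str:
    """Grouping key for domains: the last two labels (naive, see note above)."""
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def uri_host(uri: str) -> str:
    """Grouping key for URIs: the full host, ignoring scheme and path."""
    return urlsplit(uri).hostname or ""


def group_tags(values, group_of, threshold=GROUP_THRESHOLD):
    """Group values by group_of(value) once their count exceeds the threshold.

    Returns None when there are few enough values to show them as a flat list.
    """
    if len(values) <= threshold:
        return None
    groups = defaultdict(list)
    for value in values:
        groups[group_of(value)].append(value)
    return dict(groups)


def group_by_verdict(tags_by_verdict, group_of, threshold=GROUP_THRESHOLD):
    """Apply group_tags to each verdict bucket independently."""
    return {
        verdict: group_tags(values, group_of, threshold)
        for verdict, values in tags_by_verdict.items()
    }


def button_label(group: str, members) -> str:
    """Text for a collapsed group, e.g. 'evil.com + 3'."""
    return f"{group} + {len(members)}"


tags = {
    "malicious": ["a.evil.com", "b.evil.com", "evil.com"],
    "informational": ["example.org"],
}
# A low threshold is used here only to keep the example small.
grouped = group_by_verdict(tags, naive_base_domain, threshold=2)
```

In this example only the malicious bucket exceeds the threshold, so it collapses into a single evil.com group while the informational bucket stays flat.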

I think it would help to work with files generating lots of networking tags.

Describe alternatives you've considered

I extensively use the system safelist (great feature! I'd also love the safelist client to use it as well) to kind of "sort out" recurring cases, but it does not solve the general issue.

Additional context

I'm not a designer, but I imagine something like a "+ N" button at the end of the box with the domain name would be a nice way to indicate a group. But again, I'm not a designer :sweat_smile: