OpenCTI-Platform / opencti

Open Cyber Threat Intelligence Platform
https://opencti.io

Provide option "do not extract" for entity aliases #4703

Open tintinkuifje opened 10 months ago

tintinkuifje commented 10 months ago

Use case

The document parser extracts entities based on the entity name and its aliases. However, many aliases are very generic and not suitable for extraction. For example, some malware has the alias "Agent", and industry sectors have aliases such as "research", "software", ... Almost all reports contain words like these, so the parser creates a lot of entities that are unrelated to the report. It would be nice if we could keep the aliases but disable some of them for extraction.

Current Workaround

Deleting the aliases, but this removes information.

Proposed Solution

Provide a property on the alias to indicate that it shouldn't be used for entity extraction.
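
To make the proposal concrete, here is a minimal sketch of what a per-alias flag could look like. It is not OpenCTI's data model or parser: the `Alias` and `Entity` classes, the `extractable` field, and the `extract` function are all hypothetical names invented for illustration, and the matching is a simple case-insensitive whole-word search.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alias:
    value: str
    # Hypothetical flag: False keeps the alias on the entity
    # but hides it from document extraction.
    extractable: bool = True


@dataclass
class Entity:
    name: str
    aliases: List[Alias] = field(default_factory=list)

    def extraction_terms(self) -> List[str]:
        """Name plus only those aliases that are allowed for extraction."""
        return [self.name] + [a.value for a in self.aliases if a.extractable]


def extract(text: str, entities: List[Entity]) -> List[str]:
    """Return names of entities whose name or extractable alias appears in text."""
    found = []
    for entity in entities:
        for term in entity.extraction_terms():
            # Case-insensitive whole-word match (illustrative only).
            if re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE):
                found.append(entity.name)
                break
    return found


# Hypothetical entity: the generic alias "Agent" is kept but not extracted.
malware = Entity(
    "ExampleMalware",
    [Alias("Agent", extractable=False), Alias("ExMal")],
)
print(extract("The agent contacted a server.", [malware]))
print(extract("ExMal was seen in the wild.", [malware]))
```

With this shape the alias stays attached to the entity, so no information is lost, and only the extraction step changes.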

If the feature request is approved, would you be willing to submit a PR?

Not at the moment

Jipegien commented 6 months ago

To study in the context of alias enhancement.

Another possible approach: a list of terms to avoid in the connector's configuration.
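
A rough sketch of that second approach, assuming the connector receives the terms to avoid as a single comma-separated string. The function names `parse_ignored_terms` and `filter_terms` and the string format are assumptions made for illustration; no existing connector option is implied.

```python
from typing import FrozenSet, Iterable, List


def parse_ignored_terms(raw: str) -> FrozenSet[str]:
    """Parse a comma-separated setting into a case-insensitive set of terms."""
    return frozenset(t.strip().casefold() for t in raw.split(",") if t.strip())


def filter_terms(terms: Iterable[str], ignored: FrozenSet[str]) -> List[str]:
    """Drop candidate names/aliases that appear in the ignore list."""
    return [t for t in terms if t.casefold() not in ignored]


# Hypothetical configuration value, e.g. read from the connector's settings.
ignored = parse_ignored_terms("agent, Research ,software,")

# Candidate extraction terms for some entities (illustrative values).
candidates = ["Agent", "ExMal", "Software", "Example Sector"]
print(filter_terms(candidates, ignored))
```

Compared with a per-alias property, a list like this is global: it is simpler to set up, but it cannot allow a term for one entity while blocking it for another.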