arachne-threat-intel / thread

Thread is a tool for analysts to map finished reports and articles to MITRE ATT&CK®.
https://arachne.digital/thread
Apache License 2.0

Use sentence-data in database to supplement existing ML model training data #87

Open jecarr opened 5 months ago

jecarr commented 5 months ago

When a report is submitted, we either retrieve the existing ML models as they are, or build, save, and then retrieve them.
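As a rough illustration of that retrieve-or-build flow, here is a minimal sketch. It assumes the models are stored as a single pickle file; the function name `load_or_build_models`, the `build` callable, and the file layout are hypothetical and are not taken from the Thread codebase.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_or_build_models(path: Path, build: Callable[[], Any]) -> Any:
    """Return saved models, building and saving them first if none exist.

    Hypothetical helper: `path` and `build` are illustrative assumptions,
    not Thread's actual storage location or training routine.
    """
    if not path.exists():
        # No saved models yet: build them and persist to disk.
        models = build()
        with path.open("wb") as fh:
            pickle.dump(models, fh)
    # In both cases, what is returned is what was read back from disk.
    with path.open("rb") as fh:
        return pickle.load(fh)
```

The point for this issue is that the `build` step is where supplemented training data would have an effect; models that were already saved would not pick up new data unless they are rebuilt.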

Our training data is currently what was provided by the initial TRAM repo.

It would be good to use the true positives, false negatives, and false positives we have in the database as additional training data.

Only add these to the training data if they are associated with sentences from completed reports.

If #88 has been completed, please also exclude sentences with non-confident mappings from the training data.
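The selection rules above could be sketched roughly as follows. This is only an illustration: the `SentenceRecord` fields, the `"completed"` status value, and the `confident` flag (standing in for whatever #88 introduces) are assumptions, not the real database schema. It also assumes that true positives and false negatives count as positive examples for a technique, and false positives as negative examples.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SentenceRecord:
    """Hypothetical view of one sentence/technique row from the database."""
    text: str
    technique_id: str
    outcome: str          # "true_positive", "false_negative" or "false_positive"
    report_status: str    # status of the report the sentence belongs to
    confident: bool = True  # placeholder for the confidence notion from #88


# Assumption: TP and FN both mean the sentence really maps to the technique.
POSITIVE_OUTCOMES = {"true_positive", "false_negative"}
# Assumption: FP means the sentence does not map to the technique.
NEGATIVE_OUTCOMES = {"false_positive"}


def select_training_rows(
    records: Iterable[SentenceRecord],
    completed_status: str = "completed",
    require_confident: bool = True,
) -> List[Tuple[str, str, bool]]:
    """Return (sentence text, technique id, label) rows usable for training."""
    rows = []
    for rec in records:
        # Only use sentences from completed reports.
        if rec.report_status != completed_status:
            continue
        # Skip non-confident mappings (only relevant once #88 is done).
        if require_confident and not rec.confident:
            continue
        if rec.outcome in POSITIVE_OUTCOMES:
            rows.append((rec.text, rec.technique_id, True))
        elif rec.outcome in NEGATIVE_OUTCOMES:
            rows.append((rec.text, rec.technique_id, False))
    return rows
```

The resulting rows would then be appended to the TRAM-provided data before the models are built. Setting `require_confident=False` keeps the sketch usable while #88 is still open.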