center-for-threat-informed-defense / sightings_ecosystem

Sightings Ecosystem gives cyber defenders visibility into what adversaries actually do in the wild. With your help, we are tracking MITRE ATT&CK® techniques observed to give defenders real data on technique prevalence.
https://ctid.io/sightings-ecosystem
Apache License 2.0
33 stars 8 forks

Is contextual data publicly available? #16

Open samwagg opened 4 months ago

samwagg commented 4 months ago

I noticed that the data submission schema contains a lot of contextual information that is not present in the CSV file, which only contains a list of techniques plus a date. I know that it was noted somewhere that the public dataset is anonymized, but is there any way to access at least some of the contextual data, such as platform and software_name?

I'm also curious as to whether your analysis is available in a more machine friendly format.

Thanks so much for this great resource!

mticmtic commented 4 months ago

Hi @samwagg, based on the agreement we made with our data contributors, we are not providing the complete data set, only technique IDs (TIDs) and dates.

And the data is hosted as a CSV, which we have found to be very machine friendly. If CSV doesn't work for you, there are CSV to JSON converters online that you can use. This one seems like it would do the trick: https://csvjson.com/.
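
If you would rather not rely on an online converter, Python's standard library can do the same conversion locally. This is only a minimal sketch, assuming the CSV has a header row; the column names `technique_id` and `date` in the sample below are placeholders, not necessarily the headers used in the published file.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)


# Hypothetical sample rows; the real file's column names may differ.
sample = "technique_id,date\nT1059,2021-01-01\nT1027,2021-01-02\n"
print(csv_to_json(sample))
```

To convert the real file, read it with `open(path, newline="")` and pass its contents to `csv_to_json`.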

samwagg commented 4 months ago

@mticmtic Thanks for the quick reply! CSV is totally fine. But I mean the analytical data presented on the website, such as sightings by industry and sightings by sector.

mticmtic commented 4 months ago

@samwagg Oh, I see. The more detailed breakdowns presented on the website (industry, region, etc.) are not publicly available. That was part of our agreement with our data contributors.

samwagg commented 4 months ago

@mticmtic Gotcha. Thank you for your patience. I just want to make sure it's totally clear that what I'm interested in is the aggregated analytical data that you already present on your website, just in a machine-readable format. Not the raw data. For example, this is a great visualization, and it would be awesome to have the exact percentages it represents too.