Coalition ESS - GitHub Issues

chrisdlangton commented 5 months ago

Like EPSS, Coalition ESS collects exploit intelligence and scores CVEs

Their site seems to have graphed all CVEs and it's searchable

I find its results less of a black box than EPSS, because their values are theoretically reproducible: they've explained their process and algorithms, and detailed how anyone else might produce their own ESS

However, I am fully aware of the EPSS hype and realise ESS interest is almost nonexistent. Because hype always trumps quality, I have low expectations of seeing ESS covered in the guide or notebooks

Crashedmind commented 5 months ago

Thanks @chrisdlangton for the info. I was not aware of Coalition ESS - I see it was launched in June 2023: https://www.coalitioninc.com/blog/announcing-coalition-exploit-scoring-system-ess

The open questions that come to my mind (based on 10 minutes of recon, starting from zero knowledge):

Is there a research/scientific paper that validates the solution?
Is there any validation against ground truth i.e. exploits observed on the ground? From below, no, not yet.
As someone who is in the middle of using "artificial intelligence and large language modeling to scan the descriptions used" in CVEs (for a conference I'll be presenting at in May), I can say that the data is not very consistent or useful as an indicator of exploitability.

But, the data is available on https://ess.coalitioninc.com/explore/ so maybe I or someone else will do some analysis in the future.
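For anyone picking this up, a minimal sketch of what "validation against ground truth" could look like: pick a score threshold, treat everything at or above it as a predicted exploit, and compare against observed exploitation. The record shape, the `precision_recall` helper, and the sample rows are all hypothetical placeholders, not the ESS export format or real exploitation evidence.

```python
from dataclasses import dataclass


@dataclass
class ScoredCve:
    cve_id: str
    score: float      # predicted probability of exploitation, 0.0 to 1.0
    exploited: bool   # ground truth: was exploitation actually observed?


def precision_recall(records, threshold):
    """Treat score >= threshold as a positive prediction and compare to ground truth."""
    tp = sum(1 for r in records if r.score >= threshold and r.exploited)
    fp = sum(1 for r in records if r.score >= threshold and not r.exploited)
    fn = sum(1 for r in records if r.score < threshold and r.exploited)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up placeholder rows, not real ESS scores or real exploitation data.
SAMPLE = [
    ScoredCve("CVE-0000-0001", 0.90, True),
    ScoredCve("CVE-0000-0002", 0.80, False),
    ScoredCve("CVE-0000-0003", 0.40, True),
    ScoredCve("CVE-0000-0004", 0.10, False),
]

if __name__ == "__main__":
    for t in (0.3, 0.5):
        p, r = precision_recall(SAMPLE, t)
        print(f"threshold={t}: precision={p:.2f} recall={r:.2f}")
```

Sweeping the threshold shows the usual trade-off between catching more exploited CVEs and flagging more that never get exploited. A real analysis would need an agreed source of observed exploitation to fill in the `exploited` column.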

Doing a quick 10-minute recon...

https://www.coalitioninc.com/announcements/coalition-releases-security-vulnerability-exploit-scoring-system

Coalition ESS leverages artificial intelligence and large language modeling to scan the descriptions used within newly released CVEs (Common Vulnerabilities and Exposures) and compares them to previously published vulnerabilities to predict the likelihood of exploitability. The result is two probability scores: the Exploit Availability Probability, or the likelihood that code for an exploit will be publicly available, and the Exploit Usage Probability, or the likelihood that threat actors will use an exploit to execute an attack. These scores combined give security managers and IT professionals a prioritization list outlining which vulnerabilities pose the greatest threat, saving time and resources in an otherwise arduous decision-making process.
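To make the "prioritization list" idea concrete, here is a rough sketch of ranking CVEs by the two probabilities. The announcement does not say how the scores are combined, so the ordering below (usage probability first, availability as tie-breaker) is purely an assumption for illustration, and the `EssScores` type, `prioritize` function, and sample values are hypothetical.

```python
from typing import NamedTuple


class EssScores(NamedTuple):
    cve_id: str
    exploit_availability: float  # probability that exploit code becomes public
    exploit_usage: float         # probability that attackers use an exploit


def prioritize(scores):
    """Sort highest-risk first: usage probability, then availability as tie-breaker.

    This ordering is an assumption for illustration, not Coalition's method.
    """
    return sorted(
        scores,
        key=lambda s: (s.exploit_usage, s.exploit_availability),
        reverse=True,
    )


# Made-up placeholder values, not real ESS output.
SAMPLE = [
    EssScores("CVE-0000-0001", exploit_availability=0.20, exploit_usage=0.05),
    EssScores("CVE-0000-0002", exploit_availability=0.90, exploit_usage=0.60),
    EssScores("CVE-0000-0003", exploit_availability=0.70, exploit_usage=0.60),
]

if __name__ == "__main__":
    for item in prioritize(SAMPLE):
        print(item.cve_id, item.exploit_usage, item.exploit_availability)
```

Other combinations (for example a weighted sum, or filtering on one score before sorting on the other) would be just as defensible; which one works best is something only validation against observed exploitation could settle.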

https://www.techtarget.com/searchsecurity/news/366570543/Coalition-Vulnerability-scoring-systems-falling-short

As a result, Coalition said, combining honeypot data with automated vulnerability prioritization is "an exciting prospect." To that end, Coalition's long-term goal is to combine honeypot traffic with machine learning to assign weights to specific activity. The honeypot data would be integrated into Coalition's own ESS model, which was announced last year and compares descriptions of newly published CVEs to previously published vulnerabilities to predict the likelihood of exploitability.

Crashedmind commented 5 months ago

Looking at specific CVEs out of curiosity to understand more (in a separate limited 10 minute effort) ...
I'm guessing that the "artificial intelligence and large language modeling to scan the descriptions used within newly released CVEs" is Named Entity Recognition. I've just done NER on all 230K CVEs so have a good understanding of that - and the issues that go with that.

This is just an observation based on the data (sample of 2) below. Happy to see data that says otherwise.

From the sample I've seen (2 below), the Coalition ESS BETA solution is not picking up the important words e.g. I would say "may affect service confidentiality" is important - not "service".

https://ess.coalitioninc.com/cve/?id=CVE-2023-52716

https://ess.coalitioninc.com/cve/?id=CVE-2023-52382

Crashedmind commented 5 months ago

...and taking the current hot one... https://ess.coalitioninc.com/cve/?id=CVE-2024-3094 I would say "intercepting and modifying the data interaction with this library." is important.

But...

"Different, and especially opposite, opinions with the data to back them up, are especially welcome! " https://riskbasedprioritization.github.io/introduction/Introduction/?h=especially+opposite%2C+opinions+data+back+up%2C#writing-style

Crashedmind commented 5 months ago

FYI @chrisdlangton, I had a Slack chat earlier today with TiagoH (Coalition) about an analysis of Coalition data and its inclusion in this guide. He informed me of their roadmap and planned research paper, and will reach out when ready.

So, I'll close this ticket for now - and create a new one for this analysis and inclusion when it's ready to start.

Thanks for bringing this to my attention!

FWIW, I'm not big on hype. I'm big on analysis and validation and understanding of data.

chrisdlangton commented 5 months ago

Wow, amazing response. I'll stay tuned. If timing works out I'll come here to contribute too, but worklyf and dadlyf.. This guide is pretty awesome, I've shared it around and it's getting a great reception

cmlh commented 5 months ago

☝️ isn't open source intelligence, which is the same issue as with Palo Alto et al. used by EPSS.

RiskBasedPrioritization / RiskBasedPrioritization.github.io

Coalition ESS #35