crocs-muni / sec-certs

Tool for analysis of security certificates and their security targets (Common Criteria, NIST FIPS140-2...).
https://sec-certs.org
MIT License

Improve cert_id heuristics #250

Closed J08nY closed 1 year ago

J08nY commented 2 years ago

Currently, we have a bunch of cert_id duplicates and a bunch of certificates without a cert_id matched.

Duplicates

Current list: https://gist.github.com/J08nY/d03714234198d16a41aa931e956ee647

MongoDB command: db.cc.aggregate([{$group: {_id: "$heuristics.cert_id", count: {$sum: 1}}}, {$sort: {count: -1}}])
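For anyone without a MongoDB instance at hand, the grouping done by the aggregation above can be approximated in plain Python. This is only a sketch: the `records` list and its certificate IDs are made-up sample data, and `count_cert_ids` is a hypothetical helper, not part of sec-certs.

```python
from collections import Counter


def count_cert_ids(records):
    """Count how often each heuristics.cert_id occurs, most frequent first."""
    counts = Counter(
        (rec.get("heuristics") or {}).get("cert_id") for rec in records
    )
    return counts.most_common()


# Made-up sample documents shaped like entries of the `cc` collection (assumption).
records = [
    {"_id": "a", "heuristics": {"cert_id": "BSI-DSZ-CC-0001-2020"}},
    {"_id": "b", "heuristics": {"cert_id": "BSI-DSZ-CC-0001-2020"}},
    {"_id": "c", "heuristics": {"cert_id": "ANSSI-CC-2019/01"}},
    {"_id": "d", "heuristics": {"cert_id": None}},
]

# Keep only real IDs that occur more than once, i.e. the duplicates.
duplicates = [
    (cert_id, n)
    for cert_id, n in count_cert_ids(records)
    if cert_id is not None and n > 1
]
print(duplicates)
```

Records with no cert_id end up grouped under `None`, mirroring how the `$group` stage collects them under a `null` key, so they are filtered out before reporting duplicates.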

Without matches

Current list: https://gist.github.com/J08nY/9162e89a60f7381a381d1c1c4e1341c4

MongoDB command: db.cc.find({"heuristics.cert_id": null}, {_id: 1})
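The same filter can be sketched in plain Python for offline inspection. In MongoDB, a `{field: null}` query matches documents where the field is either `null` or missing, and the sketch below follows that behaviour. The sample `records` and the helper name `ids_without_cert_id` are assumptions made for illustration only.

```python
def ids_without_cert_id(records):
    """Return the _id of every record whose heuristics.cert_id is None or absent."""
    return [
        rec["_id"]
        for rec in records
        if (rec.get("heuristics") or {}).get("cert_id") is None
    ]


# Made-up sample documents (assumption), covering the null and missing cases.
records = [
    {"_id": "a", "heuristics": {"cert_id": "BSI-DSZ-CC-0001-2020"}},
    {"_id": "b", "heuristics": {"cert_id": None}},
    {"_id": "c", "heuristics": {}},
    {"_id": "d"},
]

unmatched = ids_without_cert_id(records)
print(unmatched)
```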

adamjanovsky commented 2 years ago

Can we search for the cert ID in the filenames of the PDF documents? I think we change their filenames, but we should keep track of the original ones. This was done previously by petrs. I guess he was doing it for a reason; maybe it could identify some missing cert IDs. Besides, in the methodology we claim that we do this :D
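A minimal sketch of what such a filename search could look like, assuming the original filename is still available as a string. The two regular expressions are illustrative guesses at BSI-style and ANSSI-style IDs, not the patterns sec-certs actually uses, and `cert_id_from_filename` is a hypothetical helper.

```python
import re

# Illustrative patterns only (assumption); real schemes have many more variants.
CERT_ID_PATTERNS = [
    re.compile(r"BSI-DSZ-CC-\d{4}(?:-V\d+)?-\d{4}", re.IGNORECASE),
    re.compile(r"ANSSI-CC-\d{4}[-_/]\d{2,3}", re.IGNORECASE),
]


def cert_id_from_filename(filename):
    """Return the first cert-ID-like substring of the filename, or None."""
    for pattern in CERT_ID_PATTERNS:
        match = pattern.search(filename)
        if match:
            return match.group(0)
    return None


print(cert_id_from_filename("BSI-DSZ-CC-1234-2021_report.pdf"))
print(cert_id_from_filename("st_vid10000.pdf"))
```

A candidate found this way would still need normalizing (case, separators) and cross-checking against the ID extracted from the document text before being trusted.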

J08nY commented 1 year ago

PR #258 did most of the work here (the part that made sense), check it out for more details.