sepinf-inc / IPED

IPED Digital Forensic Tool. It is open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

Review CertificateParser to support new tika "x-x509-cert" contentType. #1978

Open patrickdalla opened 8 months ago

patrickdalla commented 8 months ago

I have just reviewed the Pkcs7Parser code from Tika.

PKCS7 is a container spec that holds content and its signature info in the same file/stream. Tika's Pkcs7Parser only strips (ignores) the signature and delegates the content parsing to the corresponding parser. It does not parse the signature or the respective certificate information.

PKCS7 is mostly used to store certificate revocation lists and certificate files themselves (when the certificates of the entire certification path are included). The CertificateParser uses java.security.cert.CertificateFactory, which can extract the certificates contained in these PKCS7-formatted files.
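
As a rough illustration of the CertificateFactory behavior described above, here is a minimal sketch. The class and method names are hypothetical and this is not the actual CertificateParser code. It only relies on the documented behavior of the JDK's "X.509" factory, whose generateCertificates method accepts DER or PEM encoded certificates as well as a PKCS#7 certificate chain.

```java
import java.io.InputStream;
import java.security.cert.Certificate;
import java.security.cert.CertificateException;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not IPED's CertificateParser.
class CertificateExtractionSketch {

    // Reads every X.509 certificate the factory finds in the stream.
    // The stream may hold DER/PEM certificates or a PKCS#7 certificate chain.
    static List<X509Certificate> extractCertificates(InputStream in)
            throws CertificateException {
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        List<X509Certificate> result = new ArrayList<>();
        for (Certificate cert : factory.generateCertificates(in)) {
            // Keep only X.509 entries so callers get a typed list.
            if (cert instanceof X509Certificate) {
                result.add((X509Certificate) cert);
            }
        }
        return result;
    }
}
```

A caller could open a .p7b, .pem, or .der file as an InputStream and pass it to extractCertificates; each returned X509Certificate then exposes fields such as getSubjectX500Principal() and getNotAfter().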

PKCS7 is not the format of the certificate used to sign the APK.

Judging from https://issues.apache.org/jira/browse/TIKA-3205, whose code was written after CertificateParser was implemented, Tika previously did not classify PEM and DER files as "x-x509-ca-cert", but now it does.

In CertificateParser I created the "application/x-pem-file" and "application/pkix-cert" MIME types to identify this kind of content, but it seems the parser can now use the "application/x-x509-ca-cert" type identified by Tika.
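
One possible way to handle the transition, sketched under assumptions: the class and method below are hypothetical and use plain strings rather than IPED's or Tika's real types. The idea of accepting the two existing MIME types alongside the Tika one during the review is also an assumption, not something decided in this thread.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not IPED's or Tika's API.
class CertificateMimeTypesSketch {

    // MIME type strings quoted in this thread: the two created in
    // CertificateParser plus the one now identified by Tika.
    static final Set<String> CERTIFICATE_TYPES = Set.of(
            "application/x-pem-file",
            "application/pkix-cert",
            "application/x-x509-ca-cert");

    // Returns true if the content type, ignoring case and any parameters
    // after ';', is one of the certificate types listed above.
    static boolean isCertificateType(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semicolon = contentType.indexOf(';');
        String base = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        return CERTIFICATE_TYPES.contains(base.trim().toLowerCase(Locale.ROOT));
    }
}
```

Once the Tika-identified type is confirmed to cover the PEM and DER samples, the two older entries could be dropped from the set.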

lfcnassif commented 8 months ago

Thanks @patrickdalla. I'll try to crawl certificate samples to test CertificateParser, so we can enable it by default if everything seems good.