abrom / henkei

Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)
http://github.com/abrom/henkei
MIT License

log4j vulnerability #22

Closed gsar closed 2 years ago

gsar commented 2 years ago

It would seem this gem bundles a version of log4j 1.x. Though it is not directly implicated in the recent log4j v2 vulnerability, Tika 1.26 has other vulnerabilities according to the Tika security page here: https://tika.apache.org/security.html

Is there any plan to upgrade henkei to log4j 2.16 or later?

abrom commented 2 years ago

Although Tika does bundle some log4j components, to my knowledge it is not vulnerable to CVE-2021-44228 AKA "log4shell". It does not package the JndiLookup.class which contains the vulnerability:

$ unzip -l jar/tika-app-1.27.jar | grep -i jndi
     2397  06-07-2021 07:05   org/bouncycastle/cert/dane/fetcher/JndiDANEFetcherFactory$1.class
     2837  06-07-2021 07:05   org/bouncycastle/cert/dane/fetcher/JndiDANEFetcherFactory.class
     2272  11-08-2012 08:15   net/sf/ehcache/transaction/manager/selector/JndiSelector.class
      507  11-08-2012 08:15   net/sf/ehcache/transaction/manager/selector/GenericJndiSelector.class
     3702  10-23-2019 12:27   org/quartz/utils/JNDIConnectionProvider.class
     3924  09-07-2017 19:51   com/zaxxer/hikari/HikariJNDIFactory.class
    11300  12-11-2019 22:18   com/mchange/v2/c3p0/JndiRefConnectionPoolDataSource.class
     1390  12-11-2019 22:18   com/mchange/v2/c3p0/JndiRefForwardingDataSource$1.class
      902  12-11-2019 22:18   com/mchange/v2/c3p0/JndiRefForwardingDataSource$2.class
     5670  12-11-2019 22:18   com/mchange/v2/c3p0/JndiRefForwardingDataSource.class
     9340  12-11-2019 22:18   com/mchange/v2/c3p0/impl/JndiRefDataSourceBase.class
     2667  12-11-2019 22:18   com/mchange/v2/c3p0/test/JndiBindTest.class
     2186  12-11-2019 22:18   com/mchange/v2/c3p0/test/JndiLookupTest.class

I wouldn't call myself an expert on the matter, but that may be why the CVE isn't listed on the Tika security list.
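
The listing above matches every path containing "jndi", so unrelated classes such as JndiLookupTest.class show up too. As a rough sketch of a stricter check, the Python below looks only for the exact class entry. The function name `find_jndi_lookup` is hypothetical, and the entry path is assumed to be the standard location of the class inside log4j-core 2.x. It does not look inside jars nested within the archive.

```python
import zipfile

# Assumed archive path of the class at the centre of CVE-2021-44228 (log4j-core 2.x).
JNDI_LOOKUP = "org/apache/logging/log4j/core/lookup/JndiLookup.class"


def find_jndi_lookup(jar):
    """Return the archive entries that are the JndiLookup class.

    `jar` may be a filesystem path or a binary file-like object.
    An entry matches if it equals JNDI_LOOKUP or ends with "/" + JNDI_LOOKUP,
    which also catches copies stored under a prefix directory.
    Nested jars are not opened.
    """
    with zipfile.ZipFile(jar) as archive:
        return [
            name
            for name in archive.namelist()
            if name == JNDI_LOOKUP or name.endswith("/" + JNDI_LOOKUP)
        ]
```

Calling `find_jndi_lookup("jar/tika-app-1.27.jar")` from the gem's root would be the equivalent of the `unzip` check; an empty list would agree with the listing above.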

I'll need to iterate over the older Tika versions to see if the same applies. At some point I'll also be looking at what is involved in updating to Tika 2.x, so I will test those versions too.

abrom commented 2 years ago

Of course, the above is to say that although Tika doesn't appear to package the vulnerable log4j class, you would still need to patch any local install of the log4j-core libraries!

abrom commented 2 years ago

Looking into this further, it looks like v1.x of Tika uses v1.x of log4j (which does not implement the lookup mechanism), so it is not vulnerable to this issue.
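
One rough way to see which log4j generation a jar carries is to look at package prefixes. The sketch below assumes that log4j 1.x classes live under `org/apache/log4j/` and log4j 2.x core classes under `org/apache/logging/log4j/core/`; the function name and the labels it returns are hypothetical. Treat a "1.x" hit as a hint only, since the log4j 2 compatibility bridge also ships classes under the 1.x package name.

```python
import zipfile

# Assumed package prefixes for each log4j generation.
LOG4J_1_PREFIX = "org/apache/log4j/"
LOG4J_2_CORE_PREFIX = "org/apache/logging/log4j/core/"


def bundled_log4j_generations(jar):
    """Return a set of labels ("1.x", "2.x-core") for the log4j packages found.

    `jar` may be a filesystem path or a binary file-like object.
    Only top-level entries of the archive are inspected.
    """
    with zipfile.ZipFile(jar) as archive:
        names = archive.namelist()

    found = set()
    if any(name.startswith(LOG4J_1_PREFIX) for name in names):
        found.add("1.x")
    if any(name.startswith(LOG4J_2_CORE_PREFIX) for name in names):
        found.add("2.x-core")
    return found
```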

abrom commented 2 years ago

To answer your original question directly, I do plan to update to v2.x of Tika but I wouldn't say it was high on my priority list.

I'd happily accept a PR to do so, although given that all current releases of 2.x are vulnerable to the log4j2 issue, I think it's best to wait until v2.2 is released (or v2.1.1 if the fix gets back-ported).

gsar commented 2 years ago

@abrom I had ass_u_med that henkei was bundling Tika 1.26 based on the Releases area of this repo, but it looks like you shipped version 1.27.1 without updating the release page? The security page on the Tika site (link above) says 1.26 is vulnerable, which is why I (mistakenly) raised this issue. It appears 1.27 has no known vulnerabilities, so we are probably OK.

As for Tika 2.x which is using log4j 2.x, they have an upgrade in progress that will depend on log4j 2.16 which has fixes for the vulnerability.

abrom commented 2 years ago

Oops.. yes looks like I didn't publish the release! Thanks

abrom commented 2 years ago

FYI I've released v2.2.0.1 of Henkei. v2.x of Tika does remove the "server" functionality so this is very much a breaking change.

Tika v2.2.0 includes log4j v2.15 and as such should be (for the most part) safe, unless the environment specifically re-enables the problematic option/class!

The Tika project haven't released a version with log4j 2.16 as yet but as you've already mentioned, it is in the works.
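
The version reasoning in this thread can be sketched as a small lookup. This is an illustration under stated assumptions, not an authoritative advisory: the thresholds are that 1.x has no lookup mechanism, that 2.x releases before 2.15 are affected by CVE-2021-44228, that 2.15 is mostly mitigated, and that 2.16 carries the follow-up fix. The function names and labels are hypothetical, only plain dotted numeric versions are handled, and back-ported fix releases and later advisories are ignored.

```python
def parse_version(text):
    """Turn '2.15' or '2.15.0' into a 3-tuple of ints, padding with zeros.

    Assumes a purely numeric dotted version; pre-release tags are not handled.
    """
    parts = [int(piece) for piece in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def log4shell_status(version):
    """Classify a log4j version against the thresholds discussed in the thread."""
    parsed = parse_version(version)
    if parsed < (2, 0, 0):
        return "not-affected-1.x"      # 1.x has no lookup mechanism
    if parsed < (2, 15, 0):
        return "affected"              # CVE-2021-44228 applies
    if parsed < (2, 16, 0):
        return "partially-mitigated"   # safe unless lookups are re-enabled
    return "fixed"


if __name__ == "__main__":
    for candidate in ("1.2.17", "2.14.1", "2.15", "2.16.0"):
        print(candidate, log4shell_status(candidate))
```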

scarroll32 commented 2 years ago

Thanks for the update @abrom

Could you please publish the latest version to Rubygems?

❯ gem install henkei
Fetching henkei-1.27.1.gem
Successfully installed henkei-1.27.1
Parsing documentation for henkei-1.27.1
Installing ri documentation for henkei-1.27.1
Done installing documentation for henkei after 0 seconds
1 gem installed

abrom commented 2 years ago

Done 👍