Open yunzheng opened 2 years ago
Check my PR: I have already done some refactoring that could potentially enable this. I believe the current code would be easier to maintain...
Hi Kris, your change is quite a refactor, so I will need to review it a bit more :) In the meantime I have pushed v1.2.0 so we don't keep waiting for new features or code changes.
I found out that repackaging JAR files and renaming them is a thing in Java land. For example the following elasticsearch APM package is vulnerable but not picked up by log4j-finder:
Advisory: https://discuss.elastic.co/t/apache-log4j2-remote-code-execution-rce-vulnerability-cve-2021-44228-esa-2021-31/291476
Package file: https://github.com/elastic/apm-agent-java/releases/v1.28.0/
When looking into this Zip file (which log4j-finder now supports btw), we can find the following file:
Because the extension is `esclazz` instead of `class`, this file is not checked. However the MD5 is also not known, most likely because everything was recompiled from source so it doesn't match anything from known good or bad.
There are several things I want to tackle in this issue:
How to detect renamed class files
With the assumption that the filename (without extension) always stays the same as the Java class name, we can just check for "JndiManager.", or any other class name we want to check, for that matter. So basically, ignore the file extension.
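A minimal sketch of that extension-agnostic check, for illustration only. The names `CLASS_NAMES_TO_CHECK` and `matches_class_name` are hypothetical and not taken from the log4j-finder code base, and the sketch assumes the part of the filename before the first dot is the class name:

```python
# Hypothetical set of class names to look for (not log4j-finder's actual list).
CLASS_NAMES_TO_CHECK = {"JndiManager"}


def matches_class_name(path: str) -> bool:
    """Return True if the file name, ignoring its extension, is a wanted class name."""
    # Take the last path component; accept both "/" (zip entries) and "\\".
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    # Split at the first dot, so "JndiManager.esclazz" yields "JndiManager".
    stem, dot, _extension = name.partition(".")
    # Require the dot, mirroring the 'check for "JndiManager."' idea.
    return bool(dot) and stem in CLASS_NAMES_TO_CHECK
```

With this, a made-up path such as `some/dir/JndiManager.esclazz` matches just like `JndiManager.class` would, while `JndiManagerFactory.class` does not.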
This is not difficult to achieve: we could change FILENAMES to hold glob patterns, like `JndiManager.*`, or maybe even better, just change FILENAMES to CLASS_NAMES and have it contain only the class name(s), such as `JndiManager`. My preference would be the latter.
It then shows up as:
I will probably refactor the code soon to just look for `ClassNames` in the filename (so file extension agnostic).
How to detect log4j version of unknown MD5 hashes
log4j-finder could maybe also read and parse `pom.xml` and `META-INF/MANIFEST.MF` files when it finds them, to determine the versions used in the JAR file. I have checked how most packages return their versions, and they all seem to get them from those metadata files instead of returning a string from their own codebase.
This might be a bit more work and would get a bit messy with the current code base, so refactoring some functions first might be better to suit this functionality.
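As a rough sketch of what such metadata parsing could look like (not how log4j-finder does or will do it): the function names below are hypothetical, and the sketch assumes the manifest meant here is the standard `META-INF/MANIFEST.MF` with an `Implementation-Version` attribute, and that `pom.xml` uses the usual Maven POM 4.0.0 namespace. Manifest line continuations and malformed XML are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Namespace used by standard Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def manifest_version(manifest_text: str) -> Optional[str]:
    """Return the Implementation-Version attribute of a manifest, if present."""
    for line in manifest_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Implementation-Version":
            return value.strip()
    return None


def pom_version(pom_xml: bytes) -> Optional[str]:
    """Return the project version from a pom.xml, falling back to the parent's."""
    root = ET.fromstring(pom_xml)
    for xpath in ("m:version", "m:parent/m:version"):
        node = root.find(xpath, POM_NS)
        if node is not None and node.text:
            return node.text.strip()
    return None


def versions_in_jar(fileobj) -> Dict[str, str]:
    """Map each metadata file found in a JAR/ZIP to the version it declares."""
    found: Dict[str, str] = {}
    with zipfile.ZipFile(fileobj) as zf:
        for name in zf.namelist():
            base = name.rsplit("/", 1)[-1]
            if base == "MANIFEST.MF":
                text = zf.read(name).decode("utf-8", "replace")
                version = manifest_version(text)
            elif base == "pom.xml":
                version = pom_version(zf.read(name))
            else:
                continue
            if version:
                found[name] = version
    return found
```

`versions_in_jar` accepts a path or a file-like object, so it could also be pointed at nested archives that have been read into memory. Whether the reported version is trustworthy for a repackaged JAR is a separate question, since the metadata can be edited or dropped during repackaging.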