mergebase / log4j-detector

A public open sourced tool. Log4J scanner that detects vulnerable Log4J versions (CVE-2021-44228, CVE-2021-45046, etc) on your file-system within any application. It is able to even find Log4J instances that are hidden several layers deep. Works on Linux, Windows, and Mac, and everywhere else Java runs, too! TAG_OS_TOOL, OWNER_KELLY, DC_PUBLIC

java.lang.IllegalArgumentException: malformed input off : 4, length : 1 at java.base/java.lang.StringCoding.throwMalformed(StringCoding.java:698) #20

Open volker-graf opened 2 years ago

volker-graf commented 2 years ago

We tried the scanner on a multi-archive TAR file that contained a few .jar files and got the following message:

-- Problem: XX/log4jtest.tar - java.lang.IllegalArgumentException: malformed input off : 4, length : 1
java.lang.IllegalArgumentException: malformed input off : 4, length : 1
        at java.base/java.lang.StringCoding.throwMalformed(StringCoding.java:698)
        at java.base/java.lang.StringCoding.decodeUTF8_0(StringCoding.java:885)
        at java.base/java.lang.StringCoding.newStringUTF8NoRepl(StringCoding.java:978)
        at java.base/java.lang.System$2.newStringUTF8NoRepl(System.java:2270)
        at java.base/java.util.zip.ZipCoder$UTF8.toString(ZipCoder.java:60)
        at java.base/java.util.zip.ZipCoder.toString(ZipCoder.java:87)
        at java.base/java.util.zip.ZipInputStream.readLOC(ZipInputStream.java:302)
        at java.base/java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:124)
        at com.mergebase.log4j.Log4JDetector.findLog4jRecursive(Log4JDetector.java:208)
        at com.mergebase.log4j.Log4JDetector.scan(Log4JDetector.java:442)
        at com.mergebase.log4j.Log4JDetector.analyze(Log4JDetector.java:502)
        at com.mergebase.log4j.Log4JDetector.analyze(Log4JDetector.java:497)
        at com.mergebase.log4j.Log4JDetector.main(Log4JDetector.java:84)

The TAR file itself seems to be correct.

Is it possible that there are problems with "multi-archive" archives that contain sub-archives whose entry names are perhaps not UTF-8 encoded?

Just a shot in the dark ...
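
For anyone who wants to test that guess: the trace above shows ZipInputStream.getNextEntry failing while decoding an entry name as UTF-8. Below is a minimal sketch of a lenient reader that retries with ISO-8859-1 when that happens. It is not the detector's code; the class name ZipNameFallback, its methods, and the sample entry name are made up for illustration, and the thread does not confirm that a non-UTF-8 name is the actual cause here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of log4j-detector.
public class ZipNameFallback {

    /** Lists entry names, decoding them with the given charset. */
    static List<String> listNames(byte[] zipBytes, Charset cs) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes), cs)) {
            ZipEntry e;
            while ((e = zin.getNextEntry()) != null) {
                names.add(e.getName());
            }
        }
        return names;
    }

    /** Tries UTF-8 first, then falls back to ISO-8859-1, which accepts any byte. */
    static List<String> listNamesLenient(byte[] zipBytes) throws IOException {
        try {
            return listNames(zipBytes, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException | ZipException badName) {
            // IllegalArgumentException is what the trace above shows.
            // ZipException is caught as well, on the assumption that some
            // JDK versions may report a bad name that way.
            return listNames(zipBytes, StandardCharsets.ISO_8859_1);
        }
    }

    /** Builds a small in-memory zip whose entry name is written as ISO-8859-1. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bos, StandardCharsets.ISO_8859_1)) {
            zout.putNextEntry(new ZipEntry("caf\u00e9.txt")); // 0xE9 is not valid UTF-8 here
            zout.write("hello".getBytes(StandardCharsets.US_ASCII));
            zout.closeEntry();
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(listNamesLenient(sampleZip()));
    }
}
```

The fallback never fails on a name, but it may render a name incorrectly if the archive was really written in some other encoding. For a scanner that only needs to reach the entry contents, that trade-off is probably acceptable.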

stefan123t commented 2 years ago

It also throws a couple of java.lang.IllegalArgumentException: MALFORMED and java.io.EOFException errors at me: the former for some libjli.so and libzip.so modules, and the latter for jexec in an old JRE directory. This is probably the same problem reported by @volker-graf.

juliusmusseau commented 2 years ago

The latest version probably won't have these errors because it now ignores everything that isn't a zip/ear/jar/war/aar file (with those suffixes). Would that work for you? Or do you think the log4j-detector should enter *.tar files?

(Entering *.tar.gz / *.tar.xz / *.tar.bz2 starts to be a pain since those require temporary disk space, whereas the current approach that only enters zip files can do everything in-memory.)
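
To illustrate what "entering zip files in-memory" can look like, here is a rough sketch that walks nested zip-style archives by wrapping one ZipInputStream around another, without writing anything to disk. It is not the detector's actual findLog4jRecursive; the class NestedZipWalk, its methods, and the sample entry names are invented for this example, and only the suffix list is taken from the comment above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch, not the detector's implementation.
public class NestedZipWalk {

    // Suffixes named in the thread: zip/ear/jar/war/aar.
    static final List<String> SUFFIXES =
            Arrays.asList(".zip", ".ear", ".jar", ".war", ".aar");

    static boolean looksLikeZip(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    /** Collects the paths of all non-archive entries, descending into nested archives. */
    static void walk(String path, InputStream in, List<String> found) throws IOException {
        // Deliberately not closed: closing it would also close the enclosing stream.
        ZipInputStream zin = new ZipInputStream(in);
        ZipEntry e;
        while ((e = zin.getNextEntry()) != null) {
            if (e.isDirectory()) {
                continue;
            }
            String full = path + "!/" + e.getName();
            if (looksLikeZip(e.getName())) {
                walk(full, zin, found); // read the nested archive straight from this entry
            } else {
                found.add(full);
            }
        }
    }

    static List<String> scan(byte[] zipBytes) throws IOException {
        List<String> found = new ArrayList<>();
        try (InputStream in = new ByteArrayInputStream(zipBytes)) {
            walk("outer.zip", in, found);
        }
        return found;
    }

    static byte[] zipOf(String name, byte[] content) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bos)) {
            zout.putNextEntry(new ZipEntry(name));
            zout.write(content);
            zout.closeEntry();
        }
        return bos.toByteArray();
    }

    /** Builds outer.zip containing lib/inner.jar (itself a zip) and readme.txt. */
    static byte[] sampleNestedZip() throws IOException {
        byte[] inner = zipOf("org/example/Inner.class", new byte[] {1, 2, 3});
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bos)) {
            zout.putNextEntry(new ZipEntry("lib/inner.jar"));
            zout.write(inner);
            zout.closeEntry();
            zout.putNextEntry(new ZipEntry("readme.txt"));
            zout.write(new byte[] {'h', 'i'});
            zout.closeEntry();
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        for (String p : scan(sampleNestedZip())) {
            System.out.println(p);
        }
    }
}
```

Because each nested archive is read directly from the enclosing entry's stream, no temporary files are needed, which is the property the comment above is pointing at.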

stefan123t commented 2 years ago

Dear Julius, I have tried it again with 2021-12-16 and it does indeed skip tarballs. I now only get an Out Of Memory error after some time, probably because multipart ZIP files and self-extracting shell ZIP files cannot be detected / analyzed successfully. But the static object modules and jexec are no longer reported. Thanks for that, it works for me. I don't know whether it also works for @volker-graf, the original poster. Kind regards, Stefan

volker-graf commented 2 years ago

I got a few "Out Of Memory" errors, but I fixed them by adding -Xmx8G to the command-line arguments.