mergebase / log4j-detector

A public open sourced tool. Log4J scanner that detects vulnerable Log4J versions (CVE-2021-44228, CVE-2021-45046, etc) on your file-system within any application. It is able to even find Log4J instances that are hidden several layers deep. Works on Linux, Windows, and Mac, and everywhere else Java runs, too! TAG_OS_TOOL, OWNER_KELLY, DC_PUBLIC

- Problem /...../mce/python/lib/python3.6/test/zip_cp437_header.zip - java.lang.IllegalArgumentException: malformed input off : 14, length : 1 #56

Open HoWeBrz opened 2 years ago

HoWeBrz commented 2 years ago

I get this error message while scanning Linux tool folders. I am not sure whether log4j-detector stops at this point or goes on to finish the scan.

java -jar log4j-detector-2021.12.17.jar /.../ > hits.txt
-- Problem /.../tools.lnx86/mce/python/lib/python3.6/test/zip_cp437_header.zip - java.lang.IllegalArgumentException: malformed input off : 14, length : 1

Is this a bug, and is there a solution or workaround?

juliusmusseau commented 2 years ago

Probably it continues fine, but can you attach the problematic zip just so I can be sure?

HoWeBrz commented 2 years ago

Hi, Please find attached the zip-file.

sker65 commented 2 years ago

I also see these errors a lot, mostly in "inner zips": https://pasteimg.com/image/image.fe2QH

rgmz commented 2 years ago

> I also see these errors a lot, mostly in "inner zips": https://pasteimg.com/image/image.fe2QH

The JAR(s) in question are from pkg:maven/org.bytedeco/cpython@3.9.2-1.5.5, e.g. https://repo1.maven.org/maven2/org/bytedeco/cpython/3.9.2-1.5.5/cpython-3.9.2-1.5.5-linux-x86_64.jar.

Stack trace:

$ java -jar log4j-detector-2021.12.20.jar --verbose cpython-3.9.2-1.5.5-linux-x86_64.jar 
-- github.com/mergebase/log4j-detector v2021.12.22 (by mergebase.com) analyzing paths (could take a while).
-- Note: specify the '--verbose' flag to have every file examined printed to STDERR.
...
-- Examining /tmp/cpython-3.9.2-1.5.5-linux-x86_64.jar!/org/bytedeco/cpython/linux-x86_64/lib/python3.9/test/test_importlib/zipdata01/ziptestdata.zip... 
-- Examining /tmp/cpython-3.9.2-1.5.5-linux-x86_64.jar!/org/bytedeco/cpython/linux-x86_64/lib/python3.9/test/zip_cp437_header.zip... 
-- Problem /tmp/cpython-3.9.2-1.5.5-linux-x86_64.jar!/org/bytedeco/cpython/linux-x86_64/lib/python3.9/test/zip_cp437_header.zip - java.lang.IllegalArgumentException: malformed input off : 14, length : 1
java.lang.IllegalArgumentException: malformed input off : 14, length : 1
    at java.base/java.lang.StringCoding.throwMalformed(StringCoding.java:698)
    at java.base/java.lang.StringCoding.decodeUTF8_0(StringCoding.java:885)
    at java.base/java.lang.StringCoding.newStringUTF8NoRepl(StringCoding.java:978)
    at java.base/java.lang.System$2.newStringUTF8NoRepl(System.java:2205)
    at java.base/java.util.zip.ZipCoder$UTF8.toString(ZipCoder.java:60)
    at java.base/java.util.zip.ZipCoder.toString(ZipCoder.java:87)
    at java.base/java.util.zip.ZipInputStream.readLOC(ZipInputStream.java:303)
    at java.base/java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:125)
    at com.mergebase.log4j.Log4JDetector.findLog4jRecursive(Log4JDetector.java:291)
    at com.mergebase.log4j.Log4JDetector.findLog4jRecursive(Log4JDetector.java:372)
    at com.mergebase.log4j.Log4JDetector.scan(Log4JDetector.java:617)
    at com.mergebase.log4j.Log4JDetector.analyze(Log4JDetector.java:724)
    at com.mergebase.log4j.Log4JDetector.main(Log4JDetector.java:160)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    ... 13 more
...
-- No vulnerable Log4J 2.x samples found in supplied paths: [cpython-3.9.2-1.5.5-linux-x86_64.jar]
-- Congratulations, the supplied paths are not vulnerable to CVE-2021-44228 or CVE-2021-45046 !  :-) 

A cursory Google search seems to indicate that the error is related to file encoding, but that may not be the case.

Relevant code: https://github.com/mergebase/log4j-detector/blob/d8225c61862e4b816c5ad09de8be95ad49ae28fd/src/main/java/com/mergebase/log4j/Log4JDetector.java#L289-L294

rgmz commented 2 years ago

> A cursory Google search seems to indicate that the error is related to file encoding, but that may not be the case.

The archive in question does have a file with a non-ASCII character in its name.

$ uchardet *
filename_without.txt: ASCII
filename_with_СoЖ.txt: ASCII
zip_cp437_header.zip: unknown

It seems like the fix is to either start with UTF-8 and fall back to a different encoding when the name fails to decode (the java.nio.charset.MalformedInputException that shows up as the cause in the stack trace above), or to use something like ISO-8859-1 from the get-go. https://stackoverflow.com/a/26268235
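
A rough sketch of the first option, for illustration only. This is not the log4j-detector code: the class and method names (`ZipNameFallback`, `listEntries`, `listEntriesWithFallback`, `zipWithLatin1Name`) are made up for this example, and it assumes the archive is already in memory as a byte array. It relies only on the `ZipInputStream(InputStream, Charset)` constructor from the JDK. `ZipException` is caught alongside `IllegalArgumentException` on the assumption that some JDK versions may report a bad entry name that way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of log4j-detector.
class ZipNameFallback {

    /** Reads all entry names, decoding them with the given charset. */
    static List<String> listEntries(byte[] zipBytes, Charset charset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes), charset)) {
            ZipEntry entry;
            // getNextEntry() is where a name that is not valid in `charset` fails.
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Tries UTF-8 first, then re-reads the whole archive as ISO-8859-1. */
    static List<String> listEntriesWithFallback(byte[] zipBytes) throws IOException {
        try {
            return listEntries(zipBytes, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException | ZipException e) {
            // ISO-8859-1 maps every byte to a character, so name decoding
            // cannot fail here; a genuinely corrupt archive still throws.
            return listEntries(zipBytes, StandardCharsets.ISO_8859_1);
        }
    }

    /** Demo helper: builds an in-memory zip whose entry name is stored as ISO-8859-1 bytes. */
    static byte[] zipWithLatin1Name(String name) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(out, StandardCharsets.ISO_8859_1)) {
            zout.putNextEntry(new ZipEntry(name));
            zout.write(new byte[] {1, 2, 3});
            zout.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = zipWithLatin1Name("caf\u00e9.txt");
        System.out.println(listEntriesWithFallback(zip));
    }
}
```

With ISO-8859-1 the decoded names may not match what the archive's author intended (a cp437 name would come out as different characters), but for a scanner that mostly cares about entry contents, being able to keep iterating is probably what matters.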