mergebase / log4j-detector

A public open sourced tool. Log4J scanner that detects vulnerable Log4J versions (CVE-2021-44228, CVE-2021-45046, etc) on your file-system within any application. It is able to even find Log4J instances that are hidden several layers deep. Works on Linux, Windows, and Mac, and everywhere else Java runs, too! TAG_OS_TOOL, OWNER_KELLY, DC_PUBLIC

Read files to scan from stdin to use `find` for excluding of files, folders and mount points #42

Closed beckerr-rzht closed 2 years ago

beckerr-rzht commented 2 years ago

It would be great if the files to be scanned could be read from stdin. This would open up a whole new set of possibilities together with find.

Example:

find / -xdev -type f | java -jar log4j-detector-2021.12.16.jar --stdin

This would scan all files in the local root filesystem, but omit /dev, /proc, etc. and all NFS mounts.

Using find, the following issues would be easy to solve: #11, #39 and #40.

zhurkin commented 2 years ago

find just freezes on large volumes. It would be better to implement explicit exclusions in the program itself.

beckerr-rzht commented 2 years ago

I'm not aware of such problems with find, but I only want to scan files on local filesystems.

For example, I'm currently using these find options:

find  / \( -type d \( -fstype autofs -o -fstype fuse.sshfs -o -fstype nfs -o -fstype proc -o -fstype sshfs -o -fstype sysfs -o -fstype tmpfs \) -prune -o -type f \) \
    -type f -print | java -jar log4j-detector-2021.12.17.jar --stdin

beckerr-rzht commented 2 years ago

The current precompiled version 2021.12.17 supporting --stdin is here: https://github.com/beckerr-rzht/log4j-detector/raw/master/log4j-detector-2021.12.17.jar

juergenhoetzel commented 2 years ago

You can build and execute command lines from standard input using xargs:

find / \( -type d \( -fstype autofs -o -fstype fuse.sshfs -o -fstype nfs -o -fstype proc -o -fstype sshfs -o -fstype sysfs -o -fstype tmpfs \) -prune -o -type f \) -type f -name "*.jar" | xargs java -jar log4j-detector-2021.12.17.jar
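
One caveat with a plain find | xargs pipeline: xargs splits its input on whitespace by default, so paths containing spaces are broken into several arguments. A minimal sketch of the usual workaround with -print0 and -0 follows. The temporary directory and file names are made up for the demonstration, and printf stands in for the java -jar call so the snippet runs without the detector.

```shell
# Hypothetical demo tree: one path with spaces, one without.
tmp=$(mktemp -d)
mkdir -p "$tmp/my app"
touch "$tmp/my app/demo lib.jar" "$tmp/plain.jar"

# printf is a stand-in for `java -jar log4j-detector-2021.12.17.jar`;
# it prints one line per argument it receives.
plain=$(find "$tmp" -type f -name '*.jar' | xargs printf '%s\n' | wc -l | tr -d ' ')
safe=$(find "$tmp" -type f -name '*.jar' -print0 | xargs -0 printf '%s\n' | wc -l | tr -d ' ')

echo "arguments seen with whitespace splitting: $plain"
echo "arguments seen with NUL splitting:        $safe"

rm -rf "$tmp"
```

With NUL-separated input each file arrives as exactly one argument, whereas the whitespace-split variant reports more arguments than there are files.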

beckerr-rzht commented 2 years ago

Note the following when using xargs: it can be slower when many files are passed, because the java process may have to be started several times.

When using xargs, parameters and environment variables together may only occupy a maximum of 4096 bytes in the worst case. The size of the environment of root is around 2000 bytes (depending on operating system and configuration). A "medium" installation of Ubuntu Desktop has about 400000 files.

This would result in the following comparison:

But this is of course only the worst case, which should occur rarely. The actual values of the particular system are provided by xargs --show-limits.
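
A rough sketch of the worst-case arithmetic, using the figures above (4096 bytes, about 2000 bytes of environment, about 400000 files). The average path length of 60 bytes is purely an assumption for illustration, and the bytes taken by the java command itself are ignored.

```shell
limit=4096       # worst-case bytes for arguments plus environment (figure from above)
env_bytes=2000   # approximate size of root's environment (figure from above)
files=400000     # files on a "medium" Ubuntu Desktop (figure from above)
avg_path=60      # ASSUMPTION: average path length in bytes, including separator

# How many paths fit into one command line, and how many java starts that implies.
per_call=$(( (limit - env_bytes) / avg_path ))
invocations=$(( (files + per_call - 1) / per_call ))   # ceiling division

echo "files per java start:   $per_call"
echo "java starts with xargs: $invocations (versus 1 with --stdin)"

# The real limit on the current system, for comparison:
getconf ARG_MAX
```

Under these assumptions the worst case comes to 34 files per start and 11765 java starts. On most current systems ARG_MAX is far larger, so the real number of starts is much smaller.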

But xargs has one advantage in any case: the -P option allows several processes to run in parallel. For example:

find / -xdev | xargs -rn100 -P8 java -jar log4j-detector-2021.12.17.jar

... will start 8 processes scanning in parallel. Here -r prevents the process from being started without parameters and -n100 determines that 100 arguments are passed at a time.
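
To illustrate how -n batches the input, here is a small sketch that needs neither java nor the detector. seq generates 250 dummy arguments, and the sh -c stub (a hypothetical stand-in for the java call) prints how many arguments each invocation received. -P is left out here so that the output order stays deterministic.

```shell
# Each xargs invocation runs the stub once; "$#" is the number of arguments it got.
seq 1 250 | xargs -n100 sh -c 'echo "batch with $# arguments"' _

# Count the invocations: 250 arguments in groups of 100 means 3 process starts.
batches=$(seq 1 250 | xargs -n100 sh -c 'echo "$#"' _ | wc -l | tr -d ' ')
echo "process starts: $batches"
```

Adding -P8 to the same pipeline would run up to eight of these batches at once, at the cost of interleaved output.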

Provided you have enough CPU, this could speed up the detector scan. However, in such cases the tool parallel should be preferred, because it is much more flexible.

Regardless, I hope that my pull request #43 will be accepted.

juliusmusseau commented 2 years ago

I did this in my own way. See v2021.12.20 which adds a new --stdin flag.