Closed beckerr-rzht closed 2 years ago
find just freezes on large volumes. It would be better to make an explicit exception in the program.
I don't know of such problems with find, but I just want to scan files on local filesystems only.
For example, I'm currently using these find options:
find / \( -type d \( -fstype autofs -o -fstype fuse.sshfs -o -fstype nfs -o -fstype proc -o -fstype sshfs -o -fstype sysfs -o -fstype tmpfs \) -prune -o -type f \) \
-type f -print | java -jar log4j-detector-2021.12.17.jar --stdin
The current precompiled version 2021.12.17, which supports --stdin, is here:
https://github.com/beckerr-rzht/log4j-detector/raw/master/log4j-detector-2021.12.17.jar
You can build and execute command lines from standard input using xargs:
find / \( -type d \( -fstype autofs -o -fstype fuse.sshfs -o -fstype nfs -o -fstype proc -o -fstype sshfs -o -fstype sysfs -o -fstype tmpfs \) -prune -o -type f \) -type f -name "*.jar"|xargs java -jar log4j-detector-2021.12.17.jar
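One caveat with plain xargs is that it splits its input on whitespace, so a path containing a space is passed as two arguments. The sketch below shows the difference in a throwaway directory, with printf standing in for the java call. It assumes a find and xargs that support -print0 and -0 (GNU and BSD both do); the file names are made up for illustration.

```shell
# Demonstration in a temporary directory; printf stands in for the java call.
demo_dir=$(mktemp -d)
touch "$demo_dir/my app.jar" "$demo_dir/plain.jar"   # hypothetical file names

# Whitespace-split hand-off: "my app.jar" is torn into two arguments.
split_count=$(find "$demo_dir" -type f -name '*.jar' | xargs printf '%s\n' | wc -l)

# NUL-delimited hand-off: every path arrives as exactly one argument.
safe_count=$(find "$demo_dir" -type f -name '*.jar' -print0 | xargs -0 printf '%s\n' | wc -l)

echo "whitespace-split arguments: $split_count, NUL-delimited arguments: $safe_count"
rm -rf "$demo_dir"
```

Applied to the command above, this would mean ending the find expression with -print0 and calling xargs -0.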
Note the following when using xargs:
Using xargs can be slower when many files are passed, because the java process may have to be started several times.
With xargs, parameters and environment variables together may occupy at most 4096 bytes in the worst case. The environment of root is around 2000 bytes in size (depending on operating system and configuration).
A "medium" installation of Ubuntu Desktop has about 400000 files.
This would result in the following comparison:
With --stdin, the java process is started exactly once.
With xargs, the java process is started about 10000 times. But this is of course only the worst case, which should occur rarely.
The actual values for a particular system are reported by xargs --show-limits.
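The worst-case figure can be reproduced with a little shell arithmetic. The 4096-byte limit, the 2000-byte environment and the 400000 files come from the comment above; the average path length of 50 bytes is an assumption made here only so the numbers can be worked through.

```shell
# Back-of-the-envelope estimate of worst-case java starts under xargs.
arg_limit=4096      # worst-case bytes for arguments plus environment (from the text)
env_size=2000       # approximate size of root's environment (from the text)
file_count=400000   # "medium" Ubuntu Desktop installation (from the text)
avg_path_len=50     # ASSUMPTION: average bytes per path, not stated in the text

usable=$((arg_limit - env_size))                           # bytes left for file names per call
per_call=$((usable / avg_path_len))                        # file names per java start
invocations=$(( (file_count + per_call - 1) / per_call ))  # ceiling division

echo "about $invocations java starts with xargs versus 1 with --stdin"
```

With a longer or shorter assumed path length the count shifts accordingly, which is why xargs --show-limits is the better source for a real system.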
But xargs has one advantage in any case:
The -P parameter allows several processes to run in parallel.
For example:
find / -xdev | xargs -rn100 -P8 java -jar log4j-detector-2021.12.17.jar
... will start 8 processes scanning in parallel. Here -r prevents the process from being started without parameters, and -n100 determines that 100 arguments are passed at a time.
Provided you have enough CPU, this could speed up the detector scan.
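The effect of -r and -n100 can be checked without the detector. In this small sketch echo stands in for the java call, so each output line corresponds to one process start. It assumes GNU xargs (for -r and -P) and the seq utility.

```shell
# echo stands in for the java call: one output line per process start.
# 250 arguments in batches of 100 give batches of 100, 100 and 50.
batches=$(seq 1 250 | xargs -r -n100 -P4 echo | wc -l)

# With empty input, -r suppresses the single start xargs would otherwise make.
empty_runs=$(printf '' | xargs -r echo started | wc -l)

echo "250 arguments: $batches starts; empty input: $empty_runs starts"
```

The order of the batches in the output is not guaranteed when -P is used, only their number.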
However, in such cases the tool parallel should be preferred, because it is much more flexible.
Regardless, I hope that my pull request #43 will be accepted.
I did this in my own way. See v2021.12.20 which adds a new --stdin flag.
It would be great if the files to be scanned could be read from stdin. This would open up a whole new set of possibilities together with find.
Example:
This would scan all files in the local root filesystem, but omit /dev, /proc, etc. and all NFS mounts.
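A command of that shape might look like the following sketch. It is not the original example, which is not preserved here. The helper name list_local_files is hypothetical, and the jar name and --stdin flag are taken from the other comments in this thread. The -xdev option tells find not to descend into directories on other filesystems, which is what keeps /dev, /proc and NFS mounts out of the scan when they are separate mounts.

```shell
# Sketch: list regular files on a single filesystem, one path per line.
# list_local_files is a hypothetical helper name, not part of log4j-detector.
list_local_files() {
  # -xdev: do not descend into directories on other filesystems.
  find "${1:-/}" -xdev -type f -print
}

# Intended use (jar name and --stdin flag as mentioned elsewhere in this thread):
#   list_local_files / | java -jar log4j-detector-2021.12.17.jar --stdin
```

Because the file list is produced by find, any further filtering (by name, size, age, or filesystem type) can be done there without changes to the detector.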
Using find, the following issues would be easy to solve: #11, #39 and #40.