Open rrueger opened 2 years ago
Edit: Accidentally pressed Enter, not Shift+Enter, for a new line, submitting the issue prematurely.
Replay mode has called `stat` on paths since its inception in 2015, so the documentation is not accurate in its claim that replay "does no input/output". It does avoid recomputing file hashes, but it still reads file metadata from disk. And, quoting the docs:
Usage is simple: Just pass `--replay` on the second run, with other changed to the new formatters or filters. Pass the `.json` files of the previous runs additionally to the paths you ran `rmlint` on.
Do you think the docs should be clarified? Would it be more convenient if --replay processed all files by default instead of none of them?
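To make the distinction above concrete, here is a minimal Python sketch of the two kinds of disk access being discussed. It is not rmlint's code; the function names `metadata_only` and `content_hash` are hypothetical, and SHA-1 is just a stand-in for whatever digest a real tool would use.

```python
import hashlib
import os


def metadata_only(path):
    """Read size and mtime via stat(); the file contents are never opened."""
    st = os.stat(path)
    return st.st_size, st.st_mtime


def content_hash(path, chunk_size=1 << 20):
    """Read the whole file in chunks to hash it; this is the expensive step."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling only `metadata_only` on every path is far cheaper than hashing, but it still touches the filesystem once per file, which could plausibly show up as I/O on a large tree.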
I recently ran `rmlint -c sh:clone -o sh:rmlint.sh` on a large subvolume. Upon inspection, I saw that there are many very small files that would need to be cloned.
Using the `--replay` functionality, I would like to be able to create a new `rmlint-10M+.sh` file that only clones files greater in size than 10M.
Since the manual states that `--replay` doesn't perform any disk i/o, I would expect that a `--replay` call given only `rmlint.json`, without the original paths, is a legal command that would take the files from `rmlint.json` and create a script that clones only files with sizes greater than (or equal to) 10M.
Instead, `rmlint` produces an empty* `rmlint-10M+.sh` script.
It looks like I need to also specify the path along with the `rmlint ... --replay` call. According to `glances`, doing this produces a non-trivial (~1MB/s) amount of i/o long into execution (so likely not just reading the `json` file).
*Empty, as in, the autogenerated component is empty. All the handler definitions are still there.
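For illustration, the selection I have in mind can be sketched in Python without touching the files themselves. This is only a sketch under assumptions: it treats `rmlint.json` as a JSON array of objects in which duplicate entries carry `"type": "duplicate_file"`, a `"path"`, and a `"size"` in bytes, and it reads "10M" as 10 MiB. The helper names `load_entries` and `large_duplicates` are hypothetical, not part of rmlint.

```python
import json

MIN_SIZE = 10 * 1024 * 1024  # assumption: "10M" means 10 MiB


def load_entries(json_path):
    """Load the JSON array from a previous run (assumed layout, see above)."""
    with open(json_path, encoding="utf-8") as fh:
        return json.load(fh)


def large_duplicates(entries, min_size=MIN_SIZE):
    """Keep only duplicate-file entries whose recorded size is >= min_size."""
    return [
        entry
        for entry in entries
        if entry.get("type") == "duplicate_file"
        and entry.get("size", 0) >= min_size
    ]
```

This only reads the sizes recorded in the JSON file, so it cannot notice files that changed or disappeared since the first run; that may be part of why replay looks at the paths again.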