danielgtaylor / jpeg-archive

Utilities for archiving JPEGs for long term storage.

Would you like a very simple jpegcheck script? #96

Open sourcejedi opened 5 years ago


I think it would be good to have a simple jpegcheck script, given the "-archive" in the name of this project. Would you like me to send one?

#!/bin/sh
# Print jpeginfo's report for every JPEG under the given paths that is not [OK].
find "$@" \( -iname '*.jpg' -o -iname '*.jpeg' \) -print0 |
    parallel -0 -m jpeginfo --check 2>&1 |
    grep -v '\[OK\]$'

motivation

The intent is to flag up if jpegs start suffering obvious corruption from storage faults or crashes. You can scan for bad jpegs on your online storage, before you overwrite the old backup that has the old non-corrupted version of the file 8-). And of course backups should include strong checksums, but users may also have created replicas by simple file copy. Sadly, filesystems which do checksums for you automatically are still not a commodity, and dm-integrity has a significant performance cost.

technical limitations

I can adapt the above to best match the style / behaviour of jpeg-archive. However, I do not plan to return a different exit code on failures. AFAICT Ladon does not change its exit code on failures, even if you pass --fail (which I don't want anyway; I want to show all JPEGs which have warnings).
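If an exit status were wanted later, a wrapper along these lines might be enough. This is only a sketch: report_bad is a made-up name, not something in jpeg-archive, and it relies on grep -v exiting 0 when it printed at least one line and 1 when it printed none.

```shell
#!/bin/sh
# report_bad is a hypothetical helper, not part of jpeg-archive.
# It reads checker output on stdin, prints each line that does not end
# in "[OK]", and returns 1 if it printed anything, 0 otherwise.
report_bad() {
    # grep -v exits 0 when at least one line was printed, 1 when none were.
    if grep -v '\[OK\]$'; then
        return 1
    fi
    return 0
}

# Intended use (assumes the same find/parallel/jpeginfo pipeline as above):
#   find "$@" \( -iname '*.jpg' -o -iname '*.jpeg' \) -print0 |
#       parallel -0 -m jpeginfo --check 2>&1 |
#       report_bad
```

Every non-OK line is still printed, so all JPEGs with warnings are shown; only the final status changes.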

Also, the script does not cope with newlines in filenames / paths. If you include newlines in filenames, you are being horrible, and my jpegcheck will probably tell you there is some error by cryptically printing the first line of the jpeginfo --check output for that file.

why jpeginfo --check

I used to use a fancier script, but this works well for JPEGs. jpeginfo --check is often recommended. It certainly seems to be fast compared to the alternatives. And at least it seems more appropriate than relying on an undocumented implementation detail of the -verbose flag of identify!

So far I haven't seen it miss any case that djpeg -fast -grayscale -onepass would have picked up (and that command is maybe 33% slower and only accepts one image at a time). I have searched for reports of jpeginfo overlooking an error, and have not found any.
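For comparison, the djpeg-based check could be sketched roughly as follows. check_with_djpeg and the DJPEG override variable are made-up names for illustration, and treating any stderr output or non-zero exit status as a problem is my assumption about how to read djpeg's diagnostics.

```shell
#!/bin/sh
# check_with_djpeg is a hypothetical helper: it decodes one JPEG with
# djpeg, throws the pixels away, and reports the file if djpeg wrote
# anything to stderr or exited non-zero (an assumed definition of "bad").
# DJPEG is an illustrative override; by default djpeg on PATH is used.
check_with_djpeg() {
    file=$1
    # "2>&1 >/dev/null" captures stderr and discards the decoded image.
    if msgs=$("${DJPEG:-djpeg}" -fast -grayscale -onepass "$file" 2>&1 >/dev/null); then
        status=0
    else
        status=$?
    fi
    if [ "$status" -ne 0 ] || [ -n "$msgs" ]; then
        printf '%s: %s\n' "$file" "${msgs:-djpeg exited with status $status}"
        return 1
    fi
    return 0
}
```

It takes one file per call, which is the per-image limitation mentioned above.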