ryanpcmcquen / image-ultimator

:rice_scene: Make images amazing, recursively and automagically.
https://imgult.github.io
Mozilla Public License 2.0

Enhancement: Remember processed files #5

Closed Torkiliuz closed 8 years ago

Torkiliuz commented 8 years ago

Could you make it so that imgult remembers processed files, similarly to how rsync does? Mainly this is useful so that if the computer crashes during processing imgult can skip already processed files.

ryanpcmcquen commented 8 years ago

In some ways it already does. Most of the tools used by imgult blaze right through files they have already processed/optimized.

Try running it on the same file twice. The second run should be considerably faster.

Torkiliuz commented 8 years ago

At least in my testing with a library consisting of thumbnails and posters for 26 TB worth of movies/TV shows, my server has a big problem surviving with all the processes spawning, even when I change the nice level to 19 :wink: That's the reason it would be nice if it created an imgult-processed.txt file, so that it could diff imgult-files.txt against it and pick up where it left off after the server crashes. Just a thought though :wink:
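A minimal sketch of the bookkeeping described above, not imgult's actual code. It assumes a hypothetical optimize_one function standing in for the real tool chain, and appends a path to the processed list only after that function succeeds, so a crash mid-run leaves the list consistent:

```shell
#!/bin/sh
# File name taken from the suggestion in this thread; treat it as an assumption.
IMGULT_PROCESSED_FILES_LIST="imgult-processed.txt"

# Hypothetical placeholder for the real optimizers (jpegoptim, optipng, ...).
# Replace the body with the actual commands; it must return 0 on success.
optimize_one() {
  :
}

# $1 = list of files still to process, $2 = list of already processed files
process_and_record() {
  while IFS= read -r file; do
    # stdin is redirected so the optimizer cannot swallow the rest of the list
    if optimize_one "$file" < /dev/null; then
      # Record the path only once the file has been fully handled.
      printf '%s\n' "$file" >> "$2"
    fi
  done < "$1"
}
```

Calling process_and_record imgult-files.txt "$IMGULT_PROCESSED_FILES_LIST" would then leave a record that a later run can compare against.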

ryanpcmcquen commented 8 years ago

I like what you're saying. It will be a bit tricky because several tools process the file (not just one), but I may have an idea.

I am curious though, what kind of server and what version of everything are you running?

jpegoptim
mozjpeg
optipng
pngquant
gifsicle
exiv2
svgo

Torkiliuz commented 8 years ago

Linux 3.19.0-56-generic #62~14.04.1-Ubuntu x86_64 GNU/Linux

jpegoptim v1.3.0 x86_64-pc-linux-gnu
mozjpeg version 3.1 (build 20150904)
OptiPNG 0.6.4: Advanced PNG optimizer.
pngquant 2.0.1 (September 2013)
LCDF Gifsicle 1.78
exiv2 0.23 001700 (64 bit build)
svgo 0.6.2

ryanpcmcquen commented 8 years ago

Will you give this a run?

WARNING: THIS VERSION IS UNTESTED, IT MAY EXPLODE.

https://github.com/ryanpcmcquen/image-ultimator/blob/diffProcessedFiles/imgult

Torkiliuz commented 8 years ago

It runs and completes, but the second run still seems to go through all of them again.

Torkiliuz commented 8 years ago

You're doing the grep, but not sending its output anywhere; maybe that is the problem? I think you need a third file that you write the grep output to, something like imgult-notprocessedfiles.txt?
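A rough sketch of what writing the grep result to a third file could look like. This is an illustration, not the code from the diffProcessedFiles branch; the variable names follow the ones that appear later in this thread, and the file names are assumptions:

```shell
#!/bin/sh
# Assumed file names, modelled on the ones mentioned in this thread.
IMGULT_FILES_LIST="imgult-files.txt"
IMGULT_PROCESSED_FILES_LIST="imgult-processed.txt"
IMGULT_TEMP_FILES_LIST="imgult-notprocessedfiles.txt"

# $1 = processed list, $2 = full list, $3 = output list (files still to do)
filter_unprocessed() {
  # Make sure the processed list exists on the very first run.
  touch "$1"
  # -F: fixed strings, -x: match whole lines, -v: invert, -f: patterns from file
  grep -vxFf "$1" "$2" > "$3"
  status=$?
  # grep exits 1 when nothing is left to process; only >1 is a real error.
  [ "$status" -le 1 ]
}
```

It would be called as filter_unprocessed "$IMGULT_PROCESSED_FILES_LIST" "$IMGULT_FILES_LIST" "$IMGULT_TEMP_FILES_LIST", and the rest of the script would then iterate over the third file instead of the full list.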

ryanpcmcquen commented 8 years ago

Try this one:

https://github.com/ryanpcmcquen/image-ultimator/blob/diffProcessedFiles/imgult

Torkiliuz commented 8 years ago

The grep takes a while, but it works :+1: I think an echo with "matching already processed files" or something could be a nice addition, but I'm just really happy you took the time to make this work!

ryanpcmcquen commented 8 years ago

Great idea! Thanks so much for testing and the suggestion. :^)

ryanpcmcquen commented 8 years ago

Would you mind if I mentioned your use case in the README for the 4.0.00 release? 26TB is quite the testing ground.

ryanpcmcquen commented 8 years ago

Also, will you give it one final run?

https://github.com/ryanpcmcquen/image-ultimator/blob/diffProcessedFiles/imgult

I did a little cleanup, with your go-ahead I will release this as the new version. :smiley_cat:

Torkiliuz commented 8 years ago

I'd be honored :smile: To clarify, the image-files are far smaller than 26 TB in size, but it's posters, albumart etc. for 26 TB of media. The size of the posters and albumart is roughly 140 GB :wink:

ryanpcmcquen commented 8 years ago

Still, that's quite the testing ground. I am surprised imgult actually parses through all that. And 140GB is a huge amount of images. How many times do you have to run imgult to successfully complete all that?

Torkiliuz commented 8 years ago

Tested this in the same folder as earlier, but noticed this now:
OpenSans-BoldItalic-webfont.svg: The file contains data of an unknown image type
I have svgo installed, and it should work in theory, but I'm not certain, as it is a "font-svg". PNGs go through normally as before though. I also get this:
spinner.gif: Writing to GIF images is not supported
This is probably just because it's an animated gif?

It seems good to go right now.
I have run imgult about 10 times now with the old version, but it was kind of pointless, since it would start at the beginning each time. With this version I should at least be able to get all images optimized after 10 runs or so :smile:

ryanpcmcquen commented 8 years ago

That's amazing! The GIF message is actually from exiv2. It doesn't currently support GIFs, but I bet it will in the future, so I just process them for now (the warning is innocuous).

You may want to bring up that specific file with the svgo people, it would probably be helpful for them, or they may have an idea what is going on. I would like to know as well. :smiley:

ryanpcmcquen commented 8 years ago

Let me know if there is anything else I can do here, and thanks again for the report!

The release is live!

https://github.com/ryanpcmcquen/image-ultimator/releases

Torkiliuz commented 8 years ago

Figured out a faster way to do the comparison, by using comm. From my testing it seemed to work, although it's getting hard to tell by now :stuck_out_tongue_winking_eye:
nice -n15 comm -13 --nocheck-order ${IMGULT_PROCESSED_FILES_LIST} ${IMGULT_FILES_LIST} > ${IMGULT_TEMP_FILES_LIST}

ryanpcmcquen commented 8 years ago

How much faster is it?

Torkiliuz commented 8 years ago

If it still works as it should, which is now hard to determine, it runs through the list so fast it doesn't even show up in htop. I might have made a typo, so I'll try to get a better test setup, so that I can be sure that it works. I get this at the end, but it still seems to run through the files:
/usr/local/bin/imgultnew: 0: /usr/local/bin/imgultnew: Cannot fork

Torkiliuz commented 8 years ago

Tested it now, it does not work, it even stopped working on the first run :disappointed: I didn't see it work correctly earlier either, it just looked like it worked, really sorry about that. Tested now with the Kodak png images, but they are still at the same size...

ryanpcmcquen commented 8 years ago

The new version works though, correct?

Torkiliuz commented 8 years ago

No, it seems to not work, now that I finally got a stable test-folder :cry:

The only output I get now:

File 1/1: ./kodim20.png
File 1/1: ./kodim23.png
File 1/1: ./kodim15.png
File 1/1: ./kodim17.png
File 1/1: ./kodim16.png
File 1/1: ./kodim01.png
File 1/1: ./kodim14.png
File 1/1: ./kodim02.png
File 1/1: ./kodim05.png
File 1/1: ./kodim22.png
File 1/1: ./kodim07.png
File 1/1: ./kodim21.png
File 1/1: ./kodim13.png
File 1/1: ./kodim24.png
File 1/1: ./kodim19.png
File 1/1: ./kodim03.png
File 1/1: ./kodim12.png
File 1/1: ./kodim06.png
File 1/1: ./kodim11.png
File 1/1: ./kodim08.png
File 1/1: ./kodim09.png
File 1/1: ./kodim18.png
File 1/1: ./kodim10.png
File 1/1: ./kodim04.png

****************************************************************************** 
                 ___           ___           ___           ___       ___      
     ___        /\__\         /\  \         /\__\         /\__\     /\  \     
    /\  \      /::|  |       /::\  \       /:/  /        /:/  /     \:\  \    
    \:\  \    /:|:|  |      /:/\:\  \     /:/  /        /:/  /       \:\  \   
    /::\__\  /:/|:|__|__   /:/  \:\  \   /:/  /  ___   /:/  /        /::\  \  
 __/:/\/__/ /:/ |::::\__\ /:/__/_\:\__\ /:/__/  /\__\ /:/__/        /:/\:\__\ 
/\/:/  /    \/__/--/:/  / \:\  /\ \/__/ \:\  \ /:/  / \:\  \       /:/  \/__/ 
\::/__/           /:/  /   \:\ \:\__\    \:\  /:/  /   \:\  \     /:/  /      
 \:\__\          /:/  /     \:\/:/  /     \:\/:/  /     \:\  \    \/__/       
  \/__/         /:/  /       \::/  /       \::/  /       \:\__\               
                \/__/         \/__/         \/__/         \/__/               

****************************************************************************** 

* Execute parametric cleaning sequence: * 
removed ‘imgult-files.txt’

* The imgult has completed. Take care. * 
******************************************************************************

ryanpcmcquen commented 8 years ago

Would you try the master? I borked something, should work now. Also, if this does work we can test comm again. :+1:

Torkiliuz commented 8 years ago

The hotfix worked. I tried to change grep to comm, but that did not work :smile:

ryanpcmcquen commented 8 years ago

How about this version? (with comm):

http://sprunge.us/JIZU

Torkiliuz commented 8 years ago

I have tested comm by creating two text files and listing some similar and different elements in them, and comm shows the result correctly.
But for some reason it doesn't work with this :disappointed:
I'll do a manual test and write paths in the text files; I'm pretty sure that's where the problem occurs.

Torkiliuz commented 8 years ago

When I do the manual steps it works, so for some reason it stops working when run from a script. Weird :confused:

ryanpcmcquen commented 8 years ago

This version works here:

https://raw.githubusercontent.com/ryanpcmcquen/image-ultimator/3abf3d9e04b8ac9fdd90e6ff05abb10441b4a119/imgult

Torkiliuz commented 8 years ago

Weird, it runs, but on the second run it doesn't skip any of the files :confused:

Torkiliuz commented 8 years ago

comm: file 1 is not in sorted order
comm: file 2 is not in sorted order

I think there might be a sort-function that needs to run first for it to work

ryanpcmcquen commented 8 years ago

Strange. It skips all the files here. What arguments are you sending to comm? Keep in mind that on non-Linux systems comm does not have the --nocheck-order option, which may make using comm a showstopper.

Torkiliuz commented 8 years ago

It's just running comm -13 ${IMGULT_PROCESSED_FILES_LIST} ${IMGULT_FILES_LIST} > ${IMGULT_TEMP_FILES_LIST}

But it ends up with this:

comm: file 1 is not in sorted order
comm: file 2 is not in sorted order
File 1/1: ./kodim15.png
File 1/1: ./kodim11.png
File 1/1: ./kodim10.png
File 1/1: ./kodim13.png
File 1/1: ./kodim05.png
File 1/1: ./kodim19.png
File 1/1: ./kodim14.png
File 1/1: ./kodim07.png
File 1/1: ./kodim08.png
File 1/1: ./kodim04.png
File 1/1: ./kodim12.png
File 1/1: ./kodim20.png
File 1/1: ./kodim06.png
File 1/1: ./kodim09.png
./kodim14.png:
./kodim08.png:
./kodim13.png:
./kodim05.png:
./kodim12.png:
./kodim07.png:
./kodim06.png:
./kodim20.png:
./kodim15.png:
./kodim11.png:
./kodim19.png:
./kodim09.png:
./kodim04.png:
./kodim10.png:

There might be a way around it by running a sort-function, so that would solve --nocheck-order not being available on other systems. Not sure how much more I can test today, it's getting late, thanks so much for what you've done so far at least :smiley:
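For illustration, a sketch of the sort-first idea, assuming the same two list files as above. comm works like a merge, so both inputs have to be sorted in the same collation order for the result to be reliable; suppressing the order check does not change that. Sorting first also avoids the need for --nocheck-order. The function name is hypothetical and this is not the code from the commTest branch:

```shell
#!/bin/sh
# $1 = processed list, $2 = full list, $3 = output list (files still to do)
unprocessed_with_comm() {
  # Make sure the processed list exists on the very first run.
  touch "$1"
  sorted_done=$(mktemp) || return 1
  sorted_all=$(mktemp) || return 1
  # LC_ALL=C gives sort and comm the same byte-wise ordering.
  LC_ALL=C sort -u "$1" > "$sorted_done"
  LC_ALL=C sort -u "$2" > "$sorted_all"
  # -1 drops lines unique to the processed list, -3 drops common lines,
  # leaving only paths that are in the full list but not yet processed.
  LC_ALL=C comm -13 "$sorted_done" "$sorted_all" > "$3"
  status=$?
  rm -f "$sorted_done" "$sorted_all"
  return "$status"
}
```

One consequence of this approach is that the output list comes out in sorted order rather than in the order the files were found, and the two extra sort passes are part of the cost when comparing it against the grep one-liner.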

ryanpcmcquen commented 8 years ago

By the time we write a sort function, we probably will not save any time over just using the grep one-liner we have now. If you do find it is still faster, I would be happy to incorporate the change.

I will keep the commTest branch open for now.

Torkiliuz commented 8 years ago

Ubuntu 14.04 set up like the following runs 4.0.01 on the whole drive without making the server crash:

jpegoptim v1.3.0 x86_64-pc-linux-gnu (from normal repositories)
mozjpeg version 3.1 (build 20150904) (from here)
OptiPNG 0.6.4: Advanced PNG optimizer. (from normal repositories)
pngquant 2.3.0 (July 2014) (from here)
LCDF Gifsicle 1.78 (from normal repositories)
exiv2 0.23 001700 (64 bit build) (from normal repositories)
svgo 0.6.2 (sudo npm install -g svgo)

ryanpcmcquen commented 8 years ago

That's amazing!!! Thank you so much for your help. Does that mean we can close this issue?

Torkiliuz commented 8 years ago

Yes, the issue can be closed. It might be a good idea to write the versions you need as requirements though :smile: As it is now it's kind of confusing because the default versions in at least Ubuntu are not the best ones :stuck_out_tongue_winking_eye:
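Alongside a written requirements list, a small check like the following could warn about tools that are missing entirely. This is only a sketch: the function name is hypothetical, it checks presence on PATH rather than version numbers, and the command names passed to it are assumptions:

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" > /dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, missing_tools jpegoptim optipng pngquant gifsicle exiv2 svgo would print one line per tool that is not installed, assuming each command is named after its tool. mozjpeg is left out of that example because the name of its executable depends on how it was installed.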

ryanpcmcquen commented 8 years ago

That is a good idea! I use Slackware so I have more current versions of all the tools ... luckily when Ubuntu 16.04 gets released people will have much newer versions of everything by default.