xperseguers / t3ext-image_autoresize

TYPO3 Extension image_autoresize. Simplify the way your editors may upload their images.
https://extensions.typo3.org/extension/image_autoresize
GNU General Public License v3.0

Avoid reprocessing the same file over and over again #109

Open dmitryd opened 5 months ago

dmitryd commented 5 months ago

If a file passes the threshold check, it is always reprocessed, and the result is then discarded if the new size is the same or larger. It would save resources if the file were marked in some way to avoid reprocessing. The goal is to avoid running resizes on hundreds of files that still exceed the threshold — executing ImageMagick is VERY resource-consuming!

I could imagine implementing this via file metadata: a new field with a hash computed as follows:

sha1(serialize([
    $fileModificationDate,
    $fileSize,
    $imageWidth,
    $imageHeight,
    file_get_contents('....../image_autoresize.config.php'),
]));

When running, load the metadata, compute the hash and compare it with the stored one. If they match, do not attempt to resize the file.
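A minimal sketch of what I mean, assuming a hypothetical metadata field `tx_imageautoresize_hash` (the field name and helper functions are illustrative only, not part of the extension):

```php
<?php
use TYPO3\CMS\Core\Resource\File;

/**
 * Compute a fingerprint of everything that could change the resize result:
 * the file itself (mtime, size, dimensions) and the extension configuration.
 */
function computeResizeHash(File $file, string $configPath): string
{
    return sha1(serialize([
        $file->getModificationTime(),
        $file->getSize(),
        $file->getProperty('width'),
        $file->getProperty('height'),
        file_get_contents($configPath),
    ]));
}

/**
 * Skip the expensive ImageMagick call when the fingerprint is unchanged.
 */
function shouldProcess(File $file, string $configPath): bool
{
    $storedHash = $file->hasProperty('tx_imageautoresize_hash')
        ? (string)$file->getProperty('tx_imageautoresize_hash')
        : '';

    return $storedHash !== computeResizeHash($file, $configPath);
}
```

After a (successful or discarded) resize attempt, the freshly computed hash would be written back to the metadata record, so the next scheduler run can skip the file cheaply.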

Possible question: why not fine-tune the threshold? Answer: because the threshold concerns file size, and file size depends heavily on the content of the image. For example, a 3000x2000px solid-white JPEG can be smaller in bytes than a 1000x600px JPEG of the sea or a city.

What do you think?

If you do not think it is a good idea, would you consider at least adding an event at the beginning of ImageResizer::processFile() to let other extensions decide whether the file should be processed?

xperseguers commented 5 months ago

What is missing from your description is the context in which you think (or figured out) that the file will be "reprocessed again".

> If you do not think it is a good idea, would you consider at least adding an event at the beginning of ImageResizer::processFile() to let other extensions decide whether the file should be processed?

Regardless of whether I think it's a good idea, adding an event is basically no problem for me, as extensibility is very much in the spirit of TYPO3.
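Such an event could follow the usual TYPO3 PSR-14 pattern. A hedged sketch — the class and method names below are hypothetical, not an existing API of this extension:

```php
<?php
/**
 * Hypothetical PSR-14 event dispatched by ImageResizer::processFile()
 * before any expensive work is done. Listeners may veto processing.
 */
final class BeforeFileProcessedEvent
{
    private bool $shouldProcess = true;

    public function __construct(private readonly string $fileName)
    {
    }

    public function getFileName(): string
    {
        return $this->fileName;
    }

    public function setShouldProcess(bool $shouldProcess): void
    {
        $this->shouldProcess = $shouldProcess;
    }

    public function shouldProcess(): bool
    {
        return $this->shouldProcess;
    }
}

// Sketch of usage inside ImageResizer::processFile(), before the
// ImageMagick call (assumes an injected PSR-14 event dispatcher):
//
// $event = $this->eventDispatcher->dispatch(new BeforeFileProcessedEvent($fileName));
// if (!$event->shouldProcess()) {
//     return;
// }
```

A listener registered in another extension could then implement the hash check (or any other heuristic) and call `setShouldProcess(false)` to skip the file.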

dmitryd commented 5 months ago

> What is missing from your description is the context in which you think (or figured out) that the file will be "reprocessed again".

I set the threshold to 50K. I have a file which is 100K in size.

Each time I run the command, this line is executed on the same set of files:

$tempFileInfo = $gifCreator->imageMagickConvert($fileName, $destExtension, '', '', $imParams, '', $options, true);

So the file gets resized with ImageMagick on each scheduler execution, consuming CPU time.

Then there is this check:

        } elseif (!$isRotated && filesize($tempFileInfo[3]) >= $originalFileSize - 10240 && $destExtension === $fileExtension) {
            // Conversion leads to same or bigger file (rounded to 10KB to accommodate tiny variations in compression) => skip!
            @unlink($tempFileInfo[3]);
            $tempFileInfo = null;
        }

And the result of the resizing is discarded. What is interesting: it tries to scale to the same width/height on every scheduler execution, and the result is always discarded because the file was already resized before.

xperseguers commented 5 months ago

OK, so the context is that it runs over and over again for the same files when the scheduler task for batch processing is invoked.

dmitryd commented 5 months ago

Yes. Our editors can upload huge 6000x4000 images, so we need to run the task regularly. Manual execution is not an option because it is a closed system with no console access, and there are a lot of sites like this. Thus, the scheduler runs daily. It would be good to optimize the resizing process :)

xperseguers commented 5 months ago

And just so I understand: why are you not resizing on the fly during upload? Are your editors pushing those files via FTP or some external system?

dmitryd commented 5 months ago

We already have many sites with a lot of files; the problem has existed for quite some time.

We could run the job once and rely on the resizing on the fly for new files. Is this what you suggest? 🤔

xperseguers commented 5 months ago

Yes, this is what I suggest. In my experience, unless the GFX part is misconfigured and you don't notice quickly enough, or the resizing itself is misconfigured, resizing on the fly during upload works fine. It makes the upload slightly slower, that's true, but that usually doesn't really bother anyone.

I typically run the batch processing only once, and only if I see that the resizing was not properly configured, or I had GFX problems and my editors uploaded a bunch of huge photos. But I really don't run that task on a daily basis, as uploading is only possible through the Backend (or through a custom Frontend plugin; in that case, if you do it correctly, metadata extraction, resizing and everything else works fine as well).

dmitryd commented 5 months ago

Thank you! We will do it like this. 🙇

Please, feel free to close the ticket (or add an event using this ticket).