Closed RossM closed 8 months ago
Fixing a bug in the EXIF handling lost most of the performance gain. I'll look for a better way to do this.
I'm now loading the image size directly from the cached latents, rather than needing to open the original file. This reduces preprocessing for a particular large dataset from almost 2 hours to under 15 minutes.
This required loading the cached latents earlier in the process and plumbing that through. I tried to keep it as clean as I could, but the result is still pretty ugly. Please let me know if you'd like any refactoring.
Describe your changes
Pillow (PIL) supports lazy loading of images, where the actual image data is only loaded when needed. Currently `image_utils.get_dim()` loads the image data to handle EXIF rotations. There's no need to do that, we can just swap width and height if necessary. This gives a ~4x speedup in preload when the latents are already cached.
Issue ticket number and link (if applicable)
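For reference, here is a minimal sketch of the width/height swap described under "Describe your changes". It is not the code in this PR: `swap_if_rotated` and `get_dim_lazy` are hypothetical names, and the sketch assumes the image carries a standard EXIF orientation tag (0x0112), where values 5 through 8 mean the displayed image is rotated by 90 or 270 degrees.

```python
# EXIF orientation tag ID, and the values that imply a 90/270 degree rotation
# (so the displayed width and height are swapped relative to the stored ones).
ORIENTATION_TAG = 0x0112
ROTATED_ORIENTATIONS = {5, 6, 7, 8}


def swap_if_rotated(width, height, orientation):
    """Return (width, height) as displayed, given the stored size and EXIF orientation."""
    if orientation in ROTATED_ORIENTATIONS:
        return height, width
    return width, height


def get_dim_lazy(path):
    """Hypothetical stand-in for image_utils.get_dim(); reads size without decoding pixels."""
    # Imported here so the pure helper above stays usable without Pillow installed.
    from PIL import Image

    # Image.open() is lazy: it parses the header and metadata only.
    with Image.open(path) as img:
        width, height = img.size
        # A missing tag is treated as orientation 1 (no rotation).
        orientation = img.getexif().get(ORIENTATION_TAG, 1)
    return swap_if_rotated(width, height, orientation)
```

The point of the sketch is that neither `img.size` nor `img.getexif()` calls `img.load()`, so the pixel data should never need to be decoded just to get the dimensions.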
Checklist before requesting a review