pythongosssss / ComfyUI-Custom-Scripts

Enhancements & experiments for ComfyUI, mostly focusing on UI features

image feed deduplication #204

Closed: choey closed this issue 5 months ago

choey commented 6 months ago

(Thank you for your work!)

Problem

Some nodes can generate the same image for consecutive generations, which clutters the image feed with duplicate images:

[screenshot: the image feed showing duplicate images]

There are two categories of duplicate images to consider here:

  1. Two images point to the same file on the server.
  2. Two images look the same but are stored as different files on the server, potentially with unique metadata.

Solution

  1. Memoize each image's src metadata (filename/type/subfolder) and a hash of its pixel data (computed sans metadata).
  2. Prevent addition to the feed if we've seen either the metadata or the hash before (see the sketch below).
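
In pseudocode, the dedup check looks roughly like this (a minimal sketch with illustrative names such as shouldAddToFeed, not the PR's actual code):

// Two memo tables: one for server-side identity, one for pixel identity.
const seenMeta = new Set();   // "filename|type|subfolder" keys
const seenHashes = new Set(); // hashes of the decoded pixel data

function shouldAddToFeed(src, pixelHash) {
    const metaKey = `${src.filename}|${src.type}|${src.subfolder}`;
    if (seenMeta.has(metaKey) || seenHashes.has(pixelHash)) {
        return false; // duplicate: same file on disk, or same-looking pixels
    }
    seenMeta.add(metaKey);
    seenHashes.add(pixelHash);
    return true;
}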

Performance

[screenshot: performance measurements] (tested on a Ryzen 9 3950X w/ 64 GB RAM, running on a local instance of Comfy)

~69% of the performance hit comes from the additional draw onto an invisible canvas, which is needed to get the image data without its metadata. For workflows that yield duplicates, this cost is offset by the savings from not adding duplicate images to the feed (each of which would require its own canvas render). An optimization idea for the future is to strip the metadata chunks from the original image bytes instead (ref: PNG eXIf spec), which should be a faster operation.
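
For illustration, byte-level metadata skipping could look like the sketch below, which walks the PNG chunk layout (an 8-byte signature followed by length/type/data/CRC chunks) and yields only the IDAT pixel-data chunks for hashing. This is a hedged sketch, not code from this PR; buf is assumed to be an ArrayBuffer containing the fetched PNG.

// Walk the PNG chunk structure and yield only IDAT (pixel-data) chunks,
// skipping ancillary metadata chunks such as tEXt and eXIf.
// Assumption: buf is an ArrayBuffer holding the raw PNG bytes.
function* idatChunks(buf) {
    const view = new DataView(buf);
    let offset = 8; // skip the 8-byte PNG signature
    while (offset + 8 <= view.byteLength) {
        const length = view.getUint32(offset); // chunk lengths are big-endian
        const type = String.fromCharCode(
            view.getUint8(offset + 4), view.getUint8(offset + 5),
            view.getUint8(offset + 6), view.getUint8(offset + 7));
        if (type === "IDAT") {
            yield new Uint8Array(buf, offset + 8, length);
        }
        offset += 12 + length; // 4 (length) + 4 (type) + data + 4 (CRC)
    }
}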

To opt in to this feature, turn it on in settings (off by default): [screenshot: the settings toggle]

pythongosssss commented 6 months ago

Hey, thanks for this. It does seem like quite a hit for removing duplicates (I know you've made it an option!). Would it solve your problem to instead have an option to exclude certain nodes from outputting images to the feed?

choey commented 6 months ago

Hi, thanks for the response. Excluding specific node instances (as opposed to node types) would be a good alternative, though ideally it wouldn't require user interaction every time the workflow changes in a way that produces duplicate outputs.

Take this, for example: [screenshot: an example workflow with two KSampler nodes and a preview node]

Today, this results in a duplicate image because the preview node and the corresponding KSampler node output the same image. It would be nice to be able to disable the image feed for the second KSampler (but not the first). However, once I add another preview node for the first KSampler, I'd need to exclude yet another node (either the first KSampler or its preview).

What do you think about the optimization of stripping the metadata by modifying the image bytes directly, instead of re-drawing the image to achieve the same result? I'll give that a try if you think it would make the automated deduplication more reasonable in terms of performance.

pythongosssss commented 6 months ago

Looks like if you run your hash on the ImageData instead of the base64 representation, it is significantly faster:

[screenshot: timing comparison between the two hashing approaches]

This is on a 512x512 then 2048x2048 set of images

// Time an FNV-style hash over the raw RGBA bytes from the canvas.
// imgContext is a 2D canvas context with the image already drawn onto it.
const start = performance.now();
const data = imgContext.getImageData(0, 0, img.width, img.height);
const p = 16777619;    // FNV prime (32-bit)
let hash = 2166136261; // FNV offset basis (32-bit)
for (const b of data.data) {
    // Note: plain JS multiplication overflows 32 bits here, so the result
    // is not canonical FNV-1a (hence the oversized hashes logged below).
    hash = (hash ^ b) * p;
}
console.log(`fast hashing took ${performance.now() - start} milliseconds.`);

This is using an FNV hash, but I'm not well versed in what the best hashing algorithm would be for this.
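
For reference, a 32-bit-safe FNV-1a variant would use Math.imul to keep the multiplication within 32 bits; a minimal sketch for comparison, not code from this thread:

// Canonical 32-bit FNV-1a: Math.imul keeps the multiply in 32 bits,
// unlike plain JS multiplication, which loses precision past 2^53.
function fnv1a32(bytes) {
    let hash = 0x811c9dc5; // FNV offset basis (2166136261)
    for (const b of bytes) {
        hash = Math.imul(hash ^ b, 0x01000193); // FNV prime (16777619)
    }
    return hash >>> 0; // coerce to an unsigned 32-bit integer
}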

You could also change the setting to something like disabled/performance/accuracy, and for the performance mode, downscale the image before hashing?
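
A downscale-before-hash pass might look like this (a sketch assuming img is an already-loaded HTMLImageElement; names are illustrative):

// Draw into a quarter-size offscreen canvas, then hash the smaller buffer.
const scale = 0.25;
const canvas = document.createElement("canvas");
canvas.width = Math.max(1, Math.round(img.width * scale));
canvas.height = Math.max(1, Math.round(img.height * scale));
const ctx = canvas.getContext("2d");
ctx.drawImage(img, 0, 0, canvas.width, canvas.height);
const small = ctx.getImageData(0, 0, canvas.width, canvas.height);
// ...then run the FNV loop above over small.data (16x fewer bytes at 25%).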

choey commented 6 months ago

Great suggestions. Looks like avoiding the b64 conversion does indeed significantly improve performance:

... (1024x1024) fetching image took 31 milliseconds 
... (1024x1024) drawing took 25 milliseconds 
... (1024x1024) getting data took 5 milliseconds 
... (1024x1024) fast hashing took 38 milliseconds: 504217953444 
... (1024x1024) slow hashing took 201 milliseconds: 806040840 

... (1024x1024) fetching image took 330 milliseconds 
... (1024x1024) drawing took 23 milliseconds 
... (1024x1024) getting data took 5 milliseconds 
... (1024x1024) fast hashing took 35 milliseconds: 1914780944201 
... (1024x1024) slow hashing took 204 milliseconds: 1347634566 

... (2048x2048) fetching image took 705 milliseconds 
... (2048x2048) drawing took 82 milliseconds 
... (2048x2048) getting data took 29 milliseconds 
... (2048x2048) fast hashing took 131 milliseconds: -607648103031 
... (2048x2048) slow hashing took 884 milliseconds: 28733819 

... (2048x2048) fetching image took 819 milliseconds 
... (2048x2048) drawing took 88 milliseconds 
... (2048x2048) getting data took 27 milliseconds 
... (2048x2048) fast hashing took 128 milliseconds: -4835469135051 
... (2048x2048) slow hashing took 878 milliseconds: 881993976 

Scaling the image down to 25% before hashing improves performance somewhat, but of course, the image fetch is the bottleneck:

... (1024x1024) fetching image took 850 milliseconds 
... (256x256) drawing took 21 milliseconds 
... (256x256) getting data took 4 milliseconds 
... (256x256) fast hashing took 3 milliseconds: 117165645653 
... (256x256) slow hashing took 14 milliseconds: 282783959

... (2048x2048) fetching image took 822 milliseconds 
... (512x512) drawing took 85 milliseconds 
... (512x512) getting data took 12 milliseconds 
... (512x512) fast hashing took 8 milliseconds: -555722103884 
... (512x512) slow hashing took 52 milliseconds: 945832820

However, this is just front-loading/caching the image that the image feed would load anyway, so the time to fetch the image does not factor into the overhead of deduplication.
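
A rough sketch of that flow, fetching and decoding once and reusing the decoded image for the feed (helper names like makeImageUrl, hashDownscaled, and appendToFeed are hypothetical):

// Fetch/decode the image once, hash it, and hand the same decoded
// image to the feed, so the fetch cost is not paid twice.
async function onImageProduced(src) {
    const img = new Image();
    img.src = makeImageUrl(src);      // hypothetical: same URL the feed loads
    await img.decode();               // the expensive fetch+decode, done once
    const hash = hashDownscaled(img); // hypothetical: e.g. the FNV hash above
    if (shouldAddToFeed(src, hash)) { // hypothetical dedup check from earlier
        appendToFeed(img);            // feed reuses the already-decoded image
    }
}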

[screenshot: performance comparison with deduplication disabled, then at 100%, 50%, and 25% scaling]

Does this look good?

choey commented 6 months ago

final review plz @pythongosssss 🙏