aszc-dev / ComfyUI-CoreMLSuite

A set of custom nodes for ComfyUI that allow you to use Core ML models in your ComfyUI workflows.
GNU General Public License v3.0

CoreML ANE vs MPS? #15

Closed: cchance27 closed this 12 months ago

cchance27 commented 1 year ago

Just wondering, have you run any comparisons between running on MPS (Metal) and using the ANE for inference?

aszc-dev commented 1 year ago

These are great questions. ANE is faster than MPS. The gap doesn't look like much next to most CUDA builds, but it is significant: 50-100% faster in my experience. It depends on the machine, and I would be happy to see other people's results. Speaking of which, some benchmark data in the README could be useful.

I can't remember the exact details, but I think Core ML on the GPU is marginally faster than MPS, though that could also depend on the machine. CPU+GPU+ANE is supported through the ALL option. As for GPU+ANE, I'm not really sure, but every option includes the CPU, so perhaps you can't run ANE+GPU without it. BTW, in this particular use case, CPU_AND_ANE seems to be much faster than ALL.

cchance27 commented 1 year ago

I was sitting here thinking, and I wonder if ALL being slightly slower might be a memory bandwidth issue depending on the GPU. For instance, my M3 Pro has 150 GB/s of memory bandwidth, down from the previous M2 Pro; the only ones that get really big are the Max chips.

Ya, benchmarks in the README would be nice for the various options.
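For anyone who wants to collect comparable numbers, here is a minimal sketch of a timing harness. It is not part of ComfyUI-CoreMLSuite: `iterations_per_second` and `fake_step` are hypothetical names, the workload is a stand-in for one sampling step, and the keys in `configs` are plain labels taken from this thread rather than API calls.

```python
import time
from statistics import median


def iterations_per_second(step, warmup=2, iters=10):
    """Call `step` repeatedly and return throughput in iterations per second."""
    # Warm-up runs are discarded, since first calls often include one-time setup cost.
    for _ in range(warmup):
        step()
    durations = []
    for _ in range(iters):
        start = time.perf_counter()
        step()
        durations.append(time.perf_counter() - start)
    # The median is less sensitive to a single slow outlier than the mean.
    return 1.0 / median(durations)


def fake_step():
    # Hypothetical stand-in workload; replace with one real sampling step.
    sum(i * i for i in range(50_000))


# Labels only: each entry would wrap the same model loaded with a different option.
configs = {"CPU_AND_ANE": fake_step, "GPU": fake_step, "ALL": fake_step}

table = {name: iterations_per_second(fn) for name, fn in configs.items()}
for name, ips in sorted(table.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {ips:8.2f} it/s")
```

Running every option at the same resolution, step count, and model keeps the rows comparable.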

cchance27 commented 1 year ago

Tested on an M3 MBP (32 GB) with a 512x512 SD1.5 model, using the converter method, rendering from a safetensors checkpoint.

- Getting ~2.55 on ANE with SPLIT_EINSUM (SPLIT_EINSUM_V2 seems basically the same)
- Getting ~2.32 on GPU with ORIGINAL
- Getting ~1.48 on ALL with SPLIT_EINSUM
- Getting ~2.06 on ALL with ORIGINAL

So ya, ANE is best with SPLIT_EINSUM. However, I noticed something odd when I bump the resolution to 768x512:

- 1.04 on ANE with SPLIT_EINSUM
- 1.32 on GPU with ORIGINAL

So for the smaller image we're seeing ANE at ~10% faster, but at the larger size the GPU is suddenly ~27% faster (1.32 vs 1.04). Both are also roughly 2x slower overall for the 50% larger image.
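The ratios behind that comparison can be checked with a few lines of Python. This is only a sketch: it assumes the quoted figures are throughput (higher is faster), which the comment implies but does not state, and `percent_faster` is a helper name made up for illustration.

```python
# Figures quoted in this comment; assumed to be throughput (higher is faster).
results = {
    (512, 512): {"ANE": 2.55, "GPU": 2.32},
    (768, 512): {"ANE": 1.04, "GPU": 1.32},
}


def percent_faster(a, b):
    """Return how much faster throughput `a` is than throughput `b`, in percent."""
    return (a / b - 1.0) * 100.0


base = results[(512, 512)]
for (w, h), r in results.items():
    # Order the two compute units by throughput, fastest first.
    winner, loser = sorted(r, key=r.get, reverse=True)
    gap = percent_faster(r[winner], r[loser])
    pixels = (w * h) / (512 * 512)
    print(f"{w}x{h}: {winner} is {gap:.1f}% faster than {loser} ({pixels:.2f}x the pixels)")
    for unit in r:
        # Slowdown of each unit relative to its own 512x512 result.
        print(f"  {unit}: {base[unit] / r[unit]:.2f}x slower than at 512x512")
```

The slowdown is not the same for both units, so the crossover comes from the ANE losing throughput faster than the GPU as the image grows.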

cchance27 commented 12 months ago

Already clarified; will close this off since it was discussed and you're working on benchmarks elsewhere (#18).