pj4533 opened this issue 1 year ago (status: Open)
Same issue with Stable Diffusion 2.1 base, precompiled from Hugging Face, using SplitEinsum on all compute units. It trace traps even when fed a previously generated image from text2image.
swift run StableDiffusionSample --image photo3.jpg "An infinite blackboard" --resource-path /Users/davidkorcak/Documents/applediffusion/models/coreml-stable-diffusion-2-1-base/split_einsum/compiled --compute-units all --seed 93 --output-path ./
Building for debugging...
Build complete! (0.12s)
Loading resources and creating pipeline
(Note: This can take a while the first time using these resources)
Sampling ...
StableDiffusion/Encoder.swift:96: Fatal error: Unexpectedly found nil while unwrapping an Optional value
[1] 28368 trace trap swift run StableDiffusionSample --image photo3.jpg --resource-path all 93
I stopped pursuing other aspect ratios for now. I just use 512x512 for processing (centering my image on the canvas if it's 9:16), then upscaling the output with a second pass and cropping out the sides as a last step. That works well for image2image video processing, but not so well for generative work, since the output starts to spread outside the 9:16 middle.
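The pad-then-crop workflow described above could be sketched roughly like this with CoreGraphics. These helpers are hypothetical (not part of the StableDiffusion package), and the 512x512 canvas size is just the model's default:

```swift
import CoreGraphics
import Foundation

// Sketch: center a 9:16 frame (e.g. 288x512) on a 512x512 canvas before
// feeding it to the pipeline, then crop the sides back out afterwards.
func padToSquare(_ image: CGImage, side: Int = 512) -> CGImage? {
    guard let ctx = CGContext(
        data: nil,
        width: side, height: side,
        bitsPerComponent: 8, bytesPerRow: 0,
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
    ) else { return nil }
    // Center the source image on the square canvas.
    let x = (side - image.width) / 2
    let y = (side - image.height) / 2
    ctx.draw(image, in: CGRect(x: x, y: y, width: image.width, height: image.height))
    return ctx.makeImage()
}

// Crop the padded sides back out of the generated square result.
func cropSides(_ image: CGImage, toWidth w: Int) -> CGImage? {
    let x = (image.width - w) / 2
    return image.cropping(to: CGRect(x: x, y: 0, width: w, height: image.height))
}
```

The upscaling second pass would then run on the cropped result.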
I tried feeding it back a previously generated 512x512 image, and it still trace traps, so there seems to be an issue in the encoder code. I printed the shape of the encoded image directly and it appears to be correct. It fails at line 96 of Encoder.swift.
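For what it's worth, the fatal error message ("Unexpectedly found nil while unwrapping an Optional value") points at a force-unwrap. A defensive version would throw a descriptive error instead of trace trapping. This is only a sketch of that general pattern, not the actual Encoder.swift source; the feature name and error type are made up:

```swift
import CoreML

// Hypothetical error type for a missing/mismatched model output.
enum EncoderError: Error {
    case missingOutput(String)
}

// Instead of `result.featureValue(for: name)!.multiArrayValue!`,
// fail with an error that names the missing output. A nil here is
// typically a sign the latent shape doesn't match the compiled model.
func latent(from result: MLFeatureProvider, name: String) throws -> MLMultiArray {
    guard let value = result.featureValue(for: name)?.multiArrayValue else {
        throw EncoderError.missingOutput(name)
    }
    return value
}
```

That wouldn't fix the underlying shape mismatch, but it would turn the trace trap into an error the caller can report.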
I converted a model using:
That should give me a model for 320x512 images. I verified that text2image works fine and does give me a 320x512 output image. (I switched to cpuAndGPU because of the original restriction on other sizes.)
However, if I pass a 320x512 startingImage to do image2image, I get a trace trap.
Did I miss something?