jdc4429 opened 1 month ago
For some reason it's now crashing even when running SDXL through the API. I have downgraded torch back to 2.3.1 and xformers to 0.0.27, but it's still crashing. It always seems to crash after `Requested to load AutoencoderKL`.
Error:

```
model_type EPS
Using xformers attention in VAE
Using xformers attention in VAE
Requested to load SDXLClipModel
Loading 1 new model
Requested to load SDXL
Loading 1 new model
100%|██████████| 8/8 [00:21<00:00,  2.71s/it]
Requested to load AutoencoderKL
Loading 1 new model
[START] Security scan
[DONE] Security scan
```
I have seen two other people on Reddit report the exact same issue while attempting to get Flux to work from the API.
https://www.reddit.com/r/comfyui/comments/1euaz54/comfyui_just_crashes_without_any_error_after_got/
The issue happens on both torch 2.3.1 and torch 2.4.0, and it happens whether or not I use xformers.
Can someone please look at this? It's gotten so bad after updating that I can't even run SDXL without it constantly crashing when going through the API. I tried reinstalling from scratch with no extra nodes and it's still crashing almost every time! There is no error message; it's like it's segfaulting.
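For context, this is roughly how I'm calling the API. A minimal sketch only: the host/port are ComfyUI's defaults and `workflow_api.json` is a placeholder for the API-format export from the web UI's "Save (API Format)" option, not my actual file name.

```python
# Minimal sketch of submitting an API-format workflow to ComfyUI's /prompt
# endpoint. Assumptions: default host/port; the workflow dict comes from the
# web UI's "Save (API Format)" export.
import json
import urllib.request

def build_payload(workflow: dict) -> bytes:
    """Wrap an API-format workflow the way /prompt expects it."""
    return json.dumps({"prompt": workflow}).encode("utf-8")

def submit_prompt(workflow: dict, host: str = "127.0.0.1:8188") -> bytes:
    """POST the workflow to ComfyUI's /prompt endpoint and return the raw response."""
    req = urllib.request.Request(
        f"http://{host}/prompt",
        data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

The crash happens after the prompt is accepted, during execution, so the submission side itself seems fine.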
Turned this off (the shared memory setting) in case it was affecting things. Flux is still crashing...
```
got prompt
model weight dtype torch.float8_e4m3fn, manual cast: torch.float16
model_type FLOW
Using pytorch attention in VAE
Using pytorch attention in VAE
Requested to load FluxClipModel
Loading 1 new model
loaded partially 5873.075 5852.23095703125 0
Unloading models for lowram load. 0 models unloaded.
Requested to load Flux
Loading 1 new model
loaded partially 5873.07451171875 5866.974670410156 0
  0%|          | 0/8 [00:00<?, ?it/s]
[START] Security scan
[DONE] Security scan
```
Still crashing... Getting this error now since the update:

```
C:\inetpub\wwwroot\ComfyUI\python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py:608: UserWarning: Should have tb<=t1 but got tb=14.614640235900879 and t1=14.61464.
  warnings.warn(f"Should have {tb_name}<=t1 but got {tb_name}={tb} and t1={self._end}.")
```
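Side note on that torchsde warning: it is widely reported as harmless and is unlikely to be the crash. It fires because the strict `tb <= t1` check compares a full-precision query time against an interval end that has lost its low-order digits. A toy illustration; the `round()` here is my assumption to mimic the logged digits, not torchsde's actual code path:

```python
# Toy illustration of why a strict tb <= t1 check can trip on rounding alone.
# The round() is an assumption to reproduce the logged values (14.614640235...
# vs 14.61464); torchsde's real code path differs, but the failure mode is
# the same kind of precision mismatch.
tb = 14.614640235900879   # query time at full float64 precision (from the log)
t1 = round(tb, 5)         # interval end with low-order digits lost -> 14.61464

print(tb - t1)            # tiny positive difference, on the order of 2e-7
print(tb <= t1)           # False: the strict check fails despite "equal" times
```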
For the record, I have a P40 (24 GB) as cuda0 and an RTX 2070 (8 GB) as cuda1. I have tried 0.0.4 through 0.0.8 and they all have the same issue of segfaulting, seemingly from memory issues. I have installed from scratch 3 times now. I have tried with and without `--lowvram`. I have to use `--disable-cuda-malloc` for the P40 to work. Another thing I noticed: when it succeeds from the web interface, the memory allocation is something like `loaded completely 11810.32939453125 6462.797119140625`, but when I go through the API both numbers are only around 5K. They should not be different!
After the latest update it seems only generating 1080p and 2160p images (and Flux via the API) causes a crash. I have 24 GB and 8 GB of VRAM, so it should not be an issue, and SDXL was working fine generating up to 2160p until the Flux stuff started to get added. 720p and 1440p are working and so far do not appear to crash ComfyUI.
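Quick arithmetic on those four resolutions, for reference: both the SDXL and Flux VAEs compress 8x spatially, so the latent is (W/8, H/8). Notably, 1440p is more pixels than 1080p yet works, which supports the point that raw size alone shouldn't be the problem:

```python
# Pixel counts and latent sizes for the resolutions discussed above
# (8x spatial compression holds for both the SDXL and Flux VAEs).
resolutions = {
    "720p":  (1280, 720),    # works
    "1080p": (1920, 1080),   # crashes
    "1440p": (2560, 1440),   # works, despite being larger than 1080p
    "2160p": (3840, 2160),   # crashes
}
for name, (w, h) in resolutions.items():
    print(f"{name}: {w}x{h} = {w * h / 1e6:.2f} MP, latent {w // 8}x{h // 8}")
```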
Still crashing with the latest update when attempting to run SDXL Lightning at 1920x1080. No error displayed. 8 GB and 24 GB cards.
```
Loading 1 new model
loaded completely 0.0 4897.0483474731445 True
100%|██████████| 8/8 [00:19<00:00,  2.45s/it]
Unloading models for lowram load. 1 models unloaded.
Loading 1 new model
loaded completely 0.0 319.11416244506836 True
[START] Security scan
```
Still crashing after the latest update, which says the memory issues are fixed. They are not. I still can't even do SDXL at 1920x1080; it still crashes without an error, which I suspect is memory related. Same memory errors for all releases from 0.0.4 to 0.1.2.
Still crashing in latest updates.. Updated 8/29...
Still crashing in latest updates.. 0.2.0
Still unable to run SDXL at 1920x1080 resolution. This used to work before Flux was introduced! Please fix! Still crashing with no error!
Tried updating to 0.2.2 with a fresh install. Still NOT working correctly! Still crashing with no error message when attempting to do SDXL at 1920x1080 or Flux through the API! Also, the ComfyUI-Inspyrenet-Rembg custom node is no longer working in 0.2.2.
Expected Behavior
Image created
Actual Behavior
Error:

```
model weight dtype torch.float8_e4m3fn, manual cast: torch.float16
model_type FLOW
Using xformers attention in VAE
Using xformers attention in VAE
Requested to load FluxClipModel
Loading 1 new model
Requested to load Flux
Loading 1 new model
100%|██████████| 6/6 [01:32<00:00, 15.39s/it]
Using xformers attention in VAE
Using xformers attention in VAE
Requested to load AutoencoderKL
Loading 1 new model
!!! Exception during processing!!! Given groups=1, weight of size [4, 4, 1, 1], expected input[1, 16, 160, 90] to have 4 channels, but got 16 channels instead
Traceback (most recent call last):
  File "C:\inetpub\wwwroot\ComfyUI\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "C:\inetpub\wwwroot\ComfyUI\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "C:\inetpub\wwwroot\ComfyUI\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "C:\inetpub\wwwroot\ComfyUI\ComfyUI\nodes.py", line 270, in decode
    return (vae.decode(samples["samples"]), )
  File "C:\inetpub\wwwroot\ComfyUI\ComfyUI\comfy\sd.py", line 322, in decode
    pixel_samples[x:x+batch_number] = self.process_output(self.first_stage_model.decode(samples).to(self.output_device).float())
  File "C:\inetpub\wwwroot\ComfyUI\ComfyUI\comfy\ldm\models\autoencoder.py", line 199, in decode
    dec = self.post_quant_conv(z)
  File "C:\inetpub\wwwroot\ComfyUI\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\inetpub\wwwroot\ComfyUI\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\inetpub\wwwroot\ComfyUI\ComfyUI\comfy\ops.py", line 93, in forward
    return super().forward(*args, **kwargs)
  File "C:\inetpub\wwwroot\ComfyUI\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 458, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\inetpub\wwwroot\ComfyUI\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 454, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [4, 4, 1, 1], expected input[1, 16, 160, 90] to have 4 channels, but got 16 channels instead
```
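The shapes in that error are telling: `weight of size [4, 4, 1, 1]` is the 1x1 `post_quant_conv` of an SD/SDXL-style VAE (4 latent channels), while `input[1, 16, 160, 90]` is a 16-channel Flux-style latent. A minimal reproduction of the same RuntimeError in plain PyTorch (not ComfyUI code, just the shape mismatch in isolation):

```python
# Minimal reproduction: feed a 16-channel (Flux-style) latent into a 1x1 conv
# built for 4 latent channels (an SD/SDXL-style post_quant_conv).
import torch
import torch.nn as nn

post_quant_conv = nn.Conv2d(4, 4, kernel_size=1)   # weight shape [4, 4, 1, 1]
flux_latent = torch.randn(1, 16, 160, 90)          # 16-channel Flux latent

try:
    post_quant_conv(flux_latent)
except RuntimeError as e:
    print(e)   # same "expected input ... to have 4 channels, but got 16 channels" error
```

So the decode step is applying a 4-channel VAE to a 16-channel latent, which points at the wrong VAE being selected rather than at memory.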
Prompt executed in 194.95 seconds
Steps to Reproduce
Run the workflow: `Flux full checkpoint API test.json`
Debug Logs
Other
To confirm: I can run this workflow myself in the web interface without any issues, but when I copy the workflow to the API I get the above error. Somehow the VAE does not appear to match: the latent has 16 channels but the VAE expects 4. I don't understand how this works in the web interface but not through the API.
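If it helps anyone debug: in an API-format workflow, the VAE used by `VAEDecode` is whatever node its `vae` input links to, so a quick trace of the JSON shows whether the decode is wired to the checkpoint's built-in VAE or to a separately loaded one. A sketch; the workflow dict below is a hypothetical minimal example, not my actual file:

```python
# Trace which node feeds each VAEDecode's "vae" input in an API-format
# workflow. In API format, a linked input is a [node_id, output_index] pair.
# The example dict is hypothetical, illustrating the suspected mismatch:
# a Flux checkpoint sampled into a latent, but decoded via a separate
# 4-channel SDXL VAELoader.

def vae_sources(prompt: dict) -> dict:
    """Map each VAEDecode node id to the class_type of the node feeding its vae input."""
    sources = {}
    for node_id, node in prompt.items():
        if node.get("class_type") == "VAEDecode":
            src_id, _output_index = node["inputs"]["vae"]
            sources[node_id] = prompt[src_id]["class_type"]
    return sources

example = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "flux_checkpoint.safetensors"}},
    "2": {"class_type": "VAELoader",
          "inputs": {"vae_name": "sdxl_vae.safetensors"}},   # 4-channel VAE
    "3": {"class_type": "VAEDecode",
          "inputs": {"samples": ["4", 0], "vae": ["2", 0]}}, # decodes a Flux latent
}
print(vae_sources(example))   # -> {'3': 'VAELoader'}: decode bypasses the checkpoint's VAE
```

If the API export shows `VAEDecode` linked to a `VAELoader` holding an SDXL VAE while the sampler runs Flux, that would produce exactly the 16-vs-4 channel error above.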