spammeduh opened 1 month ago
Is it possible to use the bf16 version? https://huggingface.co/ybelkada/flan-t5-xl-sharded-bf16/tree/main
ELLA only need Flan-T5 XL Encoder(fp16), ~2.8G
Is this available for download anywhere? Edit: https://huggingface.co/limcheekin/flan-t5-xl-ct2/tree/main Is this it? The size matches, but it doesn't seem to work with this node.
Couldn't find an encoder-only version, so I made one myself: https://huggingface.co/Kijai/flan-t5-xl-encoder-only-bf16/tree/main

With the original: (screenshot)
With the pruned version: (screenshot)

Saves quite a bit of space indeed.
@kijai your node seems to work better than this node, but seems less native to ComfyUI. Why does your node work better and am I correct in my assessment that this node is more Comfy-ish in its implementation?
Mine is just a wrapper for the original code, so it's not compatible with anything in Comfy, while this node here only creates the embeds and passes them to Comfy just like CLIP Text Encode would, making it native.
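For anyone curious what "native" means here, this is a hedged sketch (not the node's actual code) of the convention a ComfyUI text-encode node follows: it returns conditioning as a list of `[embedding, options]` pairs, the same shape of output CLIP Text Encode produces, so downstream samplers can consume it directly:

```python
import torch

# Hedged sketch of a "native" ComfyUI text-encode output. The function name
# and the empty options dict are illustrative assumptions; the point is the
# [[tensor, dict]] conditioning structure that samplers expect.
def encode_native(embeds: torch.Tensor):
    # ComfyUI conditioning: a list of [embedding_tensor, options_dict] pairs
    return [[embeds, {}]]
```

A wrapper node, by contrast, keeps the upstream pipeline's own objects, which nothing else in Comfy knows how to consume.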
I too have noticed this isn't working like the original, I think something in how ComfyUI handles the conditioning is causing the difference.
@kijai Thanks for creating the smaller model. Some people's cards (mine included) don't support bf16, so could you also do fp16 and/or briefly explain how you did it? I'm actually quite curious about the process used.
I first tried to reproduce your bf16 version with the code I whipped up below, but the file size doesn't match.
import torch

# Load both shards of the original checkpoint; CPU is enough for a one-off
# conversion and avoids tying up VRAM.
part1 = torch.load('pytorch_model-00001-of-00002.bin', map_location='cpu')
part2 = torch.load('pytorch_model-00002-of-00002.bin', map_location='cpu')
combined_model = {**part1, **part2}

# Keep only the encoder weights, then cast them to bf16 and re-save.
encoder_blocks = {key: val for key, val in combined_model.items() if key.startswith('encoder.')}
encoder_blocks_bf16 = {key: val.bfloat16() for key, val in encoder_blocks.items()}
torch.save(encoder_blocks_bf16, 'model_bf16.bin')
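For the fp16 version asked about above, an untested sketch would just swap the cast in the same script; `encoder_blocks` here stands in for the filtered encoder-only dict from the code above:

```python
import torch

# Hedged fp16 variant of the bf16 conversion above: half-precision floats
# work on cards that lack native bf16 support.
def cast_fp16(encoder_blocks):
    return {key: val.half() for key, val in encoder_blocks.items()}
```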
I was under the impression it would still cast the weights to fp16 for inference. Did you try it and find it didn't work?
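The cast-at-load idea is simple enough to sketch; this is an assumption about how it could be done, not the node's actual code:

```python
import torch

# Hypothetical sketch: a bf16 checkpoint can be cast to fp16 as it is
# loaded, for GPUs without native bf16 support.
def cast_state_dict_fp16(state_dict):
    return {k: v.to(torch.float16) for k, v in state_dict.items()}
```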
I did it pretty lazily, I just inserted this saving in the original transformers loading code:
Then rename that to model.safetensors and include the original configs in the repo.
@kijai Thanks for the info. And yeah, the bf16 one did end up working, which surprised me because other bf16 models have failed for me before.
There are multiple 9GB+ files in that repo, and they take up a good chunk of space on my hard drive; would keeping just the safetensors file be enough?