Malik-Hacini opened this issue 3 months ago
Thanks for reporting this issue! Are you trying to run inference on CPU by any chance?
I have never run any model on my machine before, and I haven't touched any settings.
Do you have a GPU on your machine? If yes, what is the output of `import torch; print(torch.cuda.is_available())`? The webui always tries to load models in half precision to save some memory, but this is not supported for CPU inference. If you remove the following line, it should also work on a CPU:
But I would expect this to be prohibitively slow. If it is usable, however, I am more than happy to implement CPU support.
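For reference, a minimal sketch of the precision choice being described here (this is illustrative only; the actual line in the detikzify webui code may look different):

```python
import torch

# Sketch of the idea being discussed (not the actual webui code):
# use half precision only when a CUDA device is available, since ops such as
# LayerNorm have no float16 kernel on CPU in many PyTorch builds.
dtype = torch.float16 if torch.cuda.is_available() else torch.float32
print(f"CUDA available: {torch.cuda.is_available()}, loading dtype: {dtype}")
```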
I do have an RTX 2050 on my machine (a laptop). The output of torch.cuda.is_available() is False. What should I do to get it to return True?
Alright, I fixed my PyTorch with CUDA installation. I have another issue, should I keep going here or create another one?
Feel free to create a new one. Be aware, however, that the VRAM of your RTX 2050 will likely not suffice to run the 7b models; for those you would need about 20 GB.
When running the tool from the webui, it crashes after loading the shard.
Traceback:
  File "C:\Users\juioi\Desktop\Detikzify.venv\Lib\site-packages\torch\nn\functional.py", line 2546, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'