henk717 / KoboldAI

KoboldAI is generative AI software optimized for fictional use, but capable of much more!
http://koboldai.com
GNU Affero General Public License v3.0

IPEX Torch 2.1 support and upgrade to Python 3.10 #488

Closed: Disty0 closed this 11 months ago

Disty0 commented 11 months ago

Adds PyTorch 2.1 support for IPEX. Fixes dtype errors when the GPU has 64-bit (FP64) support.
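
For context, a minimal sketch of the kind of FP64 guard this relies on, assuming IPEX's torch.xpu device-properties API; the has_fp64 attribute and the fallback policy shown here are illustrative, not the exact PR code:

    import torch
    import intel_extension_for_pytorch as ipex  # noqa: F401 -- importing IPEX registers the torch.xpu backend

    # Illustrative only: detect whether the XPU device can run FP64 kernels
    # (consumer Arc GPUs generally cannot) and pick a safe fallback dtype.
    # "has_fp64" is an assumed property name here, hence the getattr default.
    props = torch.xpu.get_device_properties(torch.xpu.current_device())
    device_supports_fp64 = getattr(props, "has_fp64", False)
    fallback_dtype = torch.float64 if device_supports_fp64 else torch.float32

    # Tensors that would otherwise be created as float64 can then be cast safely:
    x = torch.ones(4, dtype=torch.float64)
    x = x.to("xpu", dtype=fallback_dtype)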

Also, IPEX should have proper Windows support now, but I don't really know how to write .bat scripts and I don't have a computer with Windows installed, so I can't test it.

Disty0 commented 11 months ago

I have run into random "Failed to create engine" errors with Torch 2.1, so this PR isn't ready yet. Apparently I get these errors with Torch 2.0 too; something else seems to have broken after I deleted the venv.

Disty0 commented 11 months ago

Reverting the bundled-in MKL and DPCPP fixed the "Failed to create engine" errors.

Pygmalion 6B works fine, but now I am getting this error with Llama 2 models. I am not sure if this is IPEX-only or a Torch 2.1 compatibility issue on the Kobold side:

File "/home/disty/Apps/AI/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/generation/logits_process.py", line 920, in _prepare_bias_variables
    raise ValueError(
ValueError: The model vocabulary size is 32000, but the following tokens were being biased: [32000]
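
For reference, this check is easy to reproduce in isolation: Llama 2's vocabulary has 32000 entries, so valid token IDs are 0 to 31999, and biasing ID 32000 (for example an extra pad/EOS token added on the frontend side) trips it. A minimal sketch of the same bounds check, with a hypothetical sequence_bias value:

    # Hypothetical repro of transformers' vocabulary-bounds check for token biasing.
    vocab_size = 32000                  # Llama 2 tokenizer size; valid IDs are 0..31999
    sequence_bias = {(32000,): -100.0}  # biasing an out-of-range ID, as in the error above

    invalid_tokens = [tok for seq in sequence_bias for tok in seq if tok >= vocab_size]
    if invalid_tokens:
        raise ValueError(
            f"The model vocabulary size is {vocab_size}, "
            f"but the following tokens were being biased: {invalid_tokens}"
        )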

I couldn't get the new ipex.optimize_transformers working reliably, so I reverted it. ipex.optimize_transformers throws this error when generating:

File "/home/disty/Apps/AI/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/intel_extension_for_pytorch/transformers/models/xpu/optimize_transformers/modules/opt.py", line 206, in forward
    attention_mask = attention_mask.view(
RuntimeError: shape '[1, 2, 1, 2, 2]' is invalid for input of size 4
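
For reference, a minimal sketch of how the reverted ipex.optimize_transformers pass is typically invoked on an XPU model in IPEX 2.1; the model name, dtype, and keyword arguments below are assumptions for illustration, not the PR's exact code:

    import torch
    import intel_extension_for_pytorch as ipex
    from transformers import AutoModelForCausalLM

    # Hypothetical model choice; KoboldAI loads whatever the user selected.
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
    )
    model = model.eval().to("xpu")

    # IPEX 2.1's transformer-specific optimization pass (the call reverted here);
    # exact keyword arguments may differ between IPEX releases.
    model = ipex.optimize_transformers(model, dtype=torch.float16, device="xpu")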

henk717 commented 11 months ago

In general, this PR will be on hold until we have transitioned the CUDA side. Hopefully that will bring an answer to your question, since you're currently ahead of our own effort.

Disty0 commented 11 months ago

Transformers 4.34 and later seem to be broken with IPEX; 4.33.3 works fine.

henk717 commented 11 months ago

Can this help? https://github.com/intel/intel-extension-for-transformers

Disty0 commented 11 months ago

ITEX is CPU-only for now.