Disty0 closed this PR 11 months ago.
I have run into random "Failed to create engine" errors with Torch 2.1. This PR isn't ready yet. Apparently I get these errors with Torch 2.0 too, so something else seems to have broken after I deleted the venv.
Reverting the bundled-in MKL and DPCPP fixed the "Failed to create engine" errors.
Pygmalion 6B works fine, but now I am getting this error with Llama 2 models. I am not sure if it is IPEX-only or a Torch 2.1 compatibility issue on the Kobold side:
File "/home/disty/Apps/AI/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/generation/logits_process.py", line 920, in _prepare_bias_variables
raise ValueError(
ValueError: The model vocabulary size is 32000, but the following tokens were being biased: [32000]
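For context, here is a minimal sketch of why this check fires (hypothetical simplified version, not the actual transformers code): token ids are zero-indexed, so a vocabulary of size 32000 only contains ids 0..31999, and biasing id 32000 is out of range.

```python
# Simplified stand-in for the validation transformers performs in
# _prepare_bias_variables: every biased token id must be a valid
# index into the vocabulary, i.e. 0 <= id < vocab_size.

def check_bias_tokens(vocab_size, biased_token_ids):
    invalid = [t for t in biased_token_ids if t < 0 or t >= vocab_size]
    if invalid:
        raise ValueError(
            f"The model vocabulary size is {vocab_size}, but the "
            f"following tokens were being biased: {invalid}"
        )

check_bias_tokens(32000, [31999])  # highest valid id, passes
try:
    check_bias_tokens(32000, [32000])  # one past the end, raises
except ValueError as e:
    print(e)
```

This usually points at a mismatch between the tokenizer (which may include an added special token with id 32000) and the model's reported vocab size.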
I couldn't get the new ipex.optimize_transformers working reliably, so I reverted it. It throws this error when generating:
File "/home/disty/Apps/AI/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/intel_extension_for_pytorch/transformers/models/xpu/optimize_transformers/modules/opt.py", line 206, in forward
attention_mask = attention_mask.view(
RuntimeError: shape '[1, 2, 1, 2, 2]' is invalid for input of size 4
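The size mismatch itself is simple arithmetic: view can only reinterpret a tensor into a shape with the same total number of elements. A quick sanity check of the numbers from the error (plain Python, no torch needed):

```python
import math

# Shape IPEX tries to view the attention mask into, per the traceback:
target_shape = (1, 2, 1, 2, 2)
elements_needed = math.prod(target_shape)  # 1*2*1*2*2 = 8
elements_available = 4  # "input of size 4" from the error

# torch.Tensor.view raises exactly when these two counts differ.
print(elements_needed, elements_available)
```

So the optimized opt.py path is constructing an attention-mask shape that expects twice as many elements as the mask actually has, which suggests it assumes a different mask layout than KoboldAI passes in.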
In general, this PR will be on hold until we have transitioned the CUDA side. Hopefully that will bring an answer to your question, since you're currently ahead of our own effort.
Transformers 4.34 and later seem to be broken with IPEX; 4.33.3 works fine.
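For anyone wanting to reproduce the working setup, pinning the last known-good version would look something like this (the version pin comes from the comment above; the command is just standard pip syntax, not specific to this repo):

```shell
pip install "transformers==4.33.3"
```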
Can this help? https://github.com/intel/intel-extension-for-transformers
ITEX is for CPUs for now.
Adds PyTorch 2.1 support for IPEX. Fixes dtype errors depending on whether the GPU has 64-bit (FP64) support.
IPEX should also have proper support for Windows now, but I don't really know .bat scripting and I don't have any computer with Windows installed, so I can't test it.