intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

LLM CPU Docker image lacks transformers_stream_generator, einops, and tiktoken required by Qwen #9581

Open qiyuangong opened 7 months ago

qiyuangong commented 7 months ago

The transform-int4 step fails for Qwen because it cannot find transformers_stream_generator, einops, or tiktoken. Workaround:

pip install transformers_stream_generator einops tiktoken
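
Before loading Qwen inside the container, you can check up front whether the required packages are present. Below is a minimal, hypothetical sanity-check script (not part of the image or the ipex-llm codebase):

```python
# Hypothetical helper: verify that the packages Qwen's remote code
# needs are importable before attempting to load the model.
import importlib.util

for pkg in ("transformers_stream_generator", "einops", "tiktoken"):
    if importlib.util.find_spec(pkg) is None:
        print(f"missing: {pkg} -- run `pip install {pkg}`")
    else:
        print(f"ok: {pkg}")
```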
Zhengjin-Wang commented 7 months ago

Currently the problem occurs with Qwen-7b-chat and Qwen-14b-chat, while Chatglm2-6b and Llama2-7b work fine. It looks like Qwen depends on some additional packages. The problem can be solved by running pip install transformers_stream_generator einops tiktoken; a sketch of the failing load path follows.
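
For reference, a minimal sketch of the load path that triggers the error, assuming the transformers-style INT4 API (shipped as bigdl.llm at the time of this issue, now ipex_llm) and the public HuggingFace model id:

```python
# Minimal sketch, assuming the ipex-llm transformers-style API
# (bigdl.llm.transformers at the time this issue was filed).
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

# Qwen's custom (trust_remote_code) tokenizer and modeling code import
# tiktoken, einops, and transformers_stream_generator at load time, so
# if any of them is missing from the Docker image, these calls raise
# ModuleNotFoundError.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat",
                                          trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat",
                                             load_in_4bit=True,   # INT4 transform
                                             trust_remote_code=True)
```

This would also explain why Chatglm2-6b and Llama2-7b are unaffected: their loading code does not import these three packages, so only Qwen hits the missing dependencies.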