intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Apache License 2.0

failed to create the serving #1392

Closed · RongLei-intel closed this issue 8 months ago

RongLei-intel commented 8 months ago

I tried to create the serving on my system, but it failed with the error below:

```
(emon_analyzer) [root@SPR-1 emon_data_analyzer]# neuralchat_server start --config_file ./config/neuralchat.yaml
2024-03-19 11:38:57,005 - numexpr.utils - INFO - Note: detected 224 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
2024-03-19 11:38:57,005 - numexpr.utils - INFO - Note: NumExpr detected 224 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-03-19 11:38:57,005 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2024-03-19 11:38:57,348 - datasets - INFO - PyTorch version 2.2.0+cpu available.
[2024-03-19 11:38:57,430] [   ERROR] - Failed to start server.
[2024-03-19 11:38:57,430] [   ERROR] - partially initialized module 'intel_extension_for_pytorch' has no attribute '_C' (most likely due to a circular import)
```

yaml config file:

```yaml
host: 0.0.0.0
port: 8000

model_name_or_path: "Intel/neural-chat-7b-v3-1"
model_name_or_path: "/home/zluo2/TableLlama-model"
tokenizer_name_or_path: ""
peft_model_path: "./models/emon_llama"

device: "cpu"

asr:
    enable: false
    args:
        # support cpu, hpu, xpu, cuda
        device: "cpu"
        # support openai/whisper series
        model_name_or_path: "openai/whisper-small"
        # only can be set to true when the device is set to "cpu"
        bf16: false

tts:
    enable: false
    args:
        device: "cpu"
        voice: "default"
        stream_mode: false
        output_audio_path: "./output_audio.wav"

asr_chinese:
    enable: false

tts_chinese:
    enable: false
    args:
        device: "cpu"
        spk_id: 0
        stream_mode: false
        output_audio_path: "./output_audio.wav"

retrieval:
    enable: true
    args:
        input_path: "./rag_data/emon-sample"
        vector_database: "Qdrant"
        #retrieval_type: "bm25"

safety_checker:
    enable: false

ner:
    enable: false
    args:
        spacy_model: "en_core_web_lg"

tasks_list: ['textchat', 'retrieval']
```

lvliang-intel commented 8 months ago

@RongLei-intel, please check your IPEX version. The trace shows that the installed IPEX version does not match your PyTorch version:

```
PyTorch version 2.2.0
```
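One quick way to spot such a mismatch is to compare the major.minor components of the two installed packages. The helper below is an illustrative sketch, not part of NeuralChat or IPEX; `versions_match` is a hypothetical name:

```python
from importlib.metadata import version, PackageNotFoundError

def versions_match(torch_version: str, ipex_version: str) -> bool:
    """Return True if the two versions share the same major.minor pair.

    Local build tags such as "+cpu" and micro versions such as ".100"
    are ignored, since IPEX 2.2.x is expected to pair with torch 2.2.x.
    """
    major_minor = lambda v: v.split("+")[0].split(".")[:2]
    return major_minor(torch_version) == major_minor(ipex_version)

def check_installed() -> None:
    # Compare the installed torch and IPEX distributions, if present.
    try:
        tv = version("torch")
        iv = version("intel-extension-for-pytorch")
    except PackageNotFoundError as exc:
        print(f"missing package: {exc}")
        return
    if not versions_match(tv, iv):
        print(f"mismatch: torch {tv} vs intel-extension-for-pytorch {iv}")
```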

RongLei-intel commented 8 months ago

The issue was solved by installing oneCCL.
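For reference, the oneCCL bindings for PyTorch are typically installed via pip. The command below is a sketch based on Intel's published CPU wheel channel; the index URL and the matching version should be verified against the IPEX/PyTorch versions actually in use:

```shell
# Install the oneCCL bindings for PyTorch on CPU.
# The wheel index URL is an assumption (Intel's CPU release channel);
# pick the oneccl_bind_pt version that matches your torch/IPEX release.
pip install oneccl_bind_pt \
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
```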