VincyZhang / intel-extension-for-transformers

Extends the Hugging Face transformers APIs for Transformer-based models and improves the productivity of inference deployment. With extremely compressed models, the toolkit can greatly improve inference efficiency on Intel platforms.
Apache License 2.0

failed to create the serving #17

Open VincyZhang opened 4 months ago

VincyZhang commented 4 months ago

I tried to create the serving on my system, but it failed with the error below:

    (emon_analyzer) [root@SPR-1 emon_data_analyzer]# neuralchat_server start --config_file ./config/neuralchat.yaml
    2024-03-19 11:38:57,005 - numexpr.utils - INFO - Note: detected 224 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
    2024-03-19 11:38:57,005 - numexpr.utils - INFO - Note: NumExpr detected 224 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
    2024-03-19 11:38:57,005 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
    2024-03-19 11:38:57,348 - datasets - INFO - PyTorch version 2.2.0+cpu available.
    [2024-03-19 11:38:57,430] [ ERROR] - Failed to start server.
    [2024-03-19 11:38:57,430] [ ERROR] - partially initialized module 'intel_extension_for_pytorch' has no attribute '_C' (most likely due to a circular import)

yaml config file:

    host: 0.0.0.0
    port: 8000

    #model_name_or_path: "Intel/neural-chat-7b-v3-1"
    model_name_or_path: "/home/zluo2/TableLlama-model"
    tokenizer_name_or_path: ""
    peft_model_path: "./models/emon_llama"
    device: "cpu"

    asr:
        enable: false
        args:
            # support cpu, hpu, xpu, cuda
            device: "cpu"
            # support openai/whisper series
            model_name_or_path: "openai/whisper-small"
            # only can be set to true when the device is set to "cpu"
            bf16: false

    tts:
        enable: false
        args:
            device: "cpu"
            voice: "default"
            stream_mode: false
            output_audio_path: "./output_audio.wav"

    asr_chinese:
        enable: false

    tts_chinese:
        enable: false
        args:
            device: "cpu"
            spk_id: 0
            stream_mode: false
            output_audio_path: "./output_audio.wav"

    retrieval:
        enable: true
        args:
            input_path: "./rag_data/emon-sample"
            vector_database: "Qdrant"
            #retrieval_type: "bm25"

    safety_checker:
        enable: false

    ner:
        enable: false
        args:
            spacy_model: "en_core_web_lg"

    tasks_list: ['textchat', 'retrieval']
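As a quick sanity check of a config like the one above, something along these lines could flag missing top-level keys after the YAML is parsed into a dict (the required-key list here is a hypothetical subset for illustration, not the authoritative NeuralChat schema):

```python
# Hypothetical subset of keys the server config needs; consult the
# NeuralChat documentation for the authoritative schema.
REQUIRED_KEYS = {"host", "port", "model_name_or_path", "device", "tasks_list"}

def missing_keys(config: dict) -> set:
    """Return the required keys absent from the parsed YAML config."""
    return REQUIRED_KEYS - config.keys()

# Example: a config dict missing 'tasks_list'
config = {
    "host": "0.0.0.0",
    "port": 8000,
    "model_name_or_path": "/home/zluo2/TableLlama-model",
    "device": "cpu",
}
print(missing_keys(config))  # {'tasks_list'}
```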

VincyZhang commented 4 months ago

Please check your IPEX version. The traceback shows that the IPEX version mismatches your PyTorch version.

PyTorch version 2.2.0
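One way to confirm such a mismatch is to compare the major.minor components of the two installed version strings (a minimal sketch; the matching-major.minor rule is an assumption based on how IPEX releases typically track PyTorch, so check Intel's compatibility table for the exact pairing):

```python
def versions_compatible(torch_version: str, ipex_version: str) -> bool:
    """Return True when the major.minor parts of the two versions match."""
    def major_minor(version: str) -> tuple:
        # Strip local build tags like "+cpu" before splitting on dots.
        return tuple(version.split("+")[0].split(".")[:2])
    return major_minor(torch_version) == major_minor(ipex_version)

# PyTorch 2.2.0+cpu would need an IPEX 2.2.x build:
print(versions_compatible("2.2.0+cpu", "2.2.0"))    # True
print(versions_compatible("2.2.0+cpu", "2.1.100"))  # False
```

In practice the two version strings would come from `torch.__version__` and `intel_extension_for_pytorch.__version__`.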

VincyZhang commented 4 months ago

The issue was solved by installing oneCCL.

VincyZhang commented 4 months ago

Hi, thanks for bringing this up and for the suggestion! We'll look into adding the packages you listed, as well as any others that are missing.

VincyZhang commented 4 months ago

@NeuralChatBot

NeuralChatBot commented 4 months ago

It seems like you're facing an issue with setting up a neural chat server. To resolve this, you need to ensure that you have the required packages installed and properly configured. Here are some steps to follow:

  1. Check your IPEX version: Make sure you have the correct IPEX version compatible with your PyTorch version.

  2. Install the missing packages: Install the packages mentioned in the traceback, such as oneccl, and any other missing packages.

  3. Configure the server: Make sure your YAML file contains the correct configuration. You can refer to the config file posted above for guidance.

  4. Start the server: Run the command "neuralchat_server start --config_file ./config/neuralchat.yaml" to start the server.

  5. Troubleshooting: If the issue persists, you may need to check your system's environment variables, such as "NUMEXPR_MAX_THREADS" and "CUDA_VISIBLE_DEVICES," and ensure they are set correctly.

  6. Seek help: If you're still unable to resolve the issue, you can reach out to the developers of the neural chat server for further assistance.
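The environment-variable check from step 5 can be sketched as follows (the value 64 mirrors the maximum NumExpr reported in the log; note that the variable must be in place before NumExpr is first imported, directly or via a library such as pandas):

```python
import os

# NumExpr reads NUMEXPR_MAX_THREADS at import time, so set it first.
# 64 matches the cap NumExpr reported in the server log above.
os.environ["NUMEXPR_MAX_THREADS"] = "64"

print(os.environ["NUMEXPR_MAX_THREADS"])  # → 64
```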