This PR introduces the ability to specify the target device (e.g., CUDA, NPU, XPU) for Llama inference. Users can select the desired device via the `--device` flag together with a matching config file, improving flexibility across different hardware platforms.
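For context, the selection logic is roughly of this shape. The sketch below is illustrative only; the helper name `resolve_device` and the flag parsing are assumptions for exposition, not the PR's actual code:

```python
# Illustrative sketch of device selection for inference
# (assumed names, not the PR's actual implementation).
import argparse

SUPPORTED_DEVICES = ("cuda", "npu", "xpu")

def resolve_device(name: str) -> str:
    """Validate and normalize the requested target device."""
    device = name.lower()
    if device not in SUPPORTED_DEVICES:
        raise ValueError(
            f"Unsupported device '{name}'; expected one of {SUPPORTED_DEVICES}"
        )
    return device

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", default="cuda", help="Target device for inference")
    args = parser.parse_args()
    print(f"Running Llama inference on: {resolve_device(args.device)}")
```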
### npu

```bash
python projects/Llama/pipeline.py --device=npu --mode=huggingface --config_file=projects/Llama/configs/llama_config_npu.py
```

### xpu

```bash
python projects/Llama/pipeline.py --device=xpu --mode=huggingface --config_file=projects/Llama/configs/llama_config_xpu.py
```

### cuda

Please update the `projects/Llama/configs/llama_config.py` file to configure the model path and tokenizer path, then run:

```bash
python projects/Llama/pipeline.py
```
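As a sketch of the config edit mentioned above (the attribute names below are assumptions about the config layout, not verbatim from the repo):

```python
# projects/Llama/configs/llama_config.py -- illustrative only; the actual
# attribute names in the repo's config may differ.
model_path = "/path/to/llama/checkpoint"      # assumed: local model weights
tokenizer_path = "/path/to/tokenizer.model"   # assumed: tokenizer file for the model
```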