This PR introduces the ability to specify the target device (e.g., CUDA, NPU, XPU) for Llama inference. Users can select the desired device via the `--device` flag together with a matching config file, improving flexibility across different hardware platforms.
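For context, the selection logic is roughly of this shape. The sketch below is illustrative only; the helper name `resolve_device` and the flag parsing are assumptions for exposition, not the PR's actual code:

```python
# Illustrative sketch of device selection for inference
# (assumed names, not the PR's actual implementation).
import argparse

SUPPORTED_DEVICES = ("cuda", "npu", "xpu")

def resolve_device(name: str) -> str:
    """Validate and normalize the requested target device."""
    device = name.lower()
    if device not in SUPPORTED_DEVICES:
        raise ValueError(
            f"Unsupported device '{name}'; expected one of {SUPPORTED_DEVICES}"
        )
    return device

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", default="cuda", help="Target device for inference")
    args = parser.parse_args()
    print(f"Running Llama inference on: {resolve_device(args.device)}")
```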
### npu

```bash
python projects/Llama/pipeline.py --device=npu --mode=huggingface --config_file=projects/Llama/configs/llama_config_npu.py
```

### xpu

```bash
python projects/Llama/pipeline.py --device=xpu --mode=huggingface --config_file=projects/Llama/configs/llama_config_xpu.py
```

### cuda

Please update the `projects/Llama/configs/llama_config.py` file to configure the model path and tokenizer path, then run:

```bash
python projects/Llama/pipeline.py
```
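As a sketch of the config edit mentioned above (the attribute names below are assumptions about the config layout, not verbatim from the repo):

```python
# projects/Llama/configs/llama_config.py -- illustrative only; the actual
# attribute names in the repo's config may differ.
model_path = "/path/to/llama/checkpoint"      # assumed: local model weights
tokenizer_path = "/path/to/tokenizer.model"   # assumed: tokenizer file for the model
```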