Open aslanxie opened 1 month ago
There is no problem on a single node.
So you see this issue only on 2 nodes, right?
Yes, it's only on 2 nodes with Llama-2-70b-hf or Llama-3.1-70B-Instruct.
@aslanxie Are you still seeing this issue when installing Optimum Habana's main branch from source?
System Info
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
python3 ../gaudi_spawn.py --hostfile hostfile --use_deepspeed --world_size 16 --master_port 29500 \
    run_generation.py \
    --model_name_or_path /data1/zhixue/Llama-3.1-70B-Instruct/ \
    --bf16 \
    --batch_size 1 \
    --use_hpu_graphs --limit_hpu_graphs \
    --max_new_tokens 512
10.233.108.205: Input/outputs: 10.233.108.205: input 1: ('DeepSpeed is a machine learning framework',) 10.233.108.205: output 1: ('DeepSpeed is a machine learning framework!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!',)
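For anyone triaging similar runs, here is a minimal sketch of a check that flags this kind of degenerate output, where everything generated after the prompt is a single repeated character. The helper name `is_degenerate`, the `min_run` threshold, and the 512-character sample string are illustrative assumptions; none of this is part of Optimum Habana or `run_generation.py`.

```python
def is_degenerate(prompt: str, output: str, min_run: int = 32) -> bool:
    """Return True if the text generated after `prompt` is one character repeated.

    `min_run` is an arbitrary threshold (an assumption) so that very short
    continuations are not flagged.
    """
    # Strip the echoed prompt so only the newly generated text is inspected.
    continuation = output[len(prompt):] if output.startswith(prompt) else output
    return len(continuation) >= min_run and len(set(continuation)) == 1


prompt = "DeepSpeed is a machine learning framework"

# Illustrative stand-in for the failing 2-node output (the length is made up).
bad_output = prompt + "!" * 512
# Illustrative stand-in for a healthy continuation.
good_output = prompt + " for deep learning. It is designed to be fast and efficient."

print(is_degenerate(prompt, bad_output))
print(is_degenerate(prompt, good_output))
```

Running a check like this against the `output N:` lines of each node's log would make it easy to tell which configurations produce the repeated `!` output.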
Expected behavior
When testing with the Llama-2-7b-hf model, the output is as shown below. I found the issue with the latest meta-llama-3 and meta-llama-3.1 models when running inference on 2 nodes.
10.233.108.205: input 1: ('DeepSpeed is a machine learning framework',) 10.233.108.205: output 1: ('DeepSpeed is a machine learning framework for deep learning. It is designed to be fast and efficient, while also being easy to use. DeepSpeed is based on the TensorFlow framework, and it uses the TensorFlow Lite library to run on mobile devices.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. 
DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is',)