I'm facing a "position_ids: index out of bounds" error. While debugging, I found that it occurs because position_ids.max() returns an unusually large value: 3974362539628152236.
Specifically, the error message is: "../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [233,0,0], thread: [0,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed." While stepping through transformers/models/llama/modeling_llama.py, I found that position_ids starts out with normal values but suddenly overflows after a while.
I've strictly followed MMVP's instructions for setting up the environment, so I don't know what's going wrong.
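For anyone debugging the same thing: a value like 3974362539628152236 usually means the tensor's memory was corrupted or uninitialized, and the CUDA assert only fires deep inside the embedding lookup. A sanity check on the CPU side right before the lookup gives a much more readable failure. This is just an illustrative sketch; check_position_ids is a hypothetical helper name, not part of the transformers codebase.

```python
import torch

def check_position_ids(position_ids: torch.Tensor, max_len: int) -> None:
    """Fail early with a readable message instead of letting the CUDA
    IndexKernel assert fire on an out-of-range position_ids value."""
    bad = (position_ids < 0) | (position_ids >= max_len)
    if bad.any():
        raise ValueError(
            f"position_ids out of range: max={position_ids.max().item()}, "
            f"allowed [0, {max_len})"
        )

# Normal ids (e.g. 0..7 for a sequence of length 8) pass silently.
check_position_ids(torch.arange(8).unsqueeze(0), max_len=4096)

# A corrupted value like the one observed trips the check immediately.
corrupted = torch.tensor([[0, 1, 3974362539628152236]])
try:
    check_position_ids(corrupted, max_len=4096)
except ValueError as e:
    print("caught:", e)
```

Dropping a call like this into modeling_llama.py right before the rotary-embedding / index_select step narrows down where the values go bad, since the raw CUDA assert is asynchronous and often points at the wrong line.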
@tsb0601 I found that this error does not occur when evaluating on a single H800 GPU, but it does occur when using multiple GPUs. Is multi-GPU evaluation not supported at the moment?
Hello,
I've encountered an issue while running
transformers/models/llama/modeling_llama.py.
When I execute the command on 4 RTX 4090 GPUs, I hit the same position_ids index-out-of-bounds error described above (position_ids.max() suddenly returns 3974362539628152236 after behaving normally at first), despite strictly following MMVP's environment setup instructions.