When I tried to use chatglm3-6b to test on LongBench, I got the following error after loading the model:
"variance = hidden_states.to(torch.float32).pow(2).mean(-1, keepdim=True)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions."
Could someone help me?
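
To narrow this down, the hint in the error message can be followed by setting `CUDA_LAUNCH_BLOCKING=1` before CUDA is initialized, so the stack trace points at the call that actually failed. In general, a common cause of device-side asserts is an index that is out of range for an embedding lookup. Whether that is the cause here is an assumption, not something the trace above confirms. The sketch below shows both steps. The helper `find_out_of_range_ids` and the sample ids and `vocab_size` are hypothetical and only for illustration.

```python
import os

# Must be set before CUDA is initialized (simplest: before importing torch),
# so that kernel errors are reported at the call that caused them.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def find_out_of_range_ids(input_ids, vocab_size):
    """Return (position, token_id) pairs that fall outside [0, vocab_size)."""
    return [(pos, tok) for pos, tok in enumerate(input_ids)
            if not 0 <= tok < vocab_size]


# Hypothetical values for illustration only; in practice the ids would come
# from the tokenizer and vocab_size from the model's embedding table.
bad = find_out_of_range_ids([3, 17, 120, 42], vocab_size=100)
print(bad)
```

If the list is non-empty for a real prompt, the tokenizer and model vocabulary probably do not match. If it is empty, the input length relative to the model's maximum context would be the next thing to check, since LongBench prompts are long.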