ra-MANUJ-an opened this issue 1 year ago

Hi all, I'm trying to run inference with the galactica-6.7B model, but errors keep popping up after a few examples, and I'm not sure what to do. Can anyone take a look and tell me what's going on? The following is the error, and this is the code I have been using:
Can you check if your prompt is shorter than 2047 tokens?
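(One way to check, sketched with the Hugging Face tokenizer for this model; the 2047 figure presumably leaves one slot free in galactica's 2048-token context window:)

```python
from transformers import AutoTokenizer

# Sketch: count prompt tokens with the galactica tokenizer.
tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-6.7b")

prompt = "..."  # the prompt string being sent to the model
n_tokens = len(tokenizer(prompt).input_ids)
print(n_tokens, "tokens; within limit:", n_tokens <= 2047)
```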
@mkardas I shortened the prompt this time; it was exceeding the limit before. It ran for a few more iterations and then stopped with the same error:
```
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
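(As the traceback suggests, rerunning with `CUDA_LAUNCH_BLOCKING=1` makes the stack trace point at the kernel that actually failed. The variable has to be set before CUDA is initialized; a sketch:)

```python
import os

# Must be set before torch initializes CUDA, hence before `import torch`.
# Equivalently, on the command line: CUDA_LAUNCH_BLOCKING=1 python script.py
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # device-side asserts now surface at the real call site
```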
Could there be a problem with the versions of PyTorch, CUDA, or Python? Is it possible to tell which versions of the dependencies are being used?
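(For a quick in-process check of those versions, a sketch; the `collect_env` command below gives a much fuller report:)

```python
import sys
import torch

print("Python :", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("CUDA   :", torch.version.cuda)        # CUDA toolkit torch was built with
print("GPU OK :", torch.cuda.is_available())
```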
You can run `python -m torch.utils.collect_env` as well as `pip list`. What's the prompt's length in tokens now? By "ran for a few more iterations" do you mean with the exact same prompt, or with appending the generations to the prompt?
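(If the generations are being appended to the prompt each iteration, the input grows every round and eventually exceeds the 2048-token context window, which can trigger exactly this kind of device-side assert, e.g. an out-of-range index into the position embeddings. A sketch of a guard, assuming the Hugging Face API and an illustrative feedback loop, not the poster's actual script:)

```python
from transformers import AutoTokenizer, OPTForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-6.7b")
model = OPTForCausalLM.from_pretrained("facebook/galactica-6.7b", device_map="auto")

MAX_PROMPT_TOKENS = 2047  # leave at least one slot in the 2048-token window

prompt = "Question: What is multi-head attention?\n\nAnswer:"  # illustrative
for _ in range(10):  # feedback loop: each turn feeds the output back in
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    if ids.shape[1] > MAX_PROMPT_TOKENS:
        ids = ids[:, -MAX_PROMPT_TOKENS:]  # keep only the most recent tokens
    out = model.generate(ids.to(model.device), max_new_tokens=64)
    prompt = tokenizer.decode(out[0], skip_special_tokens=True)
```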