CUDA error: an illegal memory access was encountered

replicate / llama-chat

A boilerplate for creating a Llama 3 chat app

https://llama3.replicate.dev

Apache License 2.0

768 stars, 290 forks

CUDA error: an illegal memory access was encountered #20

Open zeke opened 11 months ago

zeke commented 11 months ago

Seeing this error in the console:

{
  "detail": "CUDA error: an illegal memory access was encountered\\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\\nCompile with `TORCH_USE_CUDA_DSA` to enable device-side assertions."
}
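
For anyone trying to narrow this down: the message itself suggests setting `CUDA_LAUNCH_BLOCKING=1`, which makes kernel launches synchronous so the stack trace points at the call that actually failed. Below is a minimal sketch of doing that from Python, assuming the model runs in a PyTorch process you control. The helper name `enable_blocking_launches` is hypothetical and not part of this repo or of PyTorch.

```python
import os


def enable_blocking_launches(env=None):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given mapping (default: os.environ).

    Hypothetical helper for illustration only. The variable is read when CUDA
    is initialized, so the safest place to call this is before importing torch.
    """
    target = os.environ if env is None else env
    target["CUDA_LAUNCH_BLOCKING"] = "1"
    return target


# Call this at the very top of the entry point, before any CUDA work starts.
enable_blocking_launches()

# import torch  # import torch (and load the model) only after the call above
```

Equivalently, the variable can be exported in the shell or container environment before the process starts. Expect slower inference while it is set, so it is best used only while debugging.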

zeke commented 11 months ago

Seems to always happen on the second message. First chat interaction works, then the second fails.

fofr commented 10 months ago

@zeke Is this still happening for you?