replicate / llama-chat

A boilerplate for creating a Llama 3 chat app
https://llama3.replicate.dev
Apache License 2.0
768 stars, 290 forks

CUDA error: an illegal memory access was encountered #20

Open zeke opened 11 months ago

zeke commented 11 months ago

Seeing this error in the console:

{
  "detail": "CUDA error: an illegal memory access was encountered\\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\\nCompile with `TORCH_USE_CUDA_DSA` to enable device-side assertions."
}
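
The error text itself suggests `CUDA_LAUNCH_BLOCKING=1`. As a minimal sketch, assuming the model runs in a Python/PyTorch process you control (which may not be the case for a hosted model), the variable can be set before CUDA is initialized so that kernel errors are reported at the call that caused them. The helper name `enable_sync_cuda_errors` is hypothetical and is not part of llama-chat or PyTorch.

```python
import os


def enable_sync_cuda_errors(env=None):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given mapping (default: os.environ).

    Hypothetical helper for illustration. The variable has to be set before
    the process initializes CUDA, so call this before importing or using torch.
    """
    env = os.environ if env is None else env
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


enable_sync_cuda_errors()
# Import torch (or start the model server) only after this point.
```

Setting the variable in the shell that launches the process has the same effect. Synchronous launches are slower, so this is only meant for debugging.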

zeke commented 11 months ago

Seems to always happen on the second message. First chat interaction works, then the second fails.
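
The pattern described above (first message works, second fails) could be checked with a small harness like the one below. This is only a sketch: `send_message` is a hypothetical callable standing in for whatever client sends a prompt to the model, and the fake sender merely simulates the failure. Neither is part of llama-chat or the Replicate API.

```python
from typing import Callable, List, Optional, Tuple


def first_failing_turn(
    send_message: Callable[[str], str], prompts: List[str]
) -> Optional[Tuple[int, str]]:
    """Send prompts in order; return (turn number, error text) of the first failure."""
    for turn, prompt in enumerate(prompts, start=1):
        try:
            send_message(prompt)
        except Exception as exc:  # broad on purpose: we only want to locate the failing turn
            return turn, str(exc)
    return None


def make_fake_sender(fail_on: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a real client: raises on the given call number."""
    calls = {"n": 0}

    def send(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] == fail_on:
            raise RuntimeError("CUDA error: an illegal memory access was encountered")
        return "ok"

    return send


result = first_failing_turn(make_fake_sender(2), ["hello", "and again"])
print(result)  # (2, 'CUDA error: an illegal memory access was encountered')
```

Swapping the fake sender for a real client call would show whether the failure is tied to the second turn specifically or to something else, such as the growing prompt length.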

[Screenshot: 2023-08-04 at 1:49:04 PM]

fofr commented 10 months ago

@zeke Is this still happening for you?