ZexinLi0w0 opened this issue 9 months ago
Hi,
Referring to issue #138: are you currently using WSL2? We recommend running on Ubuntu directly rather than through a virtualized environment.
Guangxuan
No, I use an Ubuntu server directly.
It seems like a PyTorch bug. Can you try reinstalling PyTorch, or report it to the PyTorch community?
I will try reinstalling PyTorch then.
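(For reference, a clean reinstall typically looks like the two commands below; the cu118 wheel index is an assumption and should be swapped for the index matching the installed CUDA driver:)

pip uninstall -y torch
pip install torch --index-url https://download.pytorch.org/whl/cu118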
The runtime error is triggered after roughly 10 minutes of running. I suspect it is caused by a GPU out-of-memory condition that PyTorch does not handle correctly.
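(As a minimal sketch of that theory, assuming PyTorch >= 1.13, where torch.cuda.OutOfMemoryError is a distinct exception type, a generation step can be wrapped so an OOM surfaces cleanly instead of crashing mid-run; safe_generate is a hypothetical helper, not part of the repository:)

import torch

def safe_generate(model, inputs):
    # Run one generation step; catch a CUDA OOM rather than letting it
    # tear down the whole streaming loop with an opaque RuntimeError.
    try:
        with torch.no_grad():
            return model.generate(**inputs)
    except torch.cuda.OutOfMemoryError:
        # Release cached allocator blocks so a smaller retry has a chance.
        torch.cuda.empty_cache()
        raise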
Here is an nvidia-smi profiling log; the 3rd/4th/5th columns correspond to total, allocated, and available GPU memory (in MiB).
2023/10/12 13:04:29.337, 0, 20470, 20135, 42, 57, 6
2023/10/12 13:04:29.456, 0, 20470, 19907, 270, 63, 23
2023/10/12 13:04:29.577, 0, 20470, 19771, 406, 63, 23
2023/10/12 13:04:29.697, 0, 20470, 19771, 406, 54, 5
2023/10/12 13:04:29.816, 0, 20470, 19907, 270, 58, 6
2023/10/12 13:04:29.937, 0, 20470, 19907, 270, 58, 6
2023/10/12 13:04:30.059, 0, 20470, 19871, 306, 57, 6
2023/10/12 13:04:30.194, 0, 20470, 20135, 42, 57, 6
2023/10/12 13:04:30.318, 0, 20470, 19821, 356, 43, 4
2023/10/12 13:04:30.440, 0, 20470, 19821, 356, 64, 23
2023/10/12 13:04:30.561, 0, 20470, 19821, 356, 64, 23
2023/10/12 13:04:30.681, 0, 20470, 19957, 220, 54, 6
2023/10/12 13:04:30.803, 0, 20470, 19957, 220, 58, 6
2023/10/12 13:04:30.923, 0, 20470, 19957, 220, 58, 6
2023/10/12 13:04:31.042, 0, 20470, 20135, 42, 57, 6
2023/10/12 13:04:31.163, 0, 20470, 19907, 270, 57, 6
2023/10/12 13:04:31.284, 0, 20470, 19771, 406, 65, 23
2023/10/12 13:04:31.405, 0, 20470, 19907, 270, 59, 6
2023/10/12 13:04:31.526, 0, 20470, 19907, 270, 59, 6
2023/10/12 13:04:31.646, 0, 20470, 19907, 270, 59, 6
2023/10/12 13:04:31.765, 0, 20470, 19821, 356, 59, 6
2023/10/12 13:04:31.887, 0, 20470, 20135, 42, 45, 4
2023/10/12 13:04:32.008, 0, 20470, 19821, 356, 66, 23
2023/10/12 13:04:32.129, 0, 20470, 19821, 356, 66, 23
2023/10/12 13:04:32.251, 0, 20470, 19957, 220, 55, 6
2023/10/12 13:04:32.372, 0, 20470, 19957, 220, 55, 6
2023/10/12 13:04:32.494, 0, 20470, 19957, 220, 56, 6
2023/10/12 13:04:32.615, 0, 20470, 19871, 306, 44, 4
2023/10/12 13:04:32.735, 0, 20470, 20135, 42, 44, 4
2023/10/12 13:04:32.855, 0, 20470, 19771, 406, 67, 23
2023/10/12 13:04:32.976, 0, 20470, 19907, 270, 67, 23
2023/10/12 13:04:33.097, 0, 20470, 19907, 270, 57, 6
2023/10/12 13:04:33.214, 0, 20470, 19771, 406, 58, 6
2023/10/12 13:04:33.332, 0, 20470, 19821, 356, 58, 6
2023/10/12 13:04:33.453, 0, 20470, 19871, 306, 59, 6
2023/10/12 13:04:33.573, 0, 20470, 19871, 306, 59, 6
2023/10/12 13:04:33.696, 0, 20470, 19871, 306, 43, 4
2023/10/12 13:04:33.817, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:33.937, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:34.058, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:34.181, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:34.303, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:34.424, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:34.544, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:34.665, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:34.787, 0, 20470, 19871, 306, 0, 0
2023/10/12 13:04:35.001, 0, 20470, 589, 19588, 36, 3
2023/10/12 13:04:35.122, 0, 20470, 589, 19588, 36, 3
2023/10/12 13:04:35.244, 0, 20470, 589, 19588, 100, 7
2023/10/12 13:04:35.364, 0, 20470, 589, 19588, 100, 7
2023/10/12 13:04:35.485, 0, 20470, 589, 19588, 100, 7
2023/10/12 13:04:35.606, 0, 20470, 589, 19588, 55, 4
2023/10/12 13:04:35.727, 0, 20470, 589, 19588, 55, 4
2023/10/12 13:04:35.853, 0, 20470, 589, 19588, 0, 0
2023/10/12 13:04:35.976, 0, 20470, 589, 19588, 0, 0
2023/10/12 13:04:36.096, 0, 20470, 589, 19588, 0, 0
2023/10/12 13:04:36.217, 0, 20470, 589, 19588, 0, 0
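(A CSV log in this shape can be captured with a query like the one below; the 120 ms sampling interval is my inference from the timestamps above:)

nvidia-smi --query-gpu=timestamp,index,memory.total,memory.used,memory.free,utilization.gpu,utilization.memory --format=csv,noheader,nounits -lms 120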
Description: When running the run_streaming_llama.py script with the --enable_streaming flag, I encountered a RuntimeError related to CUDA and the Hugging Face Accelerate library.
Steps to Reproduce: Set the environment variable CUDA_VISIBLE_DEVICES=0, then run the following command:
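(The command itself was not captured in this log. Judging from the repository's example usage, it was presumably along the lines of the following; the exact script path and flags are an assumption:)

CUDA_VISIBLE_DEVICES=0 python examples/run_streaming_llama.py --enable_streaming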
Expected Behavior: The script should run successfully and provide streaming inference results.
Actual Behavior: The script crashes with the following error:
GPU information:
OS information:
Full error log: