Open zcswdt opened 10 months ago
I fixed it by removing this line.
Impressive. I tried for a long time but couldn't solve it. After deleting this line, have you verified that the code still functions correctly?
I fixed it by removing this line:
https://github.com/real-stanford/cloth-funnels/blob/1fb231e0633b0603eb940aec130ab903e41c2d03/cloth_funnels/PyFlex/bindings/opengl/shader.cpp#L81
I just tried that, but there is still a memory leak. Have you tried it?
ray.exceptions.OutOfMemoryError: Task was killed due to the node running low on memory.
Memory on the node (IP: 100.79.61.171, ID: dd64af687cb1299d0339f770b1cf8002a43d25240e4f9a1aea3abe31) where the task (actor ID: 198f27dd8eedd8a529db2fdc01000000, name=SimEnv.init, pid=25063, memory used=2.28GB) was running was 29.74GB / 31.30GB (0.950172), which exceeds the memory usage threshold of 0.95. Ray killed this worker (ID: cfe2287dcee7a2bc8a0bef249ec702f3351b919a353a178bc04fa50b) because it was the most recently scheduled task; to see more information about memory usage on this node, use ray logs raylet.out -ip 100.79.61.171. To see the logs of the worker, use `ray logs worker-cfe2287dcee7a2bc8a0bef249ec702f3351b919a353a178bc04fa50b*out -ip 100.79.61.171`.
Top 10 memory users:
PID MEM(GB) COMMAND
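The numbers in the error above can be checked by hand: the node was at 29.74GB of 31.30GB, which is just over the 0.95 threshold, while the killed worker itself held only 2.28GB. A minimal sketch of that arithmetic follows; `exceeds_threshold` is a hypothetical helper written for illustration, not part of Ray.

```python
def exceeds_threshold(used_gb: float, total_gb: float, threshold: float = 0.95) -> bool:
    """Return True when the used fraction of node memory is above the threshold."""
    return used_gb / total_gb > threshold


# Values copied from the error message above.
used_gb, total_gb, worker_gb = 29.74, 31.30, 2.28

fraction = used_gb / total_gb
print(f"node usage fraction: {fraction:.4f}")
print("over threshold:", exceeds_threshold(used_gb, total_gb))
# Memory in use on the node that is not attributed to the killed worker.
print(f"used outside the killed worker: {used_gb - worker_gb:.2f} GB")
```

Since the killed worker accounts for only a small part of the total, the per-process list under "Top 10 memory users" is probably the more useful place to look for what is actually growing.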
My code works now, and it can be run directly, although there are many OpenGL warnings. I modified the code so that it does not use any Ray-related functions when running.
Can you explain why you deleted this line? Can you also tell me your machine configuration? Please run nvidia-smi, nvcc -V, and free -h, and include your Ubuntu version. Thanks.
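For anyone else comparing setups in this thread, the requested commands can be gathered in one go. This is only a sketch: `collect` and `format_report` are hypothetical helpers, `lsb_release -a` is an assumed way to report the Ubuntu version, and the Python version entry is an extra that was not asked for.

```python
import subprocess
import sys

# Commands requested in the thread, plus two assumed extras (see lead-in).
COMMANDS = {
    "nvidia-smi": ["nvidia-smi"],
    "nvcc -V": ["nvcc", "-V"],
    "free -h": ["free", "-h"],
    "lsb_release -a": ["lsb_release", "-a"],
    "python --version": [sys.executable, "--version"],
}


def collect(commands):
    """Run each command and return {label: output or a short failure note}."""
    report = {}
    for label, argv in commands.items():
        try:
            done = subprocess.run(argv, capture_output=True, text=True, timeout=30)
            report[label] = done.stdout + done.stderr
        except FileNotFoundError:
            # The tool is not installed or not on PATH.
            report[label] = "<not found>"
        except subprocess.TimeoutExpired:
            report[label] = "<timed out>"
    return report


def format_report(report):
    """Render the collected output as plain text that can be pasted in a reply."""
    return "\n".join(f"$ {label}\n{text.rstrip()}\n" for label, text in report.items())
```

Calling `print(format_report(collect(COMMANDS)))` prints each command followed by its output; a missing tool shows up as `<not found>` instead of stopping the script.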
I have fixed the warnings now (they were caused by my own mistakes), and the code runs with only this line removed. I removed the line because it was raising an assertion error.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:2B:00.0  On |                  N/A |
|  0%   51C    P8    38W / 420W |   1481MiB / 24576MiB |     54%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
              total        used        free      shared  buff/cache   available
Mem:           62Gi        10Gi        15Gi       1.0Gi        36Gi        50Gi
Swap:          47Mi        47Mi       0.0Ki
Thank you very much for providing this to me. Can you explain why deleting this line of code can solve it?
Because the assertion error came from this line. I tried removing it and found that the code works well.
Got it. Try training for half an hour and see whether memory leaks. Maybe the driver and CUDA versions you and I use are different; my CUDA is 10.0.
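One way to run the half-hour check is to sample the resident memory of the training process at a fixed interval and see whether it keeps climbing. The sketch below assumes Linux (it reads `/proc/<pid>/statm`); all function names are hypothetical, and the "never drops" rule is a crude heuristic rather than a real leak detector.

```python
import os
import time
from typing import List, Optional


def read_rss_bytes(pid: Optional[int] = None) -> int:
    """Resident set size of a process in bytes, read from /proc (Linux only)."""
    target = "self" if pid is None else str(pid)
    with open(f"/proc/{target}/statm") as f:
        resident_pages = int(f.read().split()[1])  # second field: resident pages
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def grows_steadily(samples: List[int], min_growth_bytes: int) -> bool:
    """True if memory never drops between samples and total growth is large."""
    if len(samples) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and samples[-1] - samples[0] >= min_growth_bytes


def monitor(pid: int, duration_s: float = 1800, interval_s: float = 60) -> List[int]:
    """Sample the RSS of `pid` every `interval_s` seconds for `duration_s` seconds."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        samples.append(read_rss_bytes(pid))
        time.sleep(interval_s)
    return samples
```

For example, `grows_steadily(monitor(pid), 1 << 30)` would flag a process whose resident memory rose by at least 1 GiB over the run without ever shrinking. The pid can be taken from the Ray message (`pid=25063` in the log above) or from `ps`.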
Can you re-upload the code? Thank you very much