Open jashshah999 opened 6 months ago
Getting this error while training, please help!
Epoch 1902/5000
/home/jash/modara/NeuralClothSim/ncs/utils/rotation.py:106: RuntimeWarning: invalid value encountered in divide
  w0 = np.where(sin_omega, np.sin((1 - r) * omega) / sin_omega, 1 - r)
/home/jash/modara/NeuralClothSim/ncs/utils/rotation.py:107: RuntimeWarning: invalid value encountered in divide
  w1 = np.where(sin_omega, np.sin(r * omega) / sin_omega, r)
4/4 [==============================] - ETA: 0s - m/Loss: 0.2198 - m/Stretch: 3.2042e-06 - m/Shear: 0.0687 - m/Bending: 0.1241 - m/Collision: 0.0000e+00 - m/Gravity: 0.2032 - m/Inertia: 0.0288
2024-03-15 03:43:09.659013: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:1542] failed to enqueue async memcpy from device to host: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure; host dst: 0x1fd9c900; GPU src: 0x722b6f95b000; size: 53088=0xcf60
2024-03-15 03:43:09.659080: E external/local_xla/xla/stream_executor/stream.cc:340] Error recording event in stream: Error recording CUDA event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
2024-03-15 03:43:09.659093: E external/local_xla/xla/stream_executor/cuda/cuda_event.cc:29] Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
2024-03-15 03:43:09.659101: F tensorflow/core/common_runtime/device/device_event_mgr.cc:223] Unexpected Event status: 1
Aborted (core dumped)
I got the same error at the first epoch with the author's sample configuration. Does anyone know how to solve this?
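A note on the two `RuntimeWarning` lines in the log: `np.where` evaluates both of its value arguments before selecting between them, so the division by `sin_omega` still runs for entries where `sin_omega` is zero, and 0/0 produces the "invalid value encountered in divide" warning. The selected result is still the fallback branch for those entries, so this warning may well be unrelated to the `CUDA_ERROR_LAUNCH_FAILED` abort; treating the two as connected is only a guess. Below is a minimal sketch of a warning-free way to compute the same weights. The function name `slerp_weights` and the `eps` threshold are hypothetical and not taken from the repository.

```python
import numpy as np


def slerp_weights(r, omega, eps=1e-8):
    """Interpolation weights sin((1-r)*omega)/sin(omega) and sin(r*omega)/sin(omega).

    Hypothetical helper, not the repository's code. Falls back to the
    linear weights (1 - r, r) wherever |sin(omega)| is below eps.
    """
    r = np.asarray(r, dtype=np.float64)
    omega = np.asarray(omega, dtype=np.float64)
    sin_omega = np.sin(omega)

    # Mask of entries where the division would be 0/0 or badly conditioned.
    small = np.abs(sin_omega) < eps

    # Replace near-zero denominators with 1.0 so the division never
    # produces NaN; those entries are overwritten by the fallback anyway.
    safe = np.where(small, 1.0, sin_omega)

    w0 = np.where(small, 1.0 - r, np.sin((1.0 - r) * omega) / safe)
    w1 = np.where(small, r, np.sin(r * omega) / safe)
    return w0, w1
```

Calling `slerp_weights(0.25, np.array([0.0, np.pi / 2]))` returns the linear weights for the zero-angle entry and the usual sine ratios for the other, without emitting a warning. Note that the original lines use `sin_omega` itself as the condition, so only exact zeros take the fallback there; the `eps` threshold here is a design choice of this sketch.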