RuntimeError: CUDA error: Unknown error

guillermodecelisrodriguez commented 8 months ago

First, thank you for sharing your amazing work!

I had no problems running demo.sh and reproducing your results. However, when I run the same demo.sh for inference on my own data, I first get the expected vis_ism.png image of the resulting mask, but when the pose estimation model runs, the following error appears:

=> creating model ...
load pre-trained checkpoint from: checkpoints/mae_pretrain_vit_base.pth
=> extracting templates ...
=> loading input data ...
=> running model ...
Traceback (most recent call last):
  File "/SAM-6D/Pose_Estimation_Model/run_inference_custom.py", line 291, in <module>
    out = model(input_data)
  File "/opt/conda/envs/sam6d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/SAM-6D/Pose_Estimation_Model/../Pose_Estimation_Model/model/pose_estimation_model.py", line 32, in forward
    geo_embedding_m = self.geo_embedding(torch.cat([bg_point, sparse_pm], dim=1))
  File "/opt/conda/envs/sam6d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/SAM-6D/Pose_Estimation_Model/../Pose_Estimation_Model/model/transformer.py", line 337, in forward
    d_embeddings = self.embedding(d_indices)
  File "/opt/conda/envs/sam6d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/SAM-6D/Pose_Estimation_Model/../Pose_Estimation_Model/model/transformer.py", line 280, in forward
    embeddings = torch.cat([sin_embeddings, cos_embeddings], dim=2)  # (-1, d_model/2, 2)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

I would really appreciate any help with this. Thanks in advance!!
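The error text itself suggests a first debugging step: because CUDA kernel errors can be reported asynchronously, the line shown in the traceback (transformer.py, line 280) may not be where the failure actually occurs. A minimal sketch of enabling synchronous launches, assuming the variable is set at the very top of run_inference_custom.py, before anything touches the GPU:

```python
import os

# Assumption: this runs before the first CUDA call (safest: before importing torch).
# With synchronous kernel launches, the traceback should point at the op that
# really failed instead of a later, unrelated call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # the script's existing imports and code would follow from here
```

Setting CUDA_LAUNCH_BLOCKING=1 in the shell environment before launching demo.sh should have the same effect. Expect slower execution while it is enabled, so it is best used only while debugging.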

assia855 commented 8 months ago

Hi @guillermodecelisrodriguez, I'm struggling with the same error. Did you find a solution, please?

guillermodecelisrodriguez commented 8 months ago

Hello @assia855, I have not found a solution yet, but if I do I will tell you :)

JiehongLin commented 8 months ago

We are sorry about the error. We have not encountered this problem on our machine.

Maybe you can check whether the variables involved, e.g., sin_embeddings, cos_embeddings, emb_indices and self.div_term, are all on the same device.
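The device check suggested above could be sketched as follows. The helper names devices_of and on_same_device are hypothetical (they are not part of SAM-6D or PyTorch); the sketch only relies on each tensor exposing a .device attribute, as torch tensors do.

```python
def devices_of(**named_tensors):
    """Map each variable name to the string form of its .device attribute."""
    return {
        name: str(getattr(value, "device", "no .device attribute"))
        for name, value in named_tensors.items()
    }


def on_same_device(**named_tensors):
    """Return True when every given object reports the same device."""
    return len(set(devices_of(**named_tensors).values())) <= 1


# Hypothetical usage inside the forward() in model/transformer.py, placed just
# before the torch.cat call near line 280 (exact placement is an assumption):
#
#   print(devices_of(emb_indices=emb_indices, div_term=self.div_term,
#                    sin_embeddings=sin_embeddings, cos_embeddings=cos_embeddings))
```

If the printed devices differ (for example, a mix of cpu and cuda:0), that would point to a device mismatch; if they all match, the cause is more likely elsewhere, for instance in the custom input data.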

JiehongLin / SAM-6D

RuntimeError: CUDA error: Unknown error #18