open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

RuntimeError: handle_0 INTERNAL ASSERT FAILED at "../c10/cuda/driver_api.cpp":15, please report a bug to PyTorch. #11753

Open AIzealotwu opened 4 months ago

AIzealotwu commented 4 months ago

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug

When I used Mask2Former for instance segmentation, training failed with the error below. The traceback points at the line mask_pred = mask_pred[is_thing]:

RuntimeError: handle_0 INTERNAL ASSERT FAILED at "../c10/cuda/driver_api.cpp":15, please report a bug to PyTorch.
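For readers unfamiliar with the failing line, the following is a minimal CPU-only sketch of what a boolean-mask selection such as mask_pred[is_thing] does. It is an illustration, not the mmdet implementation: the helper name keep_thing_queries, the tensor shapes, and the num_things_classes value are assumptions made for the example, and the sketch does not reproduce the CUDA assert reported here.

```python
import torch


def keep_thing_queries(mask_pred, labels, num_things_classes):
    """Hypothetical helper (not part of mmdet): keep rows whose label is a "thing" class."""
    # Boolean mask of shape (num_queries,): True where the label index is a thing class.
    is_thing = labels < num_things_classes
    # Boolean indexing along dim 0 requires the mask length to match that dimension.
    if is_thing.shape[0] != mask_pred.shape[0]:
        raise ValueError(
            f"mask length {is_thing.shape[0]} != number of queries {mask_pred.shape[0]}"
        )
    return mask_pred[is_thing]


# Assumed toy shapes: 4 queries, each with a 2x2 mask.
mask_pred = torch.zeros(4, 2, 2)
labels = torch.tensor([0, 5, 1, 7])
kept = keep_thing_queries(mask_pred, labels, num_things_classes=3)
print(kept.shape)  # torch.Size([2, 2, 2])
```

Running the same selection on CPU tensors with the shapes from a failing batch is one way to check whether the indexing itself is valid, independent of the GPU.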

Reproduction

  1. What command or script did you run?
     A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?

  3. What dataset did you use? A segmentation dataset

Environment

  1. Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.

  2. You may add additional information that may be helpful for locating the problem, such as

    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback

If applicable, paste the error traceback here.

  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
05/29 16:12:42 - mmengine - INFO - Saving checkpoint at 44 iterations
Traceback (most recent call last):
  File "tools/train.py", line 121, in <module>
    main()
  File "tools/train.py", line 117, in main
    runner.train()
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 294, in run
    self.runner.val_loop.run()
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 373, in run
    self.run_iter(idx, data_batch)
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 393, in run_iter
    outputs = self.runner.model.val_step(data_batch)
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 133, in val_step
    return self._run_forward(data, mode='predict')  # type: ignore
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 361, in _run_forward
    results = self(**data, mode=mode)
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 94, in forward
    return self.predict(inputs, data_samples)
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmdet/models/detectors/maskformer.py", line 103, in predict
    results_list = self.panoptic_fusion_head.predict(
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmdet/models/seg_heads/panoptic_fusion_heads/maskformer_fusion_head.py", line 255, in predict
    ins_results = self.instance_postprocess(
  File "/home/xuym/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmdet/models/seg_heads/panoptic_fusion_heads/maskformer_fusion_head.py", line 167, in instance_postprocess
    mask_pred = mask_pred[is_thing]
RuntimeError: handle_0 INTERNAL ASSERT FAILED at "../c10/cuda/driver_api.cpp":15, please report a bug to PyTorch.

A placeholder for traceback.
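CUDA kernels are launched asynchronously, so the Python line named in a traceback like the one in this report is not always the operation that actually failed. A common first debugging step, offered here as a suggestion and not as a confirmed diagnosis of this issue, is to set the CUDA_LAUNCH_BLOCKING environment variable to 1 before CUDA is initialized and then re-run. The helper name below is hypothetical; only the environment variable itself is a documented CUDA/PyTorch setting.

```python
import os


def enable_blocking_cuda_launches(env=None):
    """Hypothetical helper: request synchronous CUDA kernel launches.

    Call this before the first CUDA call in the process (simplest: before
    importing torch), since the variable is read when CUDA is initialized.
    """
    # Write into the given mapping, or into the real process environment by default.
    target = os.environ if env is None else env
    target["CUDA_LAUNCH_BLOCKING"] = "1"
    return target


# Apply to the current process; import torch only after this point.
enable_blocking_cuda_launches()
print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

The same effect can be had from the shell by prefixing the original training command with CUDA_LAUNCH_BLOCKING=1. Training runs slower in this mode, so it is meant for locating the failing call, not for regular use.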

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!