BBBBchan / CorrMatch

Official code for "CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation"

some issues #14

Closed Liumomo30 closed 3 months ago

Liumomo30 commented 4 months ago

Hello, and thank you for your work on this code. I ran into a problem while trying to reproduce it. I am running on the provided dataset with a single GPU and hit an error I cannot resolve. Could you help me? Many thanks.

  File "/home/liumomo/实验/ssod/CorrMatch-main/corrmatch.py", line 310, in <module>
    main()
  File "/home/liumomo/实验/ssod/CorrMatch-main/corrmatch.py", line 140, in main
    for i, ((img_x, mask_x),
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/liumomo/实验/ssod/CorrMatch-main/dataset/semi.py", line 66, in __getitem__
    ignore_mask[mask == 254] = 255
IndexError: too many indices for tensor of dimension 2

0%| | 0/17 [00:00<?, ?it/s]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 28033) of binary: /home/liumomo/anaconda3/envs/corrmatch/bin/python
Traceback (most recent call last):
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/distributed/launch.py", line 195, in <module>
    main()
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/distributed/launch.py", line 191, in main
    launch(args)
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/distributed/launch.py", line 176, in launch
    run(args)
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/distributed/run.py", line 753, in run
    elastic_launch(
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/liumomo/anaconda3/envs/corrmatch/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

corrmatch.py FAILED

Failures:

------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-05-28_15:41:17
  host      : liumomo-B760-VH4
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 28033)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================


BBBBchan commented 4 months ago

Hello, you could try checking the format and shape of mask and ignore_mask, making sure that mask is read correctly and that the two have the same shape.
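
The shape check suggested in this reply could be sketched as a small helper called just before the failing line `ignore_mask[mask == 254] = 255` in `dataset/semi.py`. This is only an illustrative sketch: the function name `describe_shape_mismatch` is hypothetical and not part of the CorrMatch code, the 321x321 size is arbitrary, and the hint about a multi-channel label image is an assumption about one possible cause that the thread does not confirm. The helper only reads `.shape`, so it works with a `torch.Tensor` or a `numpy.ndarray`; the stand-in objects below let it run without PyTorch installed.

```python
from types import SimpleNamespace


def describe_shape_mismatch(mask, ignore_mask):
    """Return None if the shapes match, else a message describing the mismatch.

    Hypothetical helper: accepts any objects that expose a `.shape` attribute.
    """
    mask_shape = tuple(mask.shape)
    ignore_shape = tuple(ignore_mask.shape)
    if mask_shape == ignore_shape:
        return None
    hint = ""
    if len(mask_shape) > len(ignore_shape):
        # Assumption: extra dimensions may come from a label loaded with channels.
        hint = " (mask has extra dimensions; was the label loaded as a multi-channel image?)"
    return f"mask shape {mask_shape} != ignore_mask shape {ignore_shape}{hint}"


# Stand-ins for real tensors; the sizes are arbitrary illustration values.
rgb_like_mask = SimpleNamespace(shape=(321, 321, 3))
ignore_mask = SimpleNamespace(shape=(321, 321))
print(describe_shape_mismatch(rgb_like_mask, ignore_mask))
```

In the dataset code, one might call the helper on the real `mask` and `ignore_mask` and raise or log the returned message when it is not `None`. A boolean index with more dimensions than the indexed tensor is one way to get "too many indices for tensor of dimension 2", which is why the dimension count is worth checking first.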

Hotcat-s commented 4 months ago

Can you reproduce the code now? I have some troubles in loss.backward

BBBBchan commented 4 months ago

> Can you reproduce the code now? I have some troubles in loss.backward

Hi, could you please describe your issue in detail?