SHI-Labs / OneFormer

OneFormer: One Transformer to Rule Universal Image Segmentation, arxiv 2022 / CVPR 2023
https://praeclarumjj3.github.io/oneformer
MIT License

AssertionError when trying to reproduce result #104

Open gobears21 opened 10 months ago

gobears21 commented 10 months ago

When I try to reproduce the result using the following command:

python train_net.py --dist-url 'tcp://127.0.0.1:50163' --num-gpus 2 --config-file configs/coco/swin/oneformer_swin_tiny_bs16_50ep.yaml OUTPUT_DIR outputs/coco_swin_tiny WANDB.NAME coco_swin_tiny

I get a missing-file error. This is the error output:

[11/14 23:17:13 fvcore.common.checkpoint]: [Checkpointer] Loading from swin_tiny_patch4_window7_224.pkl ...
Traceback (most recent call last):
  File "train_net.py", line 435, in <module>
    launch(
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/launch.py", line 69, in launch
    mp.start_processes(
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/launch.py", line 123, in _distributed_worker
    main_func(*args)
  File "/mnt/OneFormer/train_net.py", line 424, in main
    trainer.resume_or_load(resume=args.resume)
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 414, in resume_or_load
    self.checkpointer.resume_or_load(self.cfg.MODEL.WEIGHTS, resume=resume)
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/fvcore/common/checkpoint.py", line 227, in resume_or_load
    return self.load(path, checkpointables=[])
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/detectron2/checkpoint/detection_checkpoint.py", line 62, in load
    ret = super().load(path, *args, **kwargs)
  File "/opt/conda/envs/oneformer/lib/python3.8/site-packages/fvcore/common/checkpoint.py", line 153, in load
    assert os.path.isfile(path), "Checkpoint {} not found!".format(path)
AssertionError: Checkpoint swin_tiny_patch4_window7_224.pkl not found!

huydung179 commented 10 months ago

I have the same problem

huydung179 commented 10 months ago

Oh I see how to deal with this here
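
For anyone else landing on this thread: the last frames of the traceback show that fvcore's checkpointer asserts os.path.isfile(path) on the value passed from cfg.MODEL.WEIGHTS, and that value (swin_tiny_patch4_window7_224.pkl) is a relative path, so it is looked up from the directory where train_net.py is launched. Below is a minimal sketch of a pre-flight check that could be run before launching training to confirm the file is where the config expects it. The helper name check_checkpoint and its base_dir parameter are hypothetical and not part of OneFormer, detectron2, or fvcore. It only handles plain local paths, not URL-style weights.

```python
import os


def check_checkpoint(path, base_dir=None):
    """Return the absolute path of a local checkpoint file, or raise.

    Hypothetical helper (not part of OneFormer/detectron2/fvcore).
    It mirrors the os.path.isfile check seen in the traceback, but
    reports the resolved location so a wrong working directory is
    easy to spot.
    """
    # Relative paths are resolved against base_dir (default: current dir),
    # which is how a bare file name in MODEL.WEIGHTS ends up being looked up.
    if base_dir is None:
        base_dir = os.getcwd()
    resolved = path if os.path.isabs(path) else os.path.join(base_dir, path)
    resolved = os.path.abspath(resolved)

    if not os.path.isfile(resolved):
        raise FileNotFoundError(
            "Checkpoint not found at {} (looked up relative to {})".format(
                resolved, base_dir
            )
        )
    return resolved
```

Calling check_checkpoint("swin_tiny_patch4_window7_224.pkl") from the OneFormer directory raises FileNotFoundError with the full resolved path if the backbone weights are missing, which makes it clear whether the file needs to be placed there or MODEL.WEIGHTS needs to point somewhere else.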