Hi, @qgq99, thanks for your interest in our work. Happy to answer your question; no offense taken.
Could you share your training command? When you execute the command, you may pass a custom path for the pretrained weights using something like the command below for Swin-L OneFormer:
python train_net.py --dist-url 'tcp://127.0.0.1:50163' \
--num-gpus 8 \
--config-file configs/ade20k/swin/oneformer_swin_large_bs16_160k.yaml \
MODEL.WEIGHTS <PATH-TO-CHECKPOINT-HERE> \
OUTPUT_DIR outputs/ade20k_swin_large WANDB.NAME ade20k_swin_large
You can get the pretrained weights using the instructions here.
Hi, @praeclarumjj3, thank you for your reply!
The training command I used is in the same format as shown in the file GETTING_STARTED.md, specifically as follows:
python train_net.py --dist-url 'tcp://127.0.0.1:50163' \
--num-gpus 2 \
--config-file configs/ade20k/oneformer_swin_tiny_bs16_160k.yaml \
OUTPUT_DIR outputs/ade20k_swin_large WANDB.NAME ade20k_swin_tiny
@qgq99, you need to specify MODEL.WEIGHTS, as I suggested in my previous comment. Please try the following command:
python train_net.py --dist-url 'tcp://127.0.0.1:50163' \
--num-gpus 2 \
--config-file configs/ade20k/oneformer_swin_tiny_bs16_160k.yaml \
MODEL.WEIGHTS <PATH-TO-CHECKPOINT-HERE> \
OUTPUT_DIR outputs/ade20k_swin_large WANDB.NAME ade20k_swin_tiny
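As a rough illustration of why the trailing MODEL.WEIGHTS pair matters: assuming train_net.py follows the usual detectron2 pattern, everything after --config-file is collected as KEY VALUE pairs and merged over the YAML config. The sketch below only mimics that pairing step in plain Python; parse_overrides and has_weights_override are hypothetical helpers for illustration, not part of OneFormer or detectron2.

```python
def parse_overrides(opts):
    """Turn ['KEY', 'VALUE', ...] into a dict (hypothetical helper)."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    return dict(zip(opts[0::2], opts[1::2]))


def has_weights_override(overrides):
    """True if MODEL.WEIGHTS was passed on the command line (hypothetical helper)."""
    return bool(overrides.get("MODEL.WEIGHTS"))


# The trailing arguments of the command above, as a list of tokens.
opts = [
    "MODEL.WEIGHTS", "<PATH-TO-CHECKPOINT-HERE>",
    "OUTPUT_DIR", "outputs/ade20k_swin_large",
    "WANDB.NAME", "ade20k_swin_tiny",
]
overrides = parse_overrides(opts)

if not has_weights_override(overrides):
    print("MODEL.WEIGHTS not given; the value from the config file will be used")
elif overrides["MODEL.WEIGHTS"].startswith("<"):
    print("Replace the placeholder with the real checkpoint path before training")
```

With the earlier command that omitted MODEL.WEIGHTS, the first branch would apply, which is consistent with the checkpoint path coming from the config file.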
@praeclarumjj3 I have successfully started training. Thanks a lot again for your help and this excellent work!
I appreciate your effort in the research work on OneFormer. I got this error when I started an independent training:
I find it occurs because of the call to the method trainer.resume_or_load(resume=args.resume), which is located at line 424 of the file train_net.py. It will try to load a checkpoint from the MODEL.WEIGHTS entry of the config file. I don't know how to get it; could someone help me? This may be a low-level mistake. I'm a student, so please don't take offense.
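For reference, the decision that resume_or_load makes can be sketched roughly as follows. This is a simplified, standalone approximation of the detectron2 behaviour as I understand it (resume from the last_checkpoint marker in OUTPUT_DIR if resume is set and the marker exists, otherwise fall back to MODEL.WEIGHTS). resolve_checkpoint and is_local_file_missing are hypothetical helpers, not the library's API.

```python
import os


def resolve_checkpoint(output_dir, model_weights, resume):
    """Return the checkpoint path that would be loaded (hypothetical helper).

    Assumption: the trainer writes a 'last_checkpoint' file in output_dir
    that contains the file name of the most recent checkpoint.
    """
    marker = os.path.join(output_dir, "last_checkpoint")
    if resume and os.path.isfile(marker):
        with open(marker) as f:
            return os.path.join(output_dir, f.read().strip())
    # Independent training: fall back to MODEL.WEIGHTS from the config.
    return model_weights


def is_local_file_missing(path):
    """True if path looks like a local file that does not exist (hypothetical helper)."""
    return bool(path) and "://" not in path and not os.path.isfile(path)


if __name__ == "__main__":
    # Placeholder values; substitute the ones from your own config.
    path = resolve_checkpoint("outputs/ade20k_swin_tiny", "<PATH-TO-CHECKPOINT-HERE>", resume=False)
    if is_local_file_missing(path):
        print(f"Checkpoint not found: {path}")
```

Running a check like this before launching training shows which file the trainer would try to open, so a missing pretrained checkpoint can be spotted early.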