Closed wangzhaoyang-508 closed 9 months ago
Sorry for the late reply. We did not implement the hydra and slurm scripts ourselves, so you may want to refer to this PR for more details: https://github.com/IDEA-Research/detrex/pull/215
This is the same bug as #216. I tried the following (tested):
pip uninstall detrex && pip uninstall detectron2 && pip install -e . && pip install -e detectron2
and the problem was solved.
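The thread does not say why reinstalling helped. One plausible explanation, which is only an assumption here, is that Python was importing a stale copy of detrex or detectron2 rather than the checked-out source tree. A minimal sketch for checking where each package would be imported from, using only the standard library (the helper name `package_location` is made up for illustration):

```python
import importlib.util


def package_location(name):
    """Return the path a top-level package would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised for some broken or half-initialized installs.
        return None
    if spec is None:
        # Package is not installed in this environment.
        return None
    if spec.origin:
        return spec.origin
    # Namespace packages have no origin; fall back to their search path.
    locations = spec.submodule_search_locations
    return list(locations)[0] if locations else None


# After `pip install -e .`, these paths are expected to point into the
# cloned repository rather than into site-packages (assumption, not verified).
for pkg in ("detrex", "detectron2"):
    print(pkg, "->", package_location(pkg))
```

If either path points somewhere other than the repository you are editing, that would be consistent with the reinstall above fixing the problem.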
Seems like the problem has been solved, so I'm closing this issue, feel free to reopen it if needed.
When I train with hydra and slurm, the configuration that the project reads from the command line appears to be treated as a .yaml file instead of a .py file. However, when I use train_net.py, training works fine.
see the code
enter
python tools/hydra_train_net.py \
    num_machines=2 num_gpus=4 auto_output_dir=true \
    config_file=projects/dino/configs/dino-resnet/dino_r50_4scale_12ep_custom1.py \
    +model.num_queries=50 \
    +slurm=Nvidia_A800
.sh file
logs
error log