Open aryanmangal769 opened 10 months ago
We train the model for 150 epochs, so the 38th epoch may still be in the warm-up phase. Maybe you can try loading some pretrained weights to speed up training?
@aryanmangal769 bro! How do you train the model on one GPU?
@me I added os.environ['MASTER_PORT'] = '8889' in main.py
It is not related to the port. Please set --nproc_per_node=1.
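For anyone landing here later, a rough sketch of what a single-process setup needs. When the script is started through the PyTorch launcher with --nproc_per_node=1, the launcher exports the rendezvous variables itself. If main.py is run directly instead, the variables that torch.distributed's env:// initialization reads would have to be set by hand. The helper name configure_single_process_env is hypothetical, and whether this repo's main.py uses env:// initialization is an assumption.

```python
import os


def configure_single_process_env(port: str = "8889") -> dict:
    """Set rendezvous variables for a one-process, one-GPU run.

    Hypothetical helper, not part of the repo. Values that are already
    set (for example by the launcher) are left untouched.
    """
    defaults = {
        "MASTER_ADDR": "127.0.0.1",  # rendezvous on the local machine
        "MASTER_PORT": port,         # any free port; 8889 is the value from this thread
        "RANK": "0",                 # the only process is rank 0
        "WORLD_SIZE": "1",           # exactly one process in total
        "LOCAL_RANK": "0",           # first (and only) GPU on this node
    }
    for key, value in defaults.items():
        os.environ.setdefault(key, value)
    # Return the effective values so they can be logged or checked.
    return {key: os.environ[key] for key in defaults}
```

Calling configure_single_process_env() at the top of main.py, before any distributed initialization, would cover the direct-run case. With the launcher and --nproc_per_node=1 it should be a no-op, because setdefault keeps whatever the launcher already exported.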
When I try to train on a single GPU, the error keeps increasing and I cannot see any good results even by the 38th epoch.
train_class_error starts at 97.88, and from the 19th to the 37th epoch it is consistently 100. Can you debug this?
Please let me know if you need some more information
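In case it helps with narrowing this down, here is a small sketch for locating the stretch where the metric is stuck. It assumes the per-epoch train_class_error values have already been pulled out of the training log into a plain Python list. The function name longest_saturated_run and the 99.9 threshold are illustrative choices, not anything from the repo.

```python
from typing import Optional, Sequence, Tuple


def longest_saturated_run(
    errors: Sequence[float], threshold: float = 99.9
) -> Tuple[Optional[int], int]:
    """Return (start_epoch, length) of the longest run with error >= threshold.

    Epochs are counted from 0 in list order. Returns (None, 0) if the
    error never reaches the threshold.
    """
    best_start, best_len = None, 0
    run_start, run_len = None, 0
    for epoch, err in enumerate(errors):
        if err >= threshold:
            if run_len == 0:
                run_start = epoch  # a new saturated stretch begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # the stretch is broken; start counting again
    return best_start, best_len
```

Knowing the first epoch at which the error pins at 100 makes it easier to line the collapse up with whatever else changes around that point, such as a learning-rate schedule step.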