Open Rohinivv96 opened 1 year ago
What output is this? Stdout? Logfile?
Usually during training you should see nice output with loss, epoch number, metrics, etc. in your console:
===========================================================
Train epoch 33: 100%|██████████| 250/250 [05:10<00:00, 1.24s/it, DEKRLoss/heatmap=0.00039, DEKRLoss/offset=0.000749, DEKRLoss/total=0.00114, gpu_mem=13.4]
Validation epoch 33: 100%|██████████| 20/20 [00:24<00:00, 1.23s/it]
===========================================================
SUMMARY OF EPOCH 33
├── Training
│   ├── Dekrloss/heatmap = 0.0004
│   │   ├── Best until now = 0.0004 (↘ -0.0)
│   │   └── Epoch N-1 = 0.0004 (↘ -0.0)
│   ├── Dekrloss/offset = 0.0007
│   │   ├── Best until now = 0.0007 (↗ 0.0)
│   │   └── Epoch N-1 = 0.0007 (↗ 0.0)
│   └── Dekrloss/total = 0.0011
│       ├── Best until now = 0.0011 (↘ -0.0)
│       └── Epoch N-1 = 0.0011 (↘ -0.0)
└── Validation
    ├── Ap = 0.351
    │   ├── Best until now = 0.3775 (↘ -0.0265)
    │   └── Epoch N-1 = 0.3705 (↘ -0.0195)
    ├── Ar = 0.4672
    │   ├── Best until now = 0.4898 (↘ -0.0226)
    │   └── Epoch N-1 = 0.4898 (↘ -0.0226)
    ├── Dekrloss/heatmap = 0.0003
    │   ├── Best until now = 0.0002 (↗ 0.0)
    │   └── Epoch N-1 = 0.0002 (↗ 0.0)
    ├── Dekrloss/offset = 0.0008
    │   ├── Best until now = 0.0007 (↗ 0.0)
    │   └── Epoch N-1 = 0.0007 (↗ 0.0)
    └── Dekrloss/total = 0.001
        ├── Best until now = 0.001 (↗ 0.0001)
        └── Epoch N-1 = 0.001 (↗ 0.0001)
@Rohinivv96 could this be related to your issue? https://github.com/Deci-AI/super-gradients/issues/1082#issuecomment-1561991026
@BloodAxe Yes. I ran the following command for training: trainer.train(model=model, training_params=train_params, train_loader=train_data, valid_loader=val_data). But I am not getting output on my console like the one you showed above. I am just getting the following:
Is it stdout or log file? Colab or launch from command line? DDP or single GPU? Linux or Windows? Please provide as much information as possible at once, since it's hard to guess in what environment you're doing your experiment and counter-productive to go back and forth with each question.
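For anyone answering the questions above, a small script along these lines might help gather the details in one go. It is only a sketch using the Python standard library: the function name describe_environment is made up for illustration, the Colab check is a common heuristic, and reading WORLD_SIZE assumes a torchrun-style launcher sets it for DDP runs.

```python
import os
import platform
import sys
from importlib import metadata


def describe_environment() -> dict:
    """Collect basic facts about where the training script is running."""
    try:
        sg_version = metadata.version("super-gradients")
    except metadata.PackageNotFoundError:
        sg_version = "not installed"
    return {
        "os": platform.system(),
        "python": platform.python_version(),
        "super_gradients": sg_version,
        # Heuristic: the google.colab module is usually loaded in Colab sessions.
        "colab": "google.colab" in sys.modules,
        # Assumption: a torchrun-style launcher exports WORLD_SIZE for DDP runs.
        "world_size": int(os.environ.get("WORLD_SIZE", "1")),
        # False when stdout is redirected to a file or a pipe.
        "stdout_is_tty": sys.stdout.isatty(),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Pasting the printed lines into the issue would cover the OS, Colab, and single-GPU versus DDP questions; whether the output goes to stdout or a log file still needs to be stated by hand.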
Hi @Rohinivv96, have you tried to set the value of silent_mode to False in the train_params? That should print all the information for you.
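As a rough sketch of that suggestion, assuming train_params is a plain dict as in the trainer.train call quoted earlier in this thread: the helper name with_console_output and the max_epochs value are placeholders for illustration, and the trainer.train line is left commented out because it needs the model and data loaders from your own script.

```python
def with_console_output(train_params: dict) -> dict:
    """Return a copy of train_params with silent_mode switched off."""
    updated = dict(train_params)
    updated["silent_mode"] = False
    return updated


# Placeholder values; keep whatever your own train_params already contains.
train_params = {"max_epochs": 10, "silent_mode": True}
train_params = with_console_output(train_params)
print(train_params["silent_mode"])

# trainer.train(model=model, training_params=train_params,
#               train_loader=train_data, valid_loader=val_data)
```

Copying the dict first keeps the original settings untouched, which can be handy when the same params are reused across runs.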
Related issue https://github.com/Deci-AI/super-gradients/issues/1289
💡 Your Question
I have initiated the training process for YOLO-NAS using a customized dataset. However, I am currently experiencing difficulty in displaying the ongoing epoch number. Please advise on what modifications I should make in order to enable the display of epoch number during the training process.
Versions
No response