The error log is attached below.
[2023-11-14 11:19:42] INFO - crash_tips_setup.py - Crash tips is enabled. You can set your environment variable to CRASH_HANDLER=FALSE to disable it
[2023-11-14 11:19:42] WARNING - __init__.py - Failed to import pytorch_quantization
[2023-11-14 11:19:43] WARNING - calibrator.py - Failed to import pytorch_quantization
[2023-11-14 11:19:43] WARNING - export.py - Failed to import pytorch_quantization
[2023-11-14 11:19:43] WARNING - selective_quantization_utils.py - Failed to import pytorch_quantization
[2023-11-14 11:19:43] INFO - detection_dataset.py - Dataset Initialization in progress. `cache_annotations=True` causes the process to take longer due to full dataset indexing.
Indexing dataset annotations: 100%|██████████| 87/87 [00:00<00:00, 17532.53it/s]
[2023-11-14 11:19:43] INFO - detection_dataset.py - Dataset Initialization in progress. `cache_annotations=True` causes the process to take longer due to full dataset indexing.
Indexing dataset annotations: 100%|██████████| 4/4 [00:00<00:00, 10174.18it/s]
[2023-11-14 11:19:43] INFO - checkpoint_utils.py - License Notification: YOLO-NAS pre-trained weights are subjected to the specific license terms and conditions detailed in https://github.com/Deci-AI/super-gradients/blob/master/LICENSE.YOLONAS.md
By downloading the pre-trained weight files you agree to comply with these terms.
[2023-11-14 11:19:43] INFO - checkpoint_utils.py - Successfully loaded pretrained weights for architecture yolo_nas_s
[2023-11-14 11:19:44] INFO - sg_trainer.py - Starting a new run with `run_id=RUN_20231114_111944_239051`
[2023-11-14 11:19:44] INFO - sg_trainer.py - Checkpoints directory: checkpoints/ylsff2/RUN_20231114_111944_239051
[2023-11-14 11:19:44] INFO - sg_trainer.py - Using EMA with params {'decay': 0.9, 'decay_type': 'threshold'}
The console stream is now moved to checkpoints/ylsff2/RUN_20231114_111944_239051/console_Nov14_11_19_44.txt
/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/numpy/lib/arraypad.py:487: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
x = np.array(x)
/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/numpy/lib/arraypad.py:487: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
x = np.array(x)
[2023-11-14 11:19:47] INFO - sg_trainer_utils.py - TRAINING PARAMETERS:
Gradient updates per epoch: 5 (len(train_loader) / batch_accumulate)
[2023-11-14 11:19:47] INFO - sg_trainer.py - Started training for 201 epochs (0/200)
Train epoch 0: 0%| | 0/5 [00:00<?, ?it/s]/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/numpy/lib/arraypad.py:487: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
x = np.array(x)
/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/numpy/lib/arraypad.py:487: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
x = np.array(x)
Train epoch 0: 100%|██████████| 5/5 [00:04<00:00, 1.19it/s, PPYoloELoss/loss=4, PPYoloELoss/loss_cls=2.11, PPYoloELoss/loss_dfl=0.851, PPYoloELoss/loss_iou=1.04, gpu_mem=5.47]
Validating: | | 0/0 [00:00<?, ?it/s]
/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: The compute method of metric DetectionMetrics_050 was called before the update method which may lead to errors, as metric states have not yet been updated.
warnings.warn(*args, **kwargs)
[2023-11-14 11:19:51] INFO - base_sg_logger.py - [CLEANUP] - Successfully stopped system monitoring process
[2023-11-14 11:19:51] ERROR - sg_trainer_utils.py - Uncaught exception
Traceback (most recent call last):
File "/home/lu/workspace/yolonas/test2.py", line 107, in <module>
trainer.train(
File "/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 1527, in train
self._write_to_disk_operations(
File "/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 1963, in _write_to_disk_operations
self._save_checkpoint(
File "/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 661, in _save_checkpoint
curr_tracked_metric = float(validation_results_dict[self.metric_to_watch])
KeyError: 'mAP@0.50'
Traceback (most recent call last):
File "/home/lu/workspace/yolonas/test2.py", line 107, in <module>
trainer.train(
File "/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 1527, in train
self._write_to_disk_operations(
File "/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 1963, in _write_to_disk_operations
self._save_checkpoint(
File "/home/lu/anaconda3/envs/sg/lib/python3.8/site-packages/super_gradients/training/sg_trainer/sg_trainer.py", line 661, in _save_checkpoint
curr_tracked_metric = float(validation_results_dict[self.metric_to_watch])
KeyError: 'mAP@0.50'
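The failing line in the traceback is a plain dictionary lookup, so the `KeyError` means the validation results contained no `mAP@0.50` entry. Below is a minimal sketch of that failure mode in plain Python. `tracked_metric` is a hypothetical helper written for illustration and is not part of super_gradients.

```python
def tracked_metric(validation_results: dict, metric_to_watch: str) -> float:
    """Mimic the lookup from the traceback, with a more descriptive error."""
    if metric_to_watch not in validation_results:
        available = sorted(validation_results)
        raise KeyError(
            f"{metric_to_watch!r} not found; available metrics: {available}"
        )
    return float(validation_results[metric_to_watch])


# Hypothetical results dict that has a loss entry but no mAP entry.
results = {"PPYoloELoss/loss": 0.0}
try:
    tracked_metric(results, "mAP@0.50")
except KeyError as err:
    print("lookup failed:", err)
```

Printing the available keys this way would show which metric names the validation step actually produced.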
💡 Your Question
And the code is as below.
If I comment out `valid_metric_list` and change `metric_to_watch` to `PPYoloELoss/loss_cls`, training runs, but validation still shows errors. The log is as follows:

SUMMARY OF EPOCH 1
├── Train
│   ├── Ppyoloeloss/loss_cls = 1.8052
│   │   ├── Epoch N-1 = 1.8946 (↘ -0.0894)
│   │   └── Best until now = 1.8946 (↘ -0.0894)
│   ├── Ppyoloeloss/loss_iou = 1.0784
│   │   ├── Epoch N-1 = 1.0313 (↗ 0.0471)
│   │   └── Best until now = 1.0313 (↗ 0.0471)
│   ├── Ppyoloeloss/loss_dfl = 0.9002
│   │   ├── Epoch N-1 = 0.8675 (↗ 0.0327)
│   │   └── Best until now = 0.8675 (↗ 0.0327)
│   └── Ppyoloeloss/loss = 3.7839
│       ├── Epoch N-1 = 3.7934 (↘ -0.0095)
│       └── Best until now = 3.7934 (↘ -0.0095)
└── Validation
    ├── Ppyoloeloss/loss_cls = 0
    │   ├── Epoch N-1 = 0 (= 0)
    │   └── Best until now = 0 (= 0)
    ├── Ppyoloeloss/loss_iou = None
    ├── Ppyoloeloss/loss_dfl = None
    └── Ppyoloeloss/loss = None
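One possible explanation, stated as an assumption rather than a confirmed diagnosis: the `Validating: 0/0` line in the error log suggests the validation loader yielded no batches, which would also be consistent with the `None` validation losses above. With only 4 validation samples, a loader that drops incomplete batches produces zero batches whenever the batch size exceeds 4. The sketch below shows the arithmetic. The batch size of 16 and `drop_last=True` are hypothetical values, since the dataloader code is not shown here.

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Number of batches a loader yields per epoch for the given settings."""
    if drop_last:
        # The incomplete trailing batch is discarded.
        return num_samples // batch_size
    # The incomplete trailing batch is kept.
    return math.ceil(num_samples / batch_size)


# Sample counts come from the "Indexing dataset annotations" log lines;
# batch_size=16 and drop_last=True are assumptions.
print(num_batches(87, 16, drop_last=True))   # 5
print(num_batches(4, 16, drop_last=True))    # 0
print(num_batches(4, 16, drop_last=False))   # 1
```

Under these assumptions the train loader gives 5 batches, matching `0/5` in the log, while the validation loader gives none. A smaller validation batch size or `drop_last=False` would then let the metric receive at least one update.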
Versions
No response