Closed yichenwang231 closed 1 year ago
This might be normal for fully supervised training on UrbanSound8K with only 100 labels, where the model is learning nothing.
I compared the classifier weights of the model with epoch set to 8. The classifier weights used on the validation set at the end of the fourth epoch differ from the classifier weights used on the test set after training completed and the fourth-epoch checkpoint was loaded.
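A comparison like the one described above can be sketched as follows. This is a minimal illustration, not code from the repository: the helper `diff_state_dicts` and the parameter names `classifier.weight` and `classifier.bias` are hypothetical, and the toy NumPy arrays stand in for real weights. With PyTorch, the two inputs would presumably be state dicts captured at validation time and after reloading the checkpoint, moved to CPU first.

```python
import numpy as np


def diff_state_dicts(sd_a, sd_b, atol=0.0):
    """Return {name: max abs difference} for entries that differ.

    sd_a and sd_b map parameter names to array-likes (assumption: CPU
    tensors or NumPy arrays that np.asarray can convert).
    """
    diffs = {}
    for name in sorted(set(sd_a) | set(sd_b)):
        if name not in sd_a or name not in sd_b:
            diffs[name] = float("inf")  # present in only one of the two
            continue
        a = np.asarray(sd_a[name], dtype=np.float64)
        b = np.asarray(sd_b[name], dtype=np.float64)
        if a.shape != b.shape:
            diffs[name] = float("inf")  # shapes disagree, cannot compare
            continue
        max_abs = float(np.max(np.abs(a - b))) if a.size else 0.0
        if max_abs > atol:
            diffs[name] = max_abs
    return diffs


# Toy stand-ins with hypothetical parameter names.
weights_at_validation = {
    "classifier.weight": np.array([[0.1, 0.2], [0.3, 0.4]]),
    "classifier.bias": np.array([0.0, 0.0]),
}
weights_after_reload = {
    "classifier.weight": np.array([[0.1, 0.2], [0.3, 0.5]]),
    "classifier.bias": np.array([0.0, 0.0]),
}

differences = diff_state_dicts(weights_at_validation, weights_after_reload)
print(differences)  # only entries that differ are listed
```

An empty result would mean the two sets of weights are identical; any listed name points at a parameter that changed between the validation step and the reloaded checkpoint.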
This issue was closed because it has been stalled for 5 days with no activity.
Bug
When testing on UrbanSound8K, I ran training twice, once with 8 epochs and once with 4 epochs. In both runs the best model according to the validation set was the one stored at the fourth epoch, and the losses and confusion matrices (cf_mat) of the first four epochs are exactly the same. However, the test-set results obtained by loading the best model at the end of each run are different.
The logs are as follows:
8 epochs:
Semi-supervised-learning-main$ python train.py --c config/usb_audio/supervised/supervised_urbansound8k_100_0.yaml
train.py:185: UserWarning: You have chosen to seed training. This will turn on the CUDNN deterministic setting, which can slow down your training considerably! You may see unexpected behavior when restarting from checkpoints.
warnings.warn('You have chosen to seed training. '
[2023-10-22 14:25:27,612 INFO] Use GPU: None for training
/media/ubuntu20/D/wyc/fuxian/Semi-supervised-learning-main/semilearn/datasets/utils.py:38: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
data, targets = np.array(data), np.array(targets)
[2023-10-22 14:25:29,227 INFO] unlabeled data number: 7079, labeled data number 100
[2023-10-22 14:25:29,227 INFO] Create train and test data loaders
[2023-10-22 14:26:09,317 INFO] [!] data loader keys: dict_keys(['train_lb', 'train_ulb', 'eval', 'test'])
[2023-10-22 14:26:29,943 INFO] Create optimizer and scheduler
[2023-10-22 14:26:29,945 INFO] Number of Trainable Params: 94969994
[2023-10-22 14:26:31,640 INFO] Arguments: Namespace(algorithm='supervised', amp=False, batch_size=8, c='config/usb_audio/supervised/supervised_urbansound8k_100_0.yaml', clip=0.0, clip_grad=0, crop_ratio=0.875, data_dir='./data', dataset='urbansound8k', dist_backend='nccl', dist_url='tcp://127.0.0.1:29980', distributed=False, ema_m=0.0, epoch=8, eval_batch_size=16, gpu=None, imb_algorithm=None, img_size=32, include_lb_to_ulb=True, layer_decay=0.75, lb_dest_len=100, lb_imb_ratio=1, load_path='./saved_models/usb_audio/supervised_urbansound8k_100_0/latest_model.pth', lr=5e-05, max_length=512, max_length_seconds=4.0, momentum=0.9, multiprocessing_distributed=False, net='hubert_base', net_from_name=False, num_classes=10, num_eval_iter=2048, num_labels=100, num_log_iter=256, num_train_iter=8192, num_warmup_iter=5120, num_workers=4, optim='AdamW', overwrite=True, pretrain_path='', rank=0, resume=False, sample_rate=16000, save_dir='./saved_models/usb_audio', save_name='supervised_urbansound8k_100_0', seed=0, train_sampler='RandomSampler', ulb_dest_len=7079, ulb_imb_ratio=1, ulb_loss_ratio=1.0, ulb_num_labels=None, uratio=1, use_aim=False, use_cat=False, use_pretrain=False, use_tensorboard=True, use_wandb=False, weight_decay=2e-05, world_size=1)
[2023-10-22 14:26:31,641 INFO] Resume load path ./saved_models/usb_audio/supervised_urbansound8k_100_0/latest_model.pth does not exist
[2023-10-22 14:26:31,641 INFO] Model training
[2023-10-22 14:26:48,589 INFO] 256 iteration USE_EMA: False, train/sup_loss: 2.2898, train/run_time: 0.0554, lr: 0.0000, train/prefecth_time: 0.0014
[2023-10-22 14:27:05,295 INFO] 512 iteration USE_EMA: False, train/sup_loss: 2.2459, train/run_time: 0.0535, lr: 0.0000, train/prefecth_time: 0.0012
[2023-10-22 14:27:21,938 INFO] 768 iteration USE_EMA: False, train/sup_loss: 2.0217, train/run_time: 0.0554, lr: 0.0000, train/prefecth_time: 0.0011
[2023-10-22 14:27:38,534 INFO] 1024 iteration USE_EMA: False, train/sup_loss: 1.6001, train/run_time: 0.0537, lr: 0.0000, train/prefecth_time: 0.0010
[2023-10-22 14:27:55,515 INFO] 1280 iteration USE_EMA: False, train/sup_loss: 1.1751, train/run_time: 0.0539, lr: 0.0000, train/prefecth_time: 0.0014
[2023-10-22 14:28:12,193 INFO] 1536 iteration USE_EMA: False, train/sup_loss: 0.6881, train/run_time: 0.0555, lr: 0.0000, train/prefecth_time: 0.0011
[2023-10-22 14:28:28,843 INFO] 1792 iteration USE_EMA: False, train/sup_loss: 0.4191, train/run_time: 0.0575, lr: 0.0000, train/prefecth_time: 0.0012
[2023-10-22 14:28:45,480 INFO] validating...
[2023-10-22 14:28:47,310 INFO] confusion matrix: [[0.19 0. 0.11 0.07 0.06 0.16
4 epochs:
Semi-supervised-learning-main$ python train.py --c config/usb_audio/supervised/supervised_urbansound8k_100_0.yaml
train.py:185: UserWarning: You have chosen to seed training. This will turn on the CUDNN deterministic setting, which can slow down your training considerably! You may see unexpected behavior when restarting from checkpoints.
warnings.warn('You have chosen to seed training. '
[2023-10-22 14:36:31,817 INFO] Use GPU: None for training
/media/ubuntu20/D/wyc/fuxian/Semi-supervised-learning-main/semilearn/datasets/utils.py:38: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
data, targets = np.array(data), np.array(targets)
[2023-10-22 14:36:33,448 INFO] unlabeled data number: 7079, labeled data number 100
[2023-10-22 14:36:33,448 INFO] Create train and test data loaders
[2023-10-22 14:37:13,542 INFO] [!] data loader keys: dict_keys(['train_lb', 'train_ulb', 'eval', 'test'])
[2023-10-22 14:37:34,191 INFO] Create optimizer and scheduler
[2023-10-22 14:37:34,193 INFO] Number of Trainable Params: 94969994
[2023-10-22 14:37:35,883 INFO] Arguments: Namespace(algorithm='supervised', amp=False, batch_size=8, c='config/usb_audio/supervised/supervised_urbansound8k_100_0.yaml', clip=0.0, clip_grad=0, crop_ratio=0.875, data_dir='./data', dataset='urbansound8k', dist_backend='nccl', dist_url='tcp://127.0.0.1:22474', distributed=False, ema_m=0.0, epoch=4, eval_batch_size=16, gpu=None, imb_algorithm=None, img_size=32, include_lb_to_ulb=True, layer_decay=0.75, lb_dest_len=100, lb_imb_ratio=1, load_path='./saved_models/usb_audio/supervised_urbansound8k_100_0/latest_model.pth', lr=5e-05, max_length=512, max_length_seconds=4.0, momentum=0.9, multiprocessing_distributed=False, net='hubert_base', net_from_name=False, num_classes=10, num_eval_iter=2048, num_labels=100, num_log_iter=256, num_train_iter=4096, num_warmup_iter=5120, num_workers=4, optim='AdamW', overwrite=True, pretrain_path='', rank=0, resume=False, sample_rate=16000, save_dir='./saved_models/usb_audio', save_name='supervised_urbansound8k_100_0', seed=0, train_sampler='RandomSampler', ulb_dest_len=7079, ulb_imb_ratio=1, ulb_loss_ratio=1.0, ulb_num_labels=None, uratio=1, use_aim=False, use_cat=False, use_pretrain=False, use_tensorboard=True, use_wandb=False, weight_decay=2e-05, world_size=1)
[2023-10-22 14:37:35,883 INFO] Resume load path ./saved_models/usb_audio/supervised_urbansound8k_100_0/latest_model.pth does not exist
[2023-10-22 14:37:35,883 INFO] Model training
[2023-10-22 14:37:52,823 INFO] 256 iteration USE_EMA: False, train/sup_loss: 2.2898, train/run_time: 0.0554, lr: 0.0000, train/prefecth_time: 0.0010
[2023-10-22 14:38:09,517 INFO] 512 iteration USE_EMA: False, train/sup_loss: 2.2459, train/run_time: 0.0534, lr: 0.0000, train/prefecth_time: 0.0010
[2023-10-22 14:38:26,149 INFO] 768 iteration USE_EMA: False, train/sup_loss: 2.0217, train/run_time: 0.0554, lr: 0.0000, train/prefecth_time: 0.0011
[2023-10-22 14:38:42,741 INFO] 1024 iteration USE_EMA: False, train/sup_loss: 1.6001, train/run_time: 0.0532, lr: 0.0000, train/prefecth_time: 0.0010
[2023-10-22 14:38:59,719 INFO] 1280 iteration USE_EMA: False, train/sup_loss: 1.1751, train/run_time: 0.0536, lr: 0.0000, train/prefecth_time: 0.0008
[2023-10-22 14:39:16,388 INFO] 1536 iteration USE_EMA: False, train/sup_loss: 0.6881, train/run_time: 0.0555, lr: 0.0000, train/prefecth_time: 0.0012
[2023-10-22 14:39:33,051 INFO] 1792 iteration USE_EMA: False, train/sup_loss: 0.4191, train/run_time: 0.0574, lr: 0.0000, train/prefecth_time: 0.0012
[2023-10-22 14:39:49,708 INFO] validating...
[2023-10-22 14:39:51,540 INFO] confusion matrix: [[0.19 0. 0.11 0.07 0.06 0.16
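
The claim that both runs log identical losses can be checked mechanically rather than by eye. The sketch below is only an illustration: `parse_losses` and `compare_runs` are hypothetical helpers, the regular expression assumes the `N iteration ... train/sup_loss: X` pattern seen in the logs above, and the two short strings are excerpts copied from those logs rather than the full files.

```python
import re

# Matches "<iteration> iteration ... train/sup_loss: <value>" entries,
# even when a log entry is wrapped across lines (re.S).
LOSS_RE = re.compile(r"(\d+) iteration .*?train/sup_loss: ([0-9.]+)", re.S)


def parse_losses(log_text):
    """Map iteration number to the logged train/sup_loss value."""
    return {int(it): float(loss) for it, loss in LOSS_RE.findall(log_text)}


def compare_runs(losses_a, losses_b):
    """List (iteration, loss_a, loss_b) for shared iterations that differ."""
    shared = sorted(set(losses_a) & set(losses_b))
    return [(it, losses_a[it], losses_b[it])
            for it in shared if losses_a[it] != losses_b[it]]


# Excerpts from the 8-epoch and 4-epoch logs.
log_8_epochs = (
    "[2023-10-22 14:26:48,589 INFO] 256 iteration USE_EMA: False, "
    "train/sup_loss: 2.2898, train/run_time: 0.0554, lr: 0.0000\n"
    "[2023-10-22 14:27:05,295 INFO] 512 iteration USE_EMA: False, "
    "train/sup_loss: 2.2459, train/run_time: 0.0535, lr: 0.0000\n"
)
log_4_epochs = (
    "[2023-10-22 14:37:52,823 INFO] 256 iteration USE_EMA: False, "
    "train/sup_loss: 2.2898, train/run_time: 0.0554, lr: 0.0000\n"
    "[2023-10-22 14:38:09,517 INFO] 512 iteration USE_EMA: False, "
    "train/sup_loss: 2.2459, train/run_time: 0.0534, lr: 0.0000\n"
)

mismatches = compare_runs(parse_losses(log_8_epochs), parse_losses(log_4_epochs))
print(mismatches)  # an empty list means the shared iterations agree
```

Since the logs print losses to four decimal places, agreement here only shows the runs match at that precision; it does not rule out smaller differences in the underlying weights.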