Open incomplete59 opened 3 hours ago
Oh, I solved this: creating the directory 'checkpoints/' before training fixes the error. By the way, I found that the URLs of the CWRU dataset in metadata.txt are outdated, so when I tried to download the dataset, it failed with 'hostname mismatch'.
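For anyone who hits the same error, here is a minimal sketch of doing that directory creation in code rather than by hand. The helper name `checkpoint_path` is hypothetical and not part of the repository. The demo writes a placeholder file into a temporary directory instead of calling `torch.save`, so it runs without PyTorch.

```python
import os
import tempfile


def checkpoint_path(path_weights: str, file_name: str) -> str:
    """Create path_weights if it is missing and return the full file path.

    Hypothetical helper for illustration; not part of the repository.
    """
    os.makedirs(path_weights, exist_ok=True)  # no error if it already exists
    return os.path.join(path_weights, file_name)


# Demo in a temporary directory so nothing in the working tree is touched.
with tempfile.TemporaryDirectory() as tmp:
    target = checkpoint_path(
        os.path.join(tmp, "checkpoints"),
        "Net_1shot_0.8867_30samples.pth",
    )
    # Opening for binary write is the step that failed in the traceback;
    # it succeeds now because the parent directory exists.
    with open(target, "wb") as f:
        f.write(b"placeholder")
    saved_ok = os.path.isfile(target)

# Assumed equivalent change in train_1shot.py (not the author's code):
#     torch.save(net, checkpoint_path(path_weight, model_name))
```

Running `mkdir -p checkpoints` once before training has the same effect. The code version just removes the manual step.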
Hi~ After training finished for the first epoch and the model was evaluated on the test set, the run crashed.
(zdl_py_310) fengpi@fengpi-07:~/zdl/Few-shot-via-ensembling-Transformer-with-Mahalanobis-distance$ python train_1shot.py --dataset 'CWRU' --training_samples_CWRU 30 --training_samples_PDB 195 --model_name 'Net'
/home/fengpi/anaconda3/envs/zdl_py_310/lib/python3.10/site-packages/sklearn/utils/_param_validation.py:11: UserWarning: A NumPy version >=1.23.5 and <2.3.0 is required for this version of SciPy (detected version 1.23.0)
from scipy.sparse import csr_matrix, issparse
Namespace(dataset='CWRU', training_samples_CWRU=30, training_samples_PDB=195, model_name='Net', episode_num_train=130, episode_num_test=150, way_num_CWRU=10, noise_DB=None, way_num_PDB=13, spectrum=False, device='cuda', batch_size=1, path_weights='checkpoints/', lr=0.001, step_size=10, gamma=0.1, num_epochs=100, loss1=ContrastiveLoss(), loss2=CrossEntropyLoss())
Shape of CWRU train data: torch.Size([30, 1, 64, 64])
Shape of CWRU test data: torch.Size([750, 1, 64, 64])
End Loading CWRU training.........................!!
================================================== Epoch: 1 ==================================================
Epoch 1/100: 100%|██████████| 130/130 [02:41<00:00, 1.25s/batch, loss=2.49, loss_1=0.982, loss_2=1.51]
Epoch 1/100 completed in 2.70 minutes
------------Testing on the test set-------------
Accuracy on the test set: 0.8867
Traceback (most recent call last):
File "/home/fengpi/zdl/Few-shot-via-ensembling-Transformer-with-Mahalanobis-distance/train_1shot.py", line 210, in
train_and_test_model_ensemble(net,
File "/home/fengpi/zdl/Few-shot-via-ensembling-Transformer-with-Mahalanobis-distance/train_1shot.py", line 197, in train_and_test_model_ensemble
torch.save(net, path_weight + model_name)
File "/home/fengpi/anaconda3/envs/zdl_py_310/lib/python3.10/site-packages/torch/serialization.py", line 376, in save
with _open_file_like(f, 'wb') as opened_file:
File "/home/fengpi/anaconda3/envs/zdl_py_310/lib/python3.10/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/home/fengpi/anaconda3/envs/zdl_py_310/lib/python3.10/site-packages/torch/serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'checkpoints/Net_1shot_0.8867_30samples.pth'
I don't know where the model file 'Net_1shot_0.8867_30samples.pth' is supposed to be. Can you help me with this problem? Thanks!