Hello,
I am using your code to train a model on a different dataset.
However, when I put only the training set (without the test set) in the "data" folder, training fails with the error below.
Do we need a test set during the training phase?
Can the trained model be used to clean other test sets?
The command and its full output:
$ python3 train.py --data_dir /scratch/thinh/CMGAN/20240227_Backup_alarm_5dB/CMGAN/src/data/
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
Namespace(batch_size=1, cut_len=32000, data_dir='/scratch/thinh/CMGAN/20240227_Backup_alarm_5dB/CMGAN/src/data/', decay_epoch=30, epochs=20, init_lr=0.0005, log_interval=500, loss_weights=[0.1, 0.9, 0.2, 0.05], save_model_dir='./saved_model')
['Tesla V100-SXM2-16GB', 'Tesla V100-SXM2-16GB']
Traceback (most recent call last):
File "train.py", line 298, in <module>
mp.spawn(main, args=(world_size, args), nprocs=world_size)
File "/home/thinh/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/thinh/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/home/thinh/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/thinh/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/scratch/thinh/CMGAN/20240227_Backup_alarm_5dB/CMGAN/src/train.py", line 288, in main
args.data_dir, args.batch_size, 2, args.cut_len
File "/scratch/thinh/CMGAN/20240227_Backup_alarm_5dB/CMGAN/src/data/dataloader.py", line 60, in load_data
test_ds = DemandDataset(test_dir, cut_len)
File "/scratch/thinh/CMGAN/20240227_Backup_alarm_5dB/CMGAN/src/data/dataloader.py", line 18, in __init__
self.clean_wav_name = os.listdir(self.clean_dir)
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/thinh/CMGAN/20240227_Backup_alarm_5dB/CMGAN/src/data/test/clean'
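
The traceback shows that load_data builds a dataset from data_dir/test/clean, so that folder has to exist before training starts. A possible workaround, sketched below, is to hold out a small part of the training files as a test set. This is only a sketch under assumptions: the traceback confirms the test/clean path, but the "noisy" subfolder name, the matching file names across subfolders, and the helper make_holdout_split with its parameters are hypothetical and not part of the repository.

```python
import os
import random
import shutil


def make_holdout_split(data_dir, fraction=0.1, seed=0, subdirs=("clean", "noisy")):
    """Move a random fraction of the training files into data_dir/test/<subdir>.

    Assumption (not confirmed by the repository): every subdir holds files
    with identical names, e.g. train/clean/x.wav pairs with train/noisy/x.wav.
    """
    train_dir = os.path.join(data_dir, "train")
    test_dir = os.path.join(data_dir, "test")

    # Use the first subdir as the reference list of file names.
    names = sorted(os.listdir(os.path.join(train_dir, subdirs[0])))
    random.Random(seed).shuffle(names)

    # Hold out at least one file so the test folders are never empty.
    n_held = max(1, int(len(names) * fraction))
    held = names[:n_held]

    for sub in subdirs:
        os.makedirs(os.path.join(test_dir, sub), exist_ok=True)
        for name in held:
            shutil.move(
                os.path.join(train_dir, sub, name),
                os.path.join(test_dir, sub, name),
            )
    return held
```

Calling make_holdout_split on the directory passed as --data_dir, once before running train.py, would create the test folders. The files are moved rather than copied so that the held-out clips are not also seen during training.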