1. First, is this line in main.py the right place to set the pretrained model to load?
parser.add_argument('--resume_train', type=str, default='model_50.pth', help='Weights resumed in training')
After model_50.pth is loaded, training fails with the following error:
loaded weights from model_50.pth, epoch 50
Traceback (most recent call last):
File "main.py", line 60, in <module>
ctrbox_obj.train_network(args)
File "/home/bishe/guo/BBAVectors-master/train.py", line 91, in train_network
strict=True)
File "/home/bishe/guo/BBAVectors-master/train.py", line 70, in load_model
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
File "/home/bishe/anaconda3/envs/BBAVector/lib/python3.6/site-packages/torch/optim/optimizer.py", line 123, in load_state_dict
raise ValueError("loaded state dict contains a parameter group "
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group
How can this error be resolved?
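For context, PyTorch raises this ValueError when a parameter group in the saved optimizer state holds a different number of parameters than the matching group in the current optimizer, which suggests the checkpoint was written for a differently configured model or optimizer. The following is a minimal diagnostic sketch, not the repository's own code: `optimizer_state_is_compatible` is a hypothetical helper, and it relies only on the `param_groups` / `params` layout that `optimizer.state_dict()` returns.

```python
def optimizer_state_is_compatible(saved_state, current_state):
    """Return True when both optimizer state dicts have the same group layout.

    Hypothetical helper: compares the number of parameter groups and the
    number of parameters inside each group, which is what
    Optimizer.load_state_dict validates before restoring state.
    """
    saved_groups = saved_state["param_groups"]
    current_groups = current_state["param_groups"]
    if len(saved_groups) != len(current_groups):
        return False
    return all(
        len(saved["params"]) == len(current["params"])
        for saved, current in zip(saved_groups, current_groups)
    )


# Intended use inside a load_model-style function (variable names assumed):
#   saved = checkpoint['optimizer_state_dict']
#   if optimizer_state_is_compatible(saved, optimizer.state_dict()):
#       optimizer.load_state_dict(saved)
#   else:
#       print('optimizer state skipped: parameter groups differ')
```

If the check fails, one option is to restore only the model weights and continue with a freshly initialized optimizer, at the cost of losing momentum and learning-rate state from the checkpoint.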
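2. Training from scratch, without loading a pretrained model
During training the loss stays at nan and never changes, and then the run crashes with: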
nohup: ignoring input
Let's use 2 GPUs!
Setting up data...
Starting training...
Epoch: 1/80
Traceback (most recent call last):
File "main.py", line 60, in <module>
ctrbox_obj.train_network(args)
File "/home/bishe/guo/BBAVectors-master/train.py", line 132, in train_network
criterion=criterion)
File "/home/bishe/guo/BBAVectors-master/train.py", line 160, in run_epoch
for data_dict in data_loader:
File "/home/bishe/anaconda3/envs/BBAVector/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
data = self._next_data()
File "/home/bishe/anaconda3/envs/BBAVector/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
return self._process_data(data)
File "/home/bishe/anaconda3/envs/BBAVector/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
data.reraise()
File "/home/bishe/anaconda3/envs/BBAVector/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
AttributeError: Caught AttributeError in DataLoader worker process 3.
Original Traceback (most recent call last):
File "/home/bishe/anaconda3/envs/BBAVector/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
data = fetcher.fetch(index)
File "/home/bishe/anaconda3/envs/BBAVector/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/bishe/anaconda3/envs/BBAVector/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/bishe/guo/BBAVectors-master/datasets/base.py", line 257, in __getitem__
image_h, image_w, c = image.shape
AttributeError: 'NoneType' object has no attribute 'shape'
What is causing this error?
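The final line of the traceback shows that `image` is `None` when `image.shape` is read. Assuming the dataset loads images with `cv2.imread` (an assumption, since that code is not shown here), this is the usual symptom of a missing or unreadable file, because `cv2.imread` returns `None` instead of raising. A minimal sketch for checking the dataset before training follows; `find_unreadable_images`, the directory layout, and the `.png` extension are all hypothetical and should be adapted to the actual data.

```python
import os


def find_unreadable_images(image_dir, image_ids, ext=".png"):
    """Return the paths that are missing or empty for the given image ids.

    Hypothetical helper: assumes each image lives at image_dir/<id><ext>.
    It only checks existence and file size, so a corrupt but non-empty
    file would still need to be opened with the real image loader.
    """
    problems = []
    for image_id in image_ids:
        path = os.path.join(image_dir, image_id + ext)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append(path)
    return problems
```

Running this over the ids listed in the training split file and printing the result would show whether any listed image is absent from the image directory or has a different extension than expected.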
I hope the author can answer when time permits. Thank you very much!