chenxin-dlut / TransT

Transformer Tracking (CVPR2021)
GNU General Public License v3.0

Training problem: after a few training iterations an error occurs, NaN values appear, and this causes an AssertionError! #127

Open xiaofengBian opened 2 years ago

xiaofengBian commented 2 years ago

C:\Users\bxf\anaconda3\envs\transt\python.exe C:/PyCharmProjects/TransT-main/ltr/run_training.py
Training: transt transt
WARNING: You are using tensorboardX instead sis you have a too old pytorch version.
loading annotations into memory...
Done (t=13.20s)
creating index...
index created!
number of params: 23016006
No matching checkpoint file found
[train: 1, 1 / 1000] FPS: 0.0 (0.0) , Loss/total: 12.99988 , Loss/ce: 0.69430 , Loss/bbox: 0.97997 , Loss/giou: 1.15687 , iou: 0.03106
[train: 1, 2 / 1000] FPS: 0.0 (5.1) , Loss/total: 13.18990 , Loss/ce: 0.67882 , Loss/bbox: 1.01086 , Loss/giou: 1.23913 , iou: 0.01553
[train: 1, 3 / 1000] FPS: 0.0 (5.1) , Loss/total: 13.00681 , Loss/ce: 0.69773 , Loss/bbox: 0.93112 , Loss/giou: 1.26818 , iou: 0.01083
[train: 1, 4 / 1000] FPS: 0.0 (5.3) , Loss/total: 12.93164 , Loss/ce: 0.69913 , Loss/bbox: 0.91258 , Loss/giou: 1.27109 , iou: 0.01094
[train: 1, 5 / 1000] FPS: 0.0 (4.9) , Loss/total: 12.94410 , Loss/ce: 0.69589 , Loss/bbox: 0.91288 , Loss/giou: 1.29008 , iou: 0.00936
[train: 1, 6 / 1000] FPS: 0.0 (5.1) , Loss/total: 12.90344 , Loss/ce: 0.69371 , Loss/bbox: 0.90170 , Loss/giou: 1.30679 , iou: 0.00780
Training crashed at epoch 1
Traceback for the error!
Traceback (most recent call last):
  File "C:\PyCharmProjects\TransT-main\ltr\trainers\base_trainer.py", line 70, in train
    self.train_epoch()  # Calls the train_epoch method defined in ltr/trainers/ltr_trainer.py
  File "C:\PyCharmProjects\TransT-main\ltr\trainers\ltr_trainer.py", line 79, in train_epoch
    self.cycle_dataset(loader)  # Calls the cycle_dataset method defined in this class
  File "C:\PyCharmProjects\TransT-main\ltr\trainers\ltr_trainer.py", line 60, in cycle_dataset
    loss, stats = self.actor(data)  # Jumps into ltr/actors/tracking.py
  File "C:\PyCharmProjects\TransT-main\ltr\actors\tracking.py", line 44, in __call__
    loss_dict = self.objective(outputs, targets)  # Jumps to the forward method at line 182 of ltr/models/tracking/transt.py, which computes the loss
  File "C:\Users\bxf\anaconda3\envs\transt\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 204, in forward
    losses.update(self.get_loss(loss, outputs, targets, indices, num_boxes_pos))
  File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 180, in get_loss
    return loss_map[loss](outputs, targets, indices, num_boxes)
  File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 153, in loss_boxes
    box_ops.box_cxcywh_to_xyxy(target_boxes))
  File "C:\PyCharmProjects\TransT-main\util\box_ops.py", line 52, in generalized_box_iou
    assert (boxes1[:, 2:] >= boxes1[:, :2]).all()
AssertionError
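
For readers hitting the same assertion: the failing check in `generalized_box_iou` compares box corners, and any comparison involving NaN evaluates to false. A NaN in `boxes1` therefore fails the assertion just as a box with negative width or height would. Below is a minimal pure-Python sketch of that mechanism, not code from TransT. `diagnose_box` and `passes_corner_check` are hypothetical helper names, and `box_cxcywh_to_xyxy` here is a scalar stand-in for the tensor function of the same name in the traceback.

```python
import math


def box_cxcywh_to_xyxy(box):
    """Convert one (cx, cy, w, h) box to (x0, y0, x1, y1).

    Scalar stand-in for the tensor conversion named in the traceback.
    """
    cx, cy, w, h = box
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def passes_corner_check(box_xyxy):
    """Scalar equivalent of the asserted condition: x1 >= x0 and y1 >= y0."""
    x0, y0, x1, y1 = box_xyxy
    return x1 >= x0 and y1 >= y0


def diagnose_box(box_xyxy):
    """Hypothetical helper: say why a box would fail the corner check."""
    if not all(math.isfinite(v) for v in box_xyxy):
        return "non-finite"   # NaN or inf somewhere in the box
    if not passes_corner_check(box_xyxy):
        return "degenerate"   # negative width or height
    return "ok"


if __name__ == "__main__":
    examples = {
        "valid": (0.5, 0.5, 0.2, 0.2),
        "nan": (float("nan"), 0.5, 0.2, 0.2),
        "negative width": (0.5, 0.5, -0.2, 0.2),
    }
    for name, cxcywh in examples.items():
        xyxy = box_cxcywh_to_xyxy(cxcywh)
        print(name, passes_corner_check(xyxy), diagnose_box(xyxy))
```

Distinguishing the two cases narrows the search: a "non-finite" result points to NaN produced upstream of the loss, while "degenerate" points to the box values themselves. On tensors, the analogous finiteness check would be `torch.isfinite(boxes).all()` placed before the GIoU call.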

ChenJian7578 commented 1 year ago

Has this been solved? Also, if I want to train the model myself, how should the dataset paths and format be set up?