Open clw5180 opened 5 years ago
@clw5180 Go check whether the class number in your .yml file is consistent. :)
Thanks a lot! It was a pytorch/torchvision version problem. After many attempts, torchvision=0.2.1 with pytorch=1.1 finally works. I also have a question: what does '__C.MODEL.ROI_REC_HEAD.NUMCLASSES = 99' mean? If I have 18 classes + 1 background, how should I set this parameter? Does it need to be changed to match the number of object classes in my own dataset? Thank you very much. @mjq11302010044
@clw5180, @mjq11302010044, I started training on my own dataset and it ran correctly until iteration 690, after which it produced errors like this. Have you ever encountered this? The NaN error may be related to my problem. Could you leave your email address so we can discuss it further?
I also started training on the DOTA dataset. It works normally for the first 230 iterations, but NaN appears at around iteration 230. It shows this warning:

    RuntimeWarning: invalid value encountered in greater
      overlaps[overlaps > 1.00000001] = 0.0
Solution:
1. Remove bboxes smaller than 16x16 (or, to be on the safe side, 8x8).
2. Find T.RandomRotation in the code and comment it out.
You can refer to my GitHub: https://github.com/clw5180/remote_sensing_object_detection_2019 @Baby47 @oceanleftsea
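The first step of that fix (dropping tiny boxes) could be sketched roughly as follows. This is only an illustration, assuming boxes are stored as [x_ctr, y_ctr, width, height, angle, class] as described elsewhere in this thread. The function name `filter_small_boxes` and the `min_side` parameter are made up for this example and do not come from the repository. It also assumes "smaller than 16x16" means that either side is below the threshold.

```python
def filter_small_boxes(boxes, min_side=16):
    """Keep only boxes whose width and height are both at least min_side pixels.

    Each box is assumed to be [x_ctr, y_ctr, width, height, angle, class].
    """
    kept = []
    for box in boxes:
        width, height = box[2], box[3]
        if width >= min_side and height >= min_side:
            kept.append(box)
    return kept


# Hypothetical annotations, only for demonstration.
boxes = [
    [100.0, 120.0, 40.0, 20.0, -15.0, 7],  # large enough for both thresholds
    [300.0, 310.0, 6.0, 30.0, 0.0, 6],     # width below 8, always dropped
    [50.0, 60.0, 10.0, 12.0, 30.0, 8],     # kept only with the 8-pixel threshold
]

print(len(filter_small_boxes(boxes)))              # 1
print(len(filter_small_boxes(boxes, min_side=8)))  # 2
```

Running a filter like this once when the annotations are loaded, before any augmentation, keeps the training loop itself unchanged.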
Thank you! Installing the right versions of pytorch and torchvision really is important!
The details are as follows:

    loss: nan (nan) loss_classifier: 0.2950 (1.0475) loss_box_reg: 0.0002 (0.0087) loss_objectness: nan (nan) loss_rpn_box_reg: nan (nan)

I would be very grateful for any pointers! @mjq11302010044
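Below is the bbox format I use:

    boxes.append([x_ctr, y_ctr, width, height, angle, words])

The fields are the x center, y center, width, height, angle, and class. I am not sure whether this is the expected coordinate format.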
The im_info format:

    im_info = {
        'gt_classes': gt_classes,
        'max_classes': max_classes,
        'image': im_path,
        'boxes': gt_boxes,
        'flipped': False,
        'gt_overlaps': overlaps,
        'seg_areas': seg_areas,
        'height': im.shape[0],
        'width': im.shape[1],
        'max_overlaps': max_overlaps,
        'rotated': True
    }
And the DOTA classes:

    cls_list = {
        'background': 0,
        'roundabout': 1,
        'tennis-court': 2,
        'swimming-pool': 3,
        'storage-tank': 4,
        'soccer-ball-field': 5,
        'small-vehicle': 6,
        'ship': 7,
        'plane': 8,
        'large-vehicle': 9,
        'helicopter': 10,
        'harbor': 11,
        'ground-track-field': 12,
        'bridge': 13,
        'basketball-court': 14,
        'baseball-diamond': 15,
        'helipad': 16,
        'airport': 17,
        'container-crane': 18
    }
    DATASET = {
        'IC13': get_ICDAR2013,
        'IC15': get_ICDAR2015_RRC_PICK_TRAIN,
        'IC17mlt': get_ICDAR2017_mlt,
        'LSVT': get_ICDAR_LSVT_full,
        'ArT': get_ICDAR_ArT,
        'ReCTs': get_ICDAR_ReCTs_full,
        'DOTA': get_DOTA,  # clw modify
    }

    _DEBUG = False

    class RotationDataset(torch.utils.data.Dataset):
        CLASSES = (
            "__background__ ",  # "background",
            "roundabout",
            "tennis-court",
            "swimming-pool",
            "storage-tank",
            "soccer-ball-field",
            "small-vehicle",
            "ship",
            "plane",
            "large-vehicle",
            "helicopter",
            "harbor",
            "ground-track-field",
            "bridge",
            "basketball-court",
            "baseball-diamond",
            "helipad",
            "airport",
            "container-crane"
        )
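Since the advice in this thread is to check that the class number is consistent, a quick sanity check along the following lines might help. It is only a sketch: `check_class_consistency` is a hypothetical helper, not part of the repository, and it simply compares the two class definitions posted above. The background entry is skipped because it is named differently in the two places.

```python
CLASSES = (
    "__background__ ", "roundabout", "tennis-court", "swimming-pool",
    "storage-tank", "soccer-ball-field", "small-vehicle", "ship", "plane",
    "large-vehicle", "helicopter", "harbor", "ground-track-field", "bridge",
    "basketball-court", "baseball-diamond", "helipad", "airport",
    "container-crane",
)

cls_list = {
    'background': 0, 'roundabout': 1, 'tennis-court': 2, 'swimming-pool': 3,
    'storage-tank': 4, 'soccer-ball-field': 5, 'small-vehicle': 6, 'ship': 7,
    'plane': 8, 'large-vehicle': 9, 'helicopter': 10, 'harbor': 11,
    'ground-track-field': 12, 'bridge': 13, 'basketball-court': 14,
    'baseball-diamond': 15, 'helipad': 16, 'airport': 17,
    'container-crane': 18,
}


def check_class_consistency(classes, name_to_id):
    """Return a list of human-readable mismatches between the two definitions."""
    problems = []
    if len(classes) != len(name_to_id):
        problems.append(
            "length mismatch: %d vs %d" % (len(classes), len(name_to_id))
        )
    for name, idx in name_to_id.items():
        if idx == 0:
            continue  # background is spelled differently in the two places
        if idx >= len(classes) or classes[idx] != name:
            problems.append("class %r has id %d but CLASSES disagrees" % (name, idx))
    return problems


print(check_class_consistency(CLASSES, cls_list))  # []
print(len(CLASSES))  # 19, i.e. 18 object classes + 1 background
```

Whatever class count is set in the .yml file would then need to agree with `len(CLASSES)`, which is 19 for the lists posted here.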
Any warnings during your training? I encountered the same issue, along with a warning that I should change my torch.uint8 type to torch.bool for indexing. I changed it in the add_visibility_to function, and the NaN loss went away as well.
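The kind of change described in that comment might look roughly like the sketch below. It does not reproduce the repository's function: `as_bool_mask` is a hypothetical helper, and the tensors are made-up data. It only illustrates converting a legacy torch.uint8 mask to torch.bool before using it as an index, and it assumes a PyTorch version in which torch.bool indexing is supported.

```python
import torch


def as_bool_mask(mask):
    """Return the mask as torch.bool so that it can be used for indexing."""
    if mask.dtype == torch.bool:
        return mask
    return mask.to(torch.bool)


# Made-up data, only for demonstration.
values = torch.tensor([10.0, 20.0, 30.0, 40.0])
legacy_mask = torch.tensor([1, 0, 1, 0], dtype=torch.uint8)

selected = values[as_bool_mask(legacy_mask)]
print(selected.tolist())  # [10.0, 30.0]
```

Creating the mask as torch.bool in the first place avoids the conversion altogether, for example by building it from a comparison such as `scores > 0`.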
I also encountered this problem. I changed my pytorch and torchvision versions to match yours, but I still have the problem. Do you have any other suggestions? Thanks a lot.
I ran into the same problem and found that it happens because the targets are too small for the default min_size of 800. I finally solved it by changing _C.INPUT.MIN_SIZE_TRAIN in maskrcnn_benchmark/config/defaults.py.
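One possible reading of that comment is that large images are downscaled so much during training that small targets shrink to just a few pixels. The arithmetic can be sketched as below. This is a simplified approximation of min/max-size resizing rather than the library's exact code. The 4000x4000 image, the max_size of 1333, and the alternative values 2000 and 3333 are assumptions chosen only for illustration.

```python
def resize_scale(img_w, img_h, min_size=800, max_size=1333):
    """Approximate scale factor when the short side is resized to min_size,
    capped so that the long side does not exceed max_size."""
    short_side = min(img_w, img_h)
    long_side = max(img_w, img_h)
    return min(min_size / short_side, max_size / long_side)


def box_side_after_resize(side_px, img_w, img_h, min_size=800, max_size=1333):
    """Approximate length in pixels of a box side after the image is resized."""
    return side_px * resize_scale(img_w, img_h, min_size, max_size)


# A 40-pixel object in a hypothetical 4000x4000 image.
print(round(box_side_after_resize(40, 4000, 4000), 2))              # 8.0
print(round(box_side_after_resize(40, 4000, 4000, 2000, 3333), 2))  # 20.0
```

Under these assumptions, a 40-pixel target ends up about 8 pixels wide with a min_size of 800, which is in the range of box sizes that other comments in this thread suggest removing.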