Open fatemeh-slh opened 6 years ago
Not exactly the same case as yours, but in my case the 'Loss is NaN' problem was solved by lowering the learning rate (e.g. 0.001 --> 0.0001).
Could you teach me how to make a custom COCO-like dataset? Thanks!
I finally figured out that if I use the end-to-end models, I don't need proposal files anymore. To produce the COCO-like format, I converted the binary masks to polygons and then used the format described at http://cocodataset.org/#download. I don't have any problems with detection tasks on my own data, but I still get this error when I try to train Mask RCNN:
I0201 18:37:39.300101 56408 context_gpu.cu:325] Total: 2638 MB
I0201 18:37:39.324462 56409 context_gpu.cu:321] GPU 0: 2767 MB
I0201 18:37:39.324494 56409 context_gpu.cu:325] Total: 2767 MB
I0201 18:37:39.354914 56410 context_gpu.cu:321] GPU 0: 2911 MB
I0201 18:37:39.354959 56410 context_gpu.cu:325] Total: 2911 MB
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at blob.h:94] IsType<T>(). wrong type for the Blob instance. Blob contains nullptr (uninitialized) while caller expects caffe2::Tensor<caffe2::CUDAContext> .
Offending Blob name: gpu_0/_[mask]_fcn1_w.
Error from operator:
input: "gpu_0/_[mask]_roi_feat" input: "gpu_0/_[mask]_fcn1_w" input: "gpu_0/_[mask]_fcn1_b" output: "gpu_0/_[mask]_fcn1" name: "" type: "Conv" arg { name: "kernel" i: 3 } arg { name: "exhaustive_search" i: 0 } arg { name: "pad" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "stride" i: 1 } device_option { device_type: 1 cuda_gpu_id: 0 } engine: "CUDNN"
*** Aborted at 1517470659 (unix time) try "date -d @1517470659" if you are using GNU date ***
PC: @ 0x7fc605f0c428 gsignal
*** SIGABRT (@0x8e0700000dc12) received by PID 56338 (TID 0x7fc4d3fff700) from PID 56338; stack trace: ***
@ 0x7fc6062b2390 (unknown)
@ 0x7fc605f0c428 gsignal
@ 0x7fc605f0e02a abort
@ 0x7fc60378684d __gnu_cxx::__verbose_terminate_handler()
@ 0x7fc6037846b6 (unknown)
@ 0x7fc603784701 std::terminate()
@ 0x7fc6037afd38 (unknown)
@ 0x7fc6062a86ba start_thread
@ 0x7fc605fde41d clone
@ 0x0 (unknown)
Aborted
I used the cv2.findContours function to obtain polygons from the binary masks. Do you have any idea what causes this error? Is it related to the JSON format?
Thanks.
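For anyone doing the same mask-to-polygon conversion, here is a minimal sketch of the bookkeeping around it. It assumes each contour has already been reduced to a plain list of (x, y) points (for example by reshaping a contour returned by cv2.findContours), and the helper names are hypothetical, not taken from the code used above. COCO stores a polygon as one flat list [x1, y1, x2, y2, ...], and a polygon needs at least three points to enclose any area, so degenerate contours are dropped here. Whether this has anything to do with the Blob error above is not established; it is only one thing worth ruling out.

```python
def contour_to_coco_polygon(points):
    """Flatten [(x, y), ...] into COCO's [x1, y1, x2, y2, ...] layout.

    Returns None for contours with fewer than 3 points, since they
    cannot describe a region. (Hypothetical helper for illustration.)
    """
    if len(points) < 3:
        return None
    flat = []
    for x, y in points:
        flat.extend([float(x), float(y)])
    return flat


def polygon_bbox(flat):
    """Return the COCO-style box [x, y, width, height] of a flat polygon."""
    xs = flat[0::2]
    ys = flat[1::2]
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


def polygon_area(flat):
    """Return the polygon area computed with the shoelace formula."""
    xs = flat[0::2]
    ys = flat[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2.0


# Example: a 10x10 square given as four corner points.
square = [(10, 10), (20, 10), (20, 20), (10, 20)]
square_polygon = contour_to_coco_polygon(square)
square_bbox = polygon_bbox(square_polygon)
square_area = polygon_area(square_polygon)
```

The resulting values would go into the "segmentation" (as a list of such flat polygons), "bbox" and "area" fields of one annotation entry.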
Hi, has your problem been solved? I want to ask how to make a COCO-like JSON file.
Hi,
I have encountered the same issue when training on an extended COCO-like JSON dataset. @faticom, @xuhuaren, did you fix it?
Thanks.
Hi, I have managed to create my own COCO-like JSON file. To create your own, you need to make sure the data structure in coco-like.json is the same as in COCO's file, and you need to write your own code to generate it. The original COCO JSON file is one big dict with a few keys: info (which you don't really need), licenses (I guess this is not necessary either), images, annotations and categories.
So in your coco-like.json file, make sure the top-level structure is a dict with at least 3 keys: images, annotations and categories. Each key's value is a list (for images, one entry per picture), and each element in the list is another dict.
If your JSON file has the same structure, I think you should be able to train the model with your own data @xuhuaren @topcomma
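To make the structure described above concrete, here is a minimal sketch that builds a dict with the three required keys and runs a basic consistency check on it. The field names inside each entry (id, file_name, image_id, category_id, segmentation, bbox, area, iscrowd, and so on) follow the public COCO annotation format; the function names, the sample file name and the sample category are made up for illustration and are not taken from anyone's code in this thread.

```python
import json


def build_coco_like(images, annotations, categories):
    """Assemble the top-level dict: three keys, each holding a list of dicts."""
    return {
        "images": list(images),
        "annotations": list(annotations),
        "categories": list(categories),
    }


def check_coco_like(data):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if not isinstance(data.get(key), list):
            problems.append("missing or non-list key: " + key)
    if problems:
        return problems
    image_ids = {img.get("id") for img in data["images"]}
    category_ids = {cat.get("id") for cat in data["categories"]}
    for ann in data["annotations"]:
        if ann.get("image_id") not in image_ids:
            problems.append("annotation %r has unknown image_id" % ann.get("id"))
        if ann.get("category_id") not in category_ids:
            problems.append("annotation %r has unknown category_id" % ann.get("id"))
    return problems


# Hypothetical one-image, one-object dataset.
coco = build_coco_like(
    images=[{"id": 1, "file_name": "example_0001.jpg", "width": 640, "height": 480}],
    annotations=[{
        "id": 1,
        "image_id": 1,
        "category_id": 1,
        "segmentation": [[10.0, 10.0, 20.0, 10.0, 20.0, 20.0, 10.0, 20.0]],
        "bbox": [10.0, 10.0, 10.0, 10.0],
        "area": 100.0,
        "iscrowd": 0,
    }],
    categories=[{"id": 1, "name": "object", "supercategory": "none"}],
)
problems = check_coco_like(coco)
text = json.dumps(coco)  # the string you would write to coco-like.json
```

The check only covers the cross-references between the three lists; it does not validate every field a particular training framework may expect.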
@xuhuaren @Devincool Here is my code to create a COCO-style dataset.
Hi,
I have a new dataset and I prepared COCO-like annotations for an instance-level semantic segmentation task. When I try to train Mask RCNN on my own data, I get the following error. I am wondering what TRAIN.PROPOSAL_FILES is in the config file and how I should fill it in for a new dataset.
I also tried to train on the COCO dataset and got the same error again. When I comment out PROPOSAL_FILES in the config file, it gives me this output:
What is your suggestion? Thanks.