Closed: sjwhhhi closed this issue 4 years ago
I'm facing the same error, but on Windows, and I really need help solving it.
Same here. I'm training Mask RCNN Inception on Ubuntu with Python 2.7 and TensorFlow 1.5 and get the same error.
https://github.com/tensorflow/models/issues/3972 also describes the same problem, as the assertion is seen in the trace here too.
The error does not appear if fast_rcnn is used with the same configuration file, but we are trying to get Mask RCNN working, not Fast RCNN.
@priya-dwivedi: The same error occurred for me. I am following your blog post Custom_Mask_Rcnn to train the model and got stuck on this error. Were you able to fix it? If so, please guide me through it.
I had this problem and solved it as follows.
The TFRecord files should be named pet_train/val.record. I got that by changing faces_only from True to False; see the line here: https://github.com/tensorflow/models/blob/master/research/object_detection/dataset_tools/create_pet_tf_record.py#L49
Then I regenerated the TFRecord files with this command:
python object_detection/dataset_tools/create_pet_tf_record.py \
    --label_map_path=object_detection/data/two_label_map.pbtxt \
    --data_dir=`pwd` --output_dir=`pwd` --include_masks=True
That gave me two TFRecord files named pet_train/val.record, which I then used for training with mask_rcnn_inception_v2_coco.
Hope this helps
Getting same error, any update on this?
This error is caused by the data.
InvalidArgumentError (see above for traceback): assertion failed: [] [Condition x == y did not hold element-wise:] [x (Loss/BoxClassifierLoss/assert_equal_2/x:0) = ] [0] [y (Loss/BoxClassifierLoss/assert_equal_2/y:0) = ] [1]
This line is trying to find the contour of the mask. An error here probably means the mask is not included, or at least not found.
The root cause may be in how the TFRecord was created.
You need to check create_tf_record.py manually, line by line, for errors.
My best guess would be the string constants.
import tensorflow as tf  # TensorFlow 1.x; tf.app.flags is not available in 2.x

# Command-line flags defined near the top of create_pet_tf_record.py.
flags = tf.app.flags
flags.DEFINE_string('data_dir', '', 'Root directory to raw pet dataset.')
flags.DEFINE_string('output_dir', '', 'Path to directory to output TFRecords.')
flags.DEFINE_string('label_map_path', 'data/pet_label_map.pbtxt',
                    'Path to label map proto')
flags.DEFINE_boolean('faces_only', False, 'If True, generates bounding boxes '
                     'for pet faces. Otherwise generates bounding boxes (as '
                     'well as segmentations for full pet bodies). Note that '
                     'in the latter case, the resulting files are much larger.')
flags.DEFINE_string('mask_type', 'png', 'How to represent instance '
                    'segmentation masks. Options are "png" or "numerical".')
FLAGS = flags.FLAGS
Then train with this fix applied to the config: https://github.com/tensorflow/models/pull/4462/commits/e45234e32dbc485f74567f6c0297edc9c084677c
This fix tells the trainer to read in PNG masks.
Please let me know if the fix works. The PR #4462 is still pending.
@quxiaofeng Unfortunately I've hit another issue now. I changed faces_only from True to False and got the files pet_train/val.record, and I also smoothed the images (data) to remove noise. When I try to train with them, I get something like this:
And I can also get something like this:
It's awful...
@Abduoit Could you please explain why we need several TFRecord files here (https://github.com/tensorflow/models/blob/master/research/object_detection/dataset_tools/create_pet_tf_record.py#L49), i.e. flags.DEFINE_integer('num_shards', 10, 'Number of TFRecord shards')? I used only one train.record and one val.record (my create_pet_tf_record.py doesn't include the line flags.DEFINE_integer('num_shards', 10, 'Number of TFRecord shards')). Is it crucial to have several records for train and for val (test)? P.S. faces_only = False
nonbackground_indices_x = np.any(mask_np == 1, axis=0)
nonbackground_indices_y = np.any(mask_np == 1, axis=1)
Please answer
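For anyone reading along: the two np.any lines quoted above flag which columns (axis=0) and which rows (axis=1) of the mask contain at least one pixel equal to 1, and the box extents are then the first and last flagged index. Here is a dependency-free sketch of that idea, assuming the mask is a list of rows of integer pixel values. The name mask_to_box is hypothetical and not part of create_pet_tf_record.py.

```python
def mask_to_box(mask, foreground=1):
    """Return (xmin, ymin, xmax, ymax) of pixels equal to `foreground`, or None."""
    width = len(mask[0]) if mask else 0
    # Like np.any(mask == foreground, axis=1): one flag per row.
    rows = [any(p == foreground for p in row) for row in mask]
    # Like np.any(mask == foreground, axis=0): one flag per column.
    cols = [any(row[x] == foreground for row in mask) for x in range(width)]
    ys = [i for i, hit in enumerate(rows) if hit]
    xs = [i for i, hit in enumerate(cols) if hit]
    if not xs or not ys:
        return None  # no foreground pixel at all: the mask would be empty
    return (xs[0], ys[0], xs[-1], ys[-1])
```

If this returns None for your own masks, the pixel value you compare against probably does not occur in your PNGs.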
@FreedoomFighter It is just an Out of Memory error. Your GPU does not have enough memory for this model.
@quxiaofeng Thank you. I wish you happiness
@quxiaofeng One last question, please. I've trained the model (during training I could see that both box and mask were being trained), but in the Jupyter notebook I only see the box without the mask. What is wrong? I did everything as I described above (two comments back).
If you run the evaluation correctly, you should see the masked result images.
Another possible cause is that the output you are using does not include the mask. You could inspect the output tensor, or the corresponding module in the graph, to see exactly what data comes out.
Hi,
I have some issues. First, the mask is not displayed. I use create_pet_tf_record.py (with faces_only set to False), and in train-?????-of-00010.record I have image/object/mask in all the TFRecords (I think that is good),
but when I launch training I can only see the bounding boxes and not the mask.
I have /mask/images (jpg), mask/annotations/trimaps (png), mask/annotations/xmls (xml), mask/annotation/trainval.txt and mask/test_images.
When I launch detection or eval I don't see masks. Why? And there is no mask graph.
Where is the mask graph: in the train TensorBoard or the eval TensorBoard? Can we launch TensorBoard with both train and eval data in one instance?
Please, I really need help. Two weeks of training with no mask data.
I'm going crazy because I don't see the mask in evaluation. Do we have to run a specific command to add the mask in training, evaluation or detection?
@quxiaofeng I could not see the mask in train or eval, with everything configured as advised.
@leccyril Check whether the mask information is actually being saved in the .record file. If it is, attach your create_tf_record file; I would like to look at it.
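A crude way to do that check without TensorFlow is sketched below. It relies on the TFRecord framing (8-byte little-endian length, 4-byte CRC, payload, 4-byte CRC) and on feature keys being stored as plain bytes inside each serialized example. The function names are hypothetical, the CRCs are skipped rather than verified, and it will not work on compressed files.

```python
import struct

def iter_tfrecord_payloads(path):
    """Yield the raw bytes of each record in an uncompressed TFRecord file."""
    with open(path, "rb") as f:
        while True:
            header = f.read(8)              # little-endian uint64 payload length
            if len(header) < 8:
                return
            (length,) = struct.unpack("<Q", header)
            f.read(4)                       # CRC of the length (ignored here)
            payload = f.read(length)
            f.read(4)                       # CRC of the payload (ignored here)
            if len(payload) < length:
                return                      # truncated file
            yield payload

def count_records_with_key(path, key=b"image/object/mask"):
    """Return (records_containing_key, total_records)."""
    hits = total = 0
    for payload in iter_tfrecord_payloads(path):
        total += 1
        if key in payload:
            hits += 1
    return hits, total
```

Note that finding the key only shows the feature exists; it does not prove the mask list is non-empty or correctly encoded, so treat a positive result as a first sanity check only.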
Hi, yes, I attach the document; the mask object path is in it! num_shards is 10, so I attach only the first files.
I don't know what to do. I don't want to change the approach to make it work (matterplot or coco dataset)...
Check whether your config file is correct. To get masks you have to use this. I also asked for the script that you used to create the .record files.
OK, I will do this. I use exactly the same file: sauv.zip
It contains sample xml/png/jpg files.
I really want to thank you because I am lost...
I pulled the new TensorFlow models layout, with eval and train in the legacy folder, and tried with the pet dataset sample, and the problem still occurs. No mask is displayed!
This is a great tool; I think it is just missing one configuration or one parameter...
One detail: the installation is on Debian 9 with pip3, Python 3.5, CPU only.
In your mask_rcnn_inception_v2.config file there is one line missing: 'number_of_stages: 3'. This line helps in processing the mask. Check the link.
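To make this kind of omission easier to spot, here is a small sketch that scans the text of a pipeline config for mask-related settings. number_of_stages: 3 comes from this thread; load_instance_masks: true and mask_type: PNG_MASKS are my assumptions, based on the sample Mask R-CNN configs and on the PNG-mask fix mentioned earlier, so compare against a known-good sample config before relying on the list. The names EXPECTED and missing_mask_settings are hypothetical.

```python
import re

# Assumed list of settings; only number_of_stages is confirmed in this thread.
EXPECTED = {
    "number_of_stages": "3",
    "load_instance_masks": "true",
    "mask_type": "PNG_MASKS",
}

def missing_mask_settings(config_text):
    """Return the expected settings that never appear as `key: value` on a line."""
    missing = []
    for key, value in EXPECTED.items():
        pattern = r"^\s*" + re.escape(key) + r"\s*:\s*" + re.escape(value) + r"\b"
        if not re.search(pattern, config_text, flags=re.MULTILINE):
            missing.append(key)
    return missing
```

It only checks that each setting occurs at least once, uncommented, somewhere in the file; it does not check that the setting sits in the right block (for example in both the train and eval input readers).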
I will try it now and tell you later today. Do you think this configuration alone can make the difference?
Do you know what the difference is between having only one record file (1 shard) and having 10 (10 shards)?
Thank you very much for your time.
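On the shard question, my understanding is that sharding only splits the same examples across several files, so training on one file or on ten should see the same data. Smaller files are easier to handle and can be read in parallel. A minimal sketch of the idea, assuming round-robin assignment and a name-NNNNN-of-NNNNN file pattern (both are assumptions about the script, and the function names are hypothetical):

```python
def shard_paths(base_path, num_shards):
    """File names like base-00000-of-00010, one per shard."""
    return ["{}-{:05d}-of-{:05d}".format(base_path, i, num_shards)
            for i in range(num_shards)]

def assign_to_shards(examples, num_shards):
    """Round-robin split: example i goes to shard i % num_shards."""
    shards = [[] for _ in range(num_shards)]
    for i, example in enumerate(examples):
        shards[i % num_shards].append(example)
    return shards
```

Every example lands in exactly one shard, which is why the input path in the config can point at all shards with a pattern such as the ????? wildcard seen earlier in this thread.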
Wonderful, I can see the mask in the first evaluation step. How did I miss this configuration?
Hats off!
The mask is green; I created the mask in yellow, and only as an outline, because I only want to see the outline of the shape. Do you know how I can do that? Isn't it automatic when we add the mask we created before?
Thank you very much, I spent several days making this work...
I don't see the mask loss graph in the eval TensorBoard. Is that normal? Thanks.
No, it's not normal. You should get something in the mask loss graph.
How do I combine train data and eval data in TensorBoard? It is strange: I see the bounding boxes filled completely by the mask, but not the PNG I specified as the mask, and in TensorBoard there is no mask_loss when I launch the eval.py script. Any idea? My files and configuration seem to be OK? Was just the stage_evaluation missing?
I think a problem remains, because on top of that there is no mask loss... the bounding boxes are completely filled with the green color, which is not really a mask.
OK, it works, I see the loss... but the mask entirely covers the bounding box.
@leccyril You may need to refer to the format of mask under the Oxford IIIT Pet data set
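To expand on that a little: as far as I recall, the Oxford-IIIT Pet trimaps are single-channel PNGs whose pixel values are class indices (1 = pet, 2 = background, 3 = border), not colours, and the conversion treats every non-background pixel as part of the mask. If that is right, a PNG drawn with other values (say, a yellow outline on some canvas) has no pixel equal to 2, so the whole crop becomes foreground and the mask fills the box. A minimal sketch under that assumption, with hypothetical names:

```python
BACKGROUND = 2  # assumed trimap value for background pixels

def to_binary_mask(trimap):
    """Map every non-background pixel to 1 and background pixels to 0."""
    return [[0 if p == BACKGROUND else 1 for p in row] for row in trimap]

def foreground_fraction(trimap):
    """Fraction of pixels that end up as foreground after the remapping."""
    binary = to_binary_mask(trimap)
    total = sum(len(row) for row in binary)
    return sum(map(sum, binary)) / total if total else 0.0
```

A fraction of 1.0 on your own PNGs would be consistent with the "mask covers the whole bounding box" symptom; in that case the masks need to be re-encoded with the same pixel values the pet trimaps use.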
Hi There, We are checking to see if you still need help on this, as this seems to be considerably old issue. Please update this issue with the latest information, code snippet to reproduce your issue and error you are seeing. If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing this.
I'm getting the same error: INFO:tensorflow:Error reported to Coordinator: assertion failed: [Condition x == y did not hold element-wise:] [x (Loss/BoxClassifierLoss/assert_equal_3/x:0) = ] [1067 800] [y (Loss/BoxClassifierLoss/assert_equal_3/y:0) = ] [800 1067]
I want to train a Mask RCNN model on my personal dataset. I use create_pascal_tf_record.py to convert it to TFRecord format. However, training fails with this error.
My tensorflow-gpu version is 1.5 on Ubuntu 16. Could anyone help me? Thanks.
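One observation on this log, offered as a guess from the numbers alone: the two compared values are [1067 800] and [800 1067], which looks like height and width being swapped between the image and its mask. That can happen when a size is read as (width, height), as PIL's Image.size reports it, but stored as (height, width), as a NumPy array shape would be. A small pre-flight check along those lines, with a hypothetical function name:

```python
def check_mask_shape(image_hw, mask_hw):
    """Classify how a mask's (height, width) relates to its image's (height, width)."""
    image_hw, mask_hw = tuple(image_hw), tuple(mask_hw)
    if mask_hw == image_hw:
        return "ok"
    if mask_hw == image_hw[::-1]:
        return "transposed"  # height and width are swapped
    return "mismatch"
```

Running something like this over every image/mask pair before writing the TFRecords would show whether the swap is in the data itself or introduced by the conversion script.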