Open canerozer opened 6 years ago
I have been working on it for a few days, but I keep getting NaN and zero-valued losses. I might have a bug in my ResNeXt implementation.
I am currently trying to transfer the weights of the FAIR version of Mask R-CNN. If I succeed, I may be able to provide a pretrained version of these weights.
The model file for ResNet-101 is here: https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
Implementation done on my side (ResNeXt-50 with cardinality = 32, ResNeXt-101 with cardinality = 32). Testing it now. I will update you if it works.
@ericj974 any updates? I successfully trained the model using VGG16 and VGG19 backbones.
@BugaDM what were the accuracy, training time and inference time of VGG16 and VGG19 compared to the original?
@valikund I just did a small training on my own dataset to check if the backbone was working. I cannot give you any reliable metric, but VGG16 and VGG19 results were worse in terms of accuracy, mAP... The training time was better with both VGGs than with ResNet101. Inference time was the same (around 0.2 seconds with images of size 1024x1024).
I'm trying to implement some more backbones and carry out some full trainings with all of them. I'll update you when I finish.
@ericj974 I tested your model with `coco_weights` and with ImageNet weights, but I got this error:
`ValueError: Layer #9 (named "res2a_branch2b") expects 0 weight(s), but the saved weights have 2 element(s)`
Could you tell me how you configured it for testing? Thanks a lot.
@ericj974: Thanks for your code. Could you tell me how ResNeXt performs in comparison with ResNet-101 and ResNet-50?
@BugaDM, what about smaller backbones like MobileNet? They should give faster inference, at the cost of accuracy/mAP.
@ericj974 How did you deal with the weights? I got errors similar to the ones @chenyuZha got. Any progress?
@paulcx @chenyuZha sorry for my late reply. Actually you cannot load the h5 file (mask_rcnn_coco.h5) since it assumes a standard resnet and not resnext as backend encoder. You will have to do the training from scratch.
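To make the mismatch described above concrete, here is a minimal sketch of a pre-check that compares same-named layers in a model and a checkpoint before loading. It is plain Python over dictionaries; the helper name `find_incompatible_layers` and the example shapes are hypothetical and only illustrate the kind of conflict reported for `res2a_branch2b`. In a real Keras model, dictionaries like these could presumably be built from each layer's name and the shapes returned by `layer.get_weights()`.

```python
def find_incompatible_layers(model_layers, saved_layers):
    """Return (name, expected_shapes, saved_shapes) for layers that share a
    name but do not hold the same list of weight shapes."""
    problems = []
    for name, expected in model_layers.items():
        saved = saved_layers.get(name)
        if saved is None:
            # Layer is absent from the checkpoint, so there is nothing to clash with.
            continue
        if list(expected) != list(saved):
            problems.append((name, list(expected), list(saved)))
    return problems


# Illustrative shapes only (assumption): a ResNeXt-style model whose layer
# "res2a_branch2b" holds no weights, and a ResNet checkpoint whose layer of
# the same name holds a kernel and a bias.
resnext_model = {
    "conv1": [(7, 7, 3, 64), (64,)],
    "res2a_branch2b": [],
}
resnet_checkpoint = {
    "conv1": [(7, 7, 3, 64), (64,)],
    "res2a_branch2b": [(3, 3, 64, 64), (64,)],
}

for name, expected, saved in find_incompatible_layers(resnext_model, resnet_checkpoint):
    print(f"{name}: model expects {len(expected)} weight(s), "
          f"checkpoint has {len(saved)} element(s)")
```

Running a check like this before loading at least shows which layers clash, instead of stopping at the first `ValueError`.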
@ericj974: What performance do you get with ResNeXt? How do you train from scratch? I tried the command `python3 coco.py train --dataset=/path/to/coco/` (with `--model=coco` deleted), but it raised an error:
if args.model.lower() == "last":
AttributeError: 'NoneType' object has no attribute 'lower'
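The traceback above shows `.lower()` being called on a `--model` value that is `None` because the flag was omitted. Below is a minimal sketch of how a script could guard against that and treat a missing `--model` as "train from scratch". This is not the repository's `coco.py`; `build_parser` and `resolve_weights` are hypothetical names, and the set of keywords is an assumption based on the values mentioned in this thread.

```python
import argparse


def build_parser():
    # Hypothetical command-line interface, loosely modelled on the command quoted above.
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("command", choices=["train", "evaluate"])
    parser.add_argument("--dataset", required=True, help="Path to the dataset")
    parser.add_argument("--model", default=None,
                        help="'coco', 'last', 'imagenet', a path to an .h5 file, "
                             "or omit to train from scratch")
    return parser


def resolve_weights(model_arg):
    """Map the --model value to a weights choice; None means random initialization."""
    if model_arg is None:
        return None  # nothing to load: train from scratch
    key = model_arg.lower()
    if key in ("coco", "last", "imagenet"):
        return key
    return model_arg  # treat anything else as a file path


# Same arguments as in the failing command: no --model given.
args = build_parser().parse_args(["train", "--dataset=/path/to/coco/"])
weights = resolve_weights(args.model)
print("training from scratch" if weights is None else f"loading weights: {weights}")
```

The only change that matters is the `None` check before `.lower()`; everything else is scaffolding to make the sketch runnable.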
@ericj974 Have you tried the other backbone models like inception resnet v2?
Haha, I actually did my first run with the MobileNet backbone right before @23pointsNorth mentioned it. I can confirm that it works both from scratch and with ImageNet-pretrained MobileNet weights, though I am still trying to get the quality of predictions on par with the ResNet-50 backbone.
Hi @Cpruce, I am trying to use the MobileNet backbone as well. It includes a layer called depthwise separable convolution. Keras offers a `SeparableConv2D` layer. Did you just use the `SeparableConv2D` layer, or did you add anything else while modelling the backbone?
Hi @gsujansai
I saw the `SeparableConv2D` layer in Keras but didn't try it out, since I pulled the MobileNet from the Keras models. That one creates its own `_depthwise_conv_block`, which contains the pointwise convolution at the end of each block.
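For readers unfamiliar with the depthwise-plus-pointwise structure discussed above, the following sketch compares weight counts for a standard convolution and a depthwise separable one. It is plain arithmetic that ignores biases and batch normalization; the function names and the 256-channel example are illustrative assumptions, not code from any of the backbones mentioned here.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes the channels
    return depthwise + pointwise


standard = conv_params(3, 256, 256)
separable = separable_conv_params(3, 256, 256)
print(standard, separable, round(standard / separable, 1))
```

For a 3x3 layer with 256 input and output channels this gives 589,824 weights against 67,840, roughly a 8.7x reduction, which is where most of MobileNet's savings come from.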
@waleedka I have a memory leak but besides that my new backbone works. Shall I open a pull request?
@Cpruce Thank you
For everyone interested, check out my pull request https://github.com/matterport/Mask_RCNN/pull/306
Let me know if you have any improvements and feel free to contribute!
@Cpruce Hello, I saw your implementation of Mask R-CNN with a MobileNet 224 backbone, very interesting! Do you think it's possible to integrate the model into mobile devices for inference (as is usually done with MobileNet)?
@chenyuZha :D yes definitely! I have (somewhat) gotten results on my mobile phone, though it is still pretty slow. You will face a few obstacles to overcome but there is much to do after getting the model to load. Let me know if you start going down this road :)
@Cpruce I actually implemented a very similar MobileNet 224 backbone; it appears to be the same as the one I saw in your repository. Did you try training it on lower-resolution images, perhaps (224, 224, 3)? If you train on lower resolution and run inference on lower resolution, it will drastically increase the speed. However, I am still getting NaN errors in `rpn_bbox_loss` when trying to do so.
@JonathanCMitchell Cool! nope I haven't tried lower resolution yet. I've still got a few tricks I want to try before, since the average precision and recall still aren't as good. Can you show evaluation results and images with the instance segmentation?
I am still trying to get rid of the NaN errors in `rpn_bbox_loss` when lowering the image dimensions. I have a thread here: #321
@JonathanCMitchell moving our conversation to your thread
@Cpruce When you test on your phone, do you integrate your *.pb with Android Studio or Xcode? (I guess that you have to convert the .h5 to .pb.) I tried integrating a CNN model trained with Inception v3 into Android Studio and it worked well, but when I tried to use my own custom .pb model, it no longer worked. I guess I need to modify something in Android Studio, like the input tensor or output tensor? Any ideas would be very much appreciated!
Yup! Could you post the summary of your model? You'll get a message telling you all the arguments to use. However, note that the input_layer_shape that it tells you may be wrong if you have multiple inputs
@Cpruce Sorry for replying so late. I have since tested with TF Detect and TF Classifier, which finally worked (I had made a stupid mistake before). However, for TF Detect I could only use a pre-trained model like MobileNetV1 (because it seems very tricky to modify the input or output nodes in the Java code). I really want to test something new (like DeepLab V3 + MobileNet V2), but so far I haven't found any information on how to implement the script. Since you are working on the MobileNet version of Mask R-CNN, have you already tested this model in Android Studio? Maybe you have more experience with Android Studio.
@ericj974 I have a problem with the ResNeXt network. I have already trained a model with ResNeXt and converted the .h5 to .pb successfully (with your file export_model.py). But when I tried to run inference with this .pb, I got a `tf.py_func` error:
UnknownError (see above for traceback): KeyError: 'pyfunc_0' [[Node: mrcnn_detection/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], Tout=[DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"](ROI/packed_2/_67, mrcnn_class/Reshape_1/_69, mrcnn_bbox/Reshape/_71, _arg_input_image_meta_0_1)]]
It seems that when we freeze the graph, `tf.py_func` causes problems with certain operations. Could you tell me whether you have the same issue? How did you solve it? Thanks for your reply!
@chenyuZha Yes, I have loaded the model on my phone/Android Studio and have gotten it to run inference. However, predictions are still too slow and I need to work on how I feed/preprocess the image. For your second problem, could you try with the latest master? The last instance of `py_func` was removed in my first pull request https://github.com/matterport/Mask_RCNN/pull/167
@Cpruce Yes, I updated the script and now I can run inference! Thanks for your help! For the MobileNet Mask R-CNN, have you tried reducing the number of bounding boxes in the second stage? Maybe it will speed up inference.
@chenyuZha Awesome 😄 Nope I haven't tried reducing the number of proposals/detections but I'm sure it will speed up inference. How much is another story though. Have you tested my fork by any chance? You can also test your hypothesis via the master repo
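As a rough illustration of the idea discussed above, this sketch lowers the two settings that bound how many boxes reach the second stage. The attribute names `POST_NMS_ROIS_INFERENCE` and `DETECTION_MAX_INSTANCES` follow the repository's `Config` class; the base class here is only a stand-in so the snippet runs on its own, and all numeric values are placeholders rather than recommendations. In practice one would subclass the repository's own `Config` instead.

```python
class BaseInferenceConfig:
    """Stand-in for the repository's Config class (assumption: placeholder values)."""
    POST_NMS_ROIS_INFERENCE = 1000  # proposals kept after NMS at inference time
    DETECTION_MAX_INSTANCES = 100   # upper bound on final detections per image


class FewerProposalsConfig(BaseInferenceConfig):
    """Hypothetical override that passes fewer boxes to the second stage."""
    POST_NMS_ROIS_INFERENCE = 300
    DETECTION_MAX_INSTANCES = 50


def roi_fraction(base, reduced):
    """Fraction of proposals the reduced config keeps relative to the base config."""
    return reduced.POST_NMS_ROIS_INFERENCE / base.POST_NMS_ROIS_INFERENCE


print(roi_fraction(BaseInferenceConfig, FewerProposalsConfig))
```

The classification and box heads run once per proposal, so their cost should shrink roughly with this fraction, while the backbone and RPN cost stays the same. How much end-to-end time this saves would have to be measured.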
@Cpruce Yes I'll test your model, by the way for the part of Android studio, I should do some modifications so that I could load the masks I guess?
@chenyuZha cool let me know how it goes and yes, there should be extra pre and post processing steps for Android. Are you using tensorflow mobile?
@Cpruce Yes, I did the whole installation following this tutorial: https://www.tensorflow.org/mobile/android_build Then, for my case, I modified the graph of the demo app TF Detect.
@ericj974 Have you tested ResNeXt-101 with cardinality = 32? I downloaded your code 2 months ago (the file model.py is no longer available in your git repository). When I re-checked the function `resnet_graph` in my copy, I found that you didn't change the number of filters in each stage (it remains the same as in ResNet-50). For example:
```python
x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1), cardinality=cardinality)
x = identity_block(x, 3, [64, 64, 256], stage=2, block='b', cardinality=cardinality)
C2 = x = identity_block(x, 3, [64, 64, 256], stage=2, block='c', cardinality=cardinality)
```
But in the paper, it's something like this:
```python
x = conv_block(x, 3, [128, 128, 256], stage=2, block='a', strides=(1, 1), cardinality=cardinality)
x = identity_block(x, 3, [128, 128, 256], stage=2, block='b', cardinality=cardinality)
C2 = x = identity_block(x, 3, [128, 128, 256], stage=2, block='c', cardinality=cardinality)
```
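The filter counts quoted from the paper follow a simple pattern for the 32x4d configuration: the bottleneck width is cardinality × per-group width, doubling at each stage. The sketch below computes those widths and compares the weight count of one stage-2 bottleneck in both designs. The helper names are hypothetical, the counts ignore biases and batch normalization, and the grouped 3x3 convolution is assumed to split its input channels evenly across groups.

```python
def resnext_stage_filters(stage, cardinality=32, base_width=4):
    """Filter triple [1x1, 3x3, 1x1] for stages 2..5 of a ResNeXt bottleneck."""
    width = cardinality * base_width * 2 ** (stage - 2)
    return [width, width, 2 * width]


def bottleneck_params(c_in, filters, groups=1):
    """Weights in a 1x1 -> 3x3 (optionally grouped) -> 1x1 bottleneck."""
    f1, f2, f3 = filters
    reduce = c_in * f1                       # 1x1 reduction
    spatial = 3 * 3 * (f1 // groups) * f2    # each 3x3 filter sees f1/groups channels
    expand = f2 * f3                         # 1x1 expansion
    return reduce + spatial + expand


for stage in range(2, 6):
    print(stage, resnext_stage_filters(stage))

resnet = bottleneck_params(256, [64, 64, 256])
resnext = bottleneck_params(256, resnext_stage_filters(2), groups=32)
print(resnet, resnext)
```

With these assumptions the two blocks land at 69,632 and 70,144 weights, which is why the wider ResNeXt filters do not blow up the model size as long as the 3x3 convolution is actually grouped.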
Please, how do I train from scratch?
Hi, has anyone succeeded in implementing a ResNeXt-101 backbone in the Matterport Mask R-CNN implementation? I think that with this backbone we could get better results than with ResNet-101. I am a newbie in the computer vision world and I don't have the strength to implement a backbone from scratch.
Hello @ericj974, could you please share your implementation of the backbone change from ResNet to ResNeXt? I would also like to know how much performance improved after changing the backbone.
@valikund I just did a small training on my own dataset to check if the backbone was working. I cannot give you any reliable metric, but VGG16 and VGG19 results were worse in terms of accuracy, mAP... The training time was better with both VGGs than with ResNet101. Inference time was the same (around 0.2 seconds with images of size 1024x1024).
I'm trying to implement some more backbones and carry out some full trainings with all of them. I'll update you when I finish.
Hey!!!
Have you uploaded these backbones to your Github?
I've added you on LinkedIn; I would appreciate it if you accepted the invitation.
Thanks!
Hello @ericj974, could you please share your implementation of the backbone change from ResNet to ResNeXt? I would also like to know how much performance improved after changing the backbone.
Have you done that?
Has anyone implemented the ResNeXt backboned version of Mask R-CNN and tested the results?