yinghdb / EmbedMask

EmbedMask: Embedding Coupling for One-stage Instance Segmentation
MIT License

Performance #5

Open trungpham2606 opened 4 years ago

trungpham2606 commented 4 years ago

@yinghdb

While training your great work on my new dataset, I noticed that the detection part is not as good as FCOS (I am sure of this because I also trained FCOS with the same structure as yours on that dataset).

Based on that, I have two questions:

  1. Which parameters should I change to get better performance from EmbedMask (both detection and mask)?
  2. I tried training FCOS on my dataset first, then loaded it into the EmbedMask structure, froze all of those parameters, loaded your pretrained EmbedMask weights into the rest of the network's parameters, and trained only those. The very first results I got are much better than with the first method, but it is quite time-consuming.
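The second approach can be sketched in plain Python as a merge of two checkpoints. This is only an illustration of the idea, not code from the EmbedMask repository: the function name `merge_state_dicts` and all parameter keys are hypothetical, and checkpoints are modelled as ordinary dictionaries.

```python
def merge_state_dicts(fcos_state, embedmask_state, model_keys):
    """Build an initial state for an EmbedMask-style model.

    Keys found in the FCOS checkpoint are taken from it and marked frozen.
    Remaining keys fall back to the EmbedMask checkpoint and stay trainable.
    Keys found in neither checkpoint are reported as missing.
    """
    merged, frozen, trainable, missing = {}, [], [], []
    for key in model_keys:
        if key in fcos_state:
            merged[key] = fcos_state[key]
            frozen.append(key)
        elif key in embedmask_state:
            merged[key] = embedmask_state[key]
            trainable.append(key)
        else:
            missing.append(key)  # would keep its random initialisation
    return merged, frozen, trainable, missing


# Toy checkpoints with made-up keys and scalar "weights".
fcos = {"backbone.w": 1, "head.cls.w": 2}
embedmask = {"backbone.w": 9, "head.embed.w": 3}
keys = ["backbone.w", "head.cls.w", "head.embed.w", "head.margin.w"]

merged, frozen, trainable, missing = merge_state_dicts(fcos, embedmask, keys)
print(frozen)     # parameters to exclude from the optimizer
print(trainable)  # parameters to fine-tune
print(missing)    # parameters covered by neither checkpoint
```

In a PyTorch model, the frozen list would typically be applied by setting `requires_grad = False` on the matching parameters before building the optimizer.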

I would appreciate any advice you can give me on my training!

yinghdb commented 4 years ago

Indeed, the detection performance of EmbedMask is not as good as that of FCOS. The main difference in training concerns the positive samples used to predict the class and bounding box of each ground-truth object: EmbedMask requires them to lie inside the object's mask, while FCOS does not. Dropping this requirement lets EmbedMask achieve better detection results (the same as FCOS), but worse mask results.
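A minimal sketch of the two sampling rules, assuming the FCOS-style rule is simply "the location falls inside the ground-truth box" (the real implementations have further details such as per-level size ranges). The function `is_positive` and the flag `require_in_mask` are hypothetical names used only for illustration.

```python
def is_positive(point, box, mask, require_in_mask):
    """Return True if a location counts as a positive sample for one object.

    point: (x, y) integer image coordinates of the location
    box:   (x0, y0, x1, y1) ground-truth box, inclusive bounds
    mask:  2D list of 0/1 values indexed as mask[y][x]
    """
    x, y = point
    x0, y0, x1, y1 = box
    in_box = x0 <= x <= x1 and y0 <= y <= y1
    if not require_in_mask:
        return in_box                     # box-only rule
    return in_box and mask[y][x] == 1     # location must also lie on the mask


# Toy object: a triangular mask that fills only part of its 4x4 box.
mask = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]
box = (0, 0, 3, 3)
points = [(x, y) for y in range(4) for x in range(4)]

box_positives = sum(is_positive(p, box, mask, False) for p in points)
mask_positives = sum(is_positive(p, box, mask, True) for p in points)
print(box_positives, mask_positives)  # 16 10
```

In this toy case the mask rule discards 6 of the 16 candidate locations, which illustrates why the detection branch sees fewer positives for thin or irregular objects.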

So there is a conflict in positive sampling in the detection part. I have not figured out a better solution to this problem so far; it is a point that could be improved. For example, you could try defining a set of ignored samples in addition to the positive and negative samples.
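One possible reading of the "ignored samples" idea, written as a sketch: locations inside the box but off the mask are neither positive nor negative and would be excluded from the classification loss. This interpretation, the label values, and the name `assign_label` are assumptions and have not been tried in EmbedMask.

```python
POSITIVE, NEGATIVE, IGNORE = 1, 0, -1


def assign_label(point, box, mask):
    """Three-way label for one location with respect to one object.

    point: (x, y) integer coordinates
    box:   (x0, y0, x1, y1) ground-truth box, inclusive bounds
    mask:  2D list of 0/1 values indexed as mask[y][x]
    """
    x, y = point
    x0, y0, x1, y1 = box
    if not (x0 <= x <= x1 and y0 <= y <= y1):
        return NEGATIVE      # outside the box: background
    if mask[y][x] == 1:
        return POSITIVE      # on the object mask
    return IGNORE            # in the box but off the mask: no loss


# 5x5 image with a triangular object inside the box (0, 0, 3, 3).
mask = [
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = (0, 0, 3, 3)
labels = [assign_label((x, y), box, mask) for y in range(5) for x in range(5)]
print(labels.count(POSITIVE), labels.count(IGNORE), labels.count(NEGATIVE))  # 10 6 9
```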

For your second question, do you mean that EmbedMask with the frozen FCOS part and the fine-tuned mask part produces better results than the directly trained one? Which results improved: the detection part, the mask part, or both? I am also curious how much it improves. I hope you can give more detailed information and tell me the specific point you want to ask about.

I hope these replies can help you, and I look forward to your progress.

trungpham2606 commented 4 years ago

@yinghdb Here are some of my results after training the second way:

1> Normal way: (four result images)

2> Second way: (four result images)

As you can see, the second way of training gives better results in both detection and mask on my dataset (I assigned all objects to one class).

yinghdb commented 4 years ago

@trungpham2606 From the visualized results, it is obvious that the second way is better. But the comparison may be fairer if you increase the training epochs of the normal way, for example by doubling the training epochs (and the 'STEP' as well).
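As a sketch of what doubling the schedule means, assuming the solver is configured with a total iteration count and a tuple of learning-rate decay milestones (as in maskrcnn-benchmark-style configs): both are scaled by the same factor so the decay points stay at the same relative position. The helper `scale_schedule` is hypothetical and the numbers are illustrative, not taken from the EmbedMask configs.

```python
def scale_schedule(max_iter, steps, factor=2):
    """Scale the total iterations and the LR decay milestones by one factor."""
    return int(max_iter * factor), tuple(int(s * factor) for s in steps)


# Illustrative baseline: 90k iterations, decay at 60k and 80k.
new_max_iter, new_steps = scale_schedule(90000, (60000, 80000), factor=2)
print(new_max_iter, new_steps)  # 180000 (120000, 160000)
```

Scaling only the iteration count without the milestones would make the learning rate decay too early relative to the longer run.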

Since you are using another dataset, you can try modifying parameters such as the learning rate or the loss weights, which need no additional implementation, to get better performance. But this also needs quite a few experiments.

I am also curious whether you have tried other instance segmentation methods, such as Mask R-CNN.

trungpham2606 commented 4 years ago

@yinghdb I have tried many instance segmentation methods before (for example HTC, Mask R-CNN, YOLACT, YOLACT++, ...). I also added a mask branch to CenterNet, but the results were much worse than with your network. The detection part of FCOS is also better than CenterNet. My main goal is to detect all objects in an image (it is acceptable if the object masks are not perfect).

yinghdb commented 4 years ago

@trungpham2606 If so, you can try setting 'MODEL.EMBED_MASK.SAMPLE_IN_MASK' to False. This will make the detection results as good as FCOS; then you can check whether the mask results are acceptable.

trungpham2606 commented 4 years ago

@yinghdb Oh, thanks a lot for your advice. I will try it right now!

yinghdb commented 4 years ago

@trungpham2606 Oh, keep in mind that when using a different dataset, you should be careful about the size distribution of objects. It is also an important factor.
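One simple way to inspect a dataset's size distribution is to bucket its objects with the COCO small/medium/large area thresholds (32² and 96² pixels). The sketch below is an assumption about how such a check could look: it uses box area as a stand-in for the segmentation area that COCO uses, and the function names are hypothetical.

```python
from collections import Counter


def size_bucket(width, height):
    """Classify one object by area using the COCO thresholds."""
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


def size_distribution(boxes_wh):
    """Return the fraction of objects in each size bucket.

    boxes_wh: iterable of (width, height) pairs in pixels.
    """
    counts = Counter(size_bucket(w, h) for w, h in boxes_wh)
    total = sum(counts.values())
    if total == 0:
        return {"small": 0.0, "medium": 0.0, "large": 0.0}
    return {name: counts[name] / total for name in ("small", "medium", "large")}


# Made-up box sizes for illustration.
print(size_distribution([(10, 10), (20, 20), (50, 50), (200, 200)]))
# {'small': 0.5, 'medium': 0.25, 'large': 0.25}
```

Comparing these fractions with those of the dataset the pretrained weights came from gives a first hint of whether the size-related settings may need adjusting.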