neptune-ai / open-solution-mapping-challenge

Open solution to the Mapping Challenge :earth_americas:
https://www.crowdai.org/challenges/mapping-challenge
MIT License

[Discuss] About introducing LightGBM and post-processing #136

ziyigogogo opened this issue 6 years ago

ziyigogogo commented 6 years ago

Hi there,

I am really thankful that you shared your solution at CrowdAI.

I myself am in the satellite imagery industry, and our team has been studying ML solutions to this problem for more than a year now. Our unet solution for our clients currently also reaches 0.93+ (IoU >= 0.5) with some morphology post-processing (not for this CrowdAI competition, but for our own business project with our own datasets).

So far, we have been exploring ways to improve the quality of the outputs. One of the most significant differences between masks produced by a CNN and man-made labels is "straightness": CNN masks usually have rounded corners (or curved contours), while man-made labels are usually standard rectangles (90-degree corners and straight outlines).

We are now searching for a solution to this problem. Currently, our two main directions are post-processing and output ensembling. Like you, we introduced CRF post-processing in our solution; however, the quality of the CRF-enabled outputs varies a lot. Some of them are extremely straight and perfectly match our ground truth, while others are not nearly as good.

So we recently started researching result-ensembling techniques. One of our current ideas is to run LightGBM on top of several of our different model outputs and the CRF-enabled outputs. While the CRF-enabled results have more noise and pixel misclassification, they also have a "straighter" mask style, so maybe an ensemble of CRF outputs and other model outputs can help.

Since I noticed that you mentioned "lightgbm-based" post-processing, we would like to hear more about your ideas on how to utilize LightGBM for this problem. Maybe we could work together to solve it?

Regards, Ziyi

jakubczakon commented 6 years ago

Hi @ziyigogogo, I am really happy that you noticed our work!

Quick brag/disclaimer: it turns out that with different post-processing params (erode: 1, dilate: 2) + TTA + score heuristics we were able to get 0.9428 precision and 0.9542 recall. Without the heuristics (which we want to substitute with lgbm) we get 0.936 precision.
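
For reference, a minimal sketch of that erode/dilate step, treating "erode: 1, dilate: 2" as iteration counts (that interpretation and the scipy calls are assumptions, not our exact code):

```python
from scipy import ndimage

def postprocess_mask(binary_mask):
    # Erode once, then dilate twice (interpreting "erode:1 dilate:2"
    # as iteration counts; this is an assumption, not the repo's code).
    mask = ndimage.binary_erosion(binary_mask, iterations=1)
    return ndimage.binary_dilation(mask, iterations=2)
```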

In our current solution we are not using CRF since it was very slow (do you know of any good/fast implementation?). What is the boost from CRF that you are getting?

Now, on the subject of using lgbm, our procedure is as follows (a rough sketch is given after the list):

  1. Apply thresholds to the unet output (after TTA) from 0.05 all the way up to 0.95, producing 19 binary images.
  2. Label objects on those binary images.
  3. For each object, extract features (based on shape and on probability values in the masked region).
  4. Train lgbm to predict the IOU of each of those objects with the ground truth.
  5. Predict IOU (via lgbm) for each object extracted at each threshold and, based on that information, apply non-max suppression with a set threshold (chosen via a CV procedure) to get the best objects (and get the scores from the masked region for those objects).
  6. Apply weights based on the probability threshold.
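
A rough, simplified sketch of steps 1-5 (helper names like `extract_object_features` and the feature set are illustrative placeholders, not our actual code):

```python
import numpy as np
import lightgbm as lgb
from scipy import ndimage

def candidate_objects(prob_map, thresholds=np.arange(0.05, 1.0, 0.05)):
    """Steps 1-2: threshold the (TTA-averaged) unet output at 19 levels
    and connected-component-label each binary image."""
    for t in thresholds:
        labeled, n_objects = ndimage.label(prob_map > t)
        for obj_id in range(1, n_objects + 1):
            yield t, labeled == obj_id

def extract_object_features(mask, prob_map):
    """Step 3: per-object shape + probability features (illustrative set)."""
    probs = prob_map[mask]
    return [mask.sum(), probs.mean(), probs.min(), probs.max(), probs.std()]

def mask_iou(a, b):
    """IOU between two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def non_max_suppression(masks, scores, iou_thr):
    """Step 5: greedily keep the highest-scoring candidates, dropping any
    that overlap an already-kept object above iou_thr (chosen via CV)."""
    keep = []
    for i in np.argsort(scores)[::-1]:
        if all(mask_iou(masks[i], masks[k]) < iou_thr for k in keep):
            keep.append(i)
    return [masks[i] for i in keep]

# Step 4: train lgbm to regress each candidate's IOU with the ground truth;
# X holds feature vectors, y the measured IOUs against GT objects.
model = lgb.LGBMRegressor(n_estimators=500)
# model.fit(X_train, y_train)
# scores = model.predict(X_test); best = non_max_suppression(masks, scores, 0.5)
```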

Currently the lgbm step is done with heuristics, but we should have it ready very soon.

What is important is that our approach (unfortunately) doesn't deal with the round-vs-rectangle problem. That being said, my ideas for adjusting our approach to deal with it would be the following:

  1. After step 2, apply a set of morphological transformations with different selem parameters that make corners/edges stronger, and run the results through lgbm to predict IOU (a rough sketch of this idea is given after the list).
  2. Adjust/add a loss to lgbm that penalizes objects that are not cornery/edgy.
  3. Train a network that predicts IOU by minimizing MSE plus some auxiliary loss that penalizes errors in corner regions more (you can use our distance-weighted cross-entropy loss implementation, but simply with a "corner mask"). The input to this network would simply be one channel with the masked probabilities that belong to the object region (potentially cropped and resized to make training more efficient).
  4. Train a network that takes the probability map per masked region and produces a binary map, minimized with soft dice + corner/edge-weighted cross-entropy.
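
A minimal sketch of idea 1 using scikit-image (the selem sizes and the opening/closing pair are just illustrative choices, not tested settings):

```python
from skimage.morphology import binary_closing, binary_opening, square

def corner_sharpening_candidates(binary_mask, selem_sizes=(3, 5, 7)):
    """Generate morphological variants of an object mask; square
    structuring elements tend to push contours toward 90-degree corners."""
    candidates = [binary_mask]
    for size in selem_sizes:
        selem = square(size)
        # Closing fills small concavities, opening shaves small
        # protrusions; both straighten contours with a square selem.
        candidates.append(binary_closing(binary_mask, selem))
        candidates.append(binary_opening(binary_mask, selem))
    return candidates
```

The point is just to generate several "straightened" variants per object and let the lgbm IOU predictor pick among them.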

I will get back to this as soon as we have the lgbm implementation open for you to see and discuss further.

ziyigogogo commented 6 years ago

Hi @jakubczakon,

Our team had a little break for the past 3 days, sorry for the late response.

Thanks a lot for your detailed procedure and the deep thought you gave my question; it is absolutely an inspiration for me. I will discuss this post with my teammates today and maybe do a quick implementation based on it. Once we make any progress, I will post back.

On your question, "Do you know of any good/fast implementation?":

I saw your CRF-related code here (lines 284-302), so I believe we are using the same package (pydensecrf), which may be the only open-source package available now. So the answer is no, we currently don't have a faster CRF implementation. Recently a paper introduced a convolutional method for CRF; the authors claim a 100x speed-up. However, we have not tried it so far, since it seems the boost from an end-to-end trainable CRF network is limited, and changing the entire network architecture for such a small improvement is not worth it for now.
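
For reference, the usual pydensecrf pattern we rely on looks roughly like the sketch below; the pairwise parameters (sxy, srgb, compat) are the package-example defaults, not our tuned values:

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(rgb_image, fg_prob, n_iters=5):
    """Dense-CRF refinement of a 2-class (background/building) probability
    map. rgb_image: uint8 HxWx3; fg_prob: float HxW in [0, 1]."""
    h, w = fg_prob.shape
    probs = np.stack([1.0 - fg_prob, fg_prob]).astype(np.float32)

    d = dcrf.DenseCRF2D(w, h, 2)
    d.setUnaryEnergy(np.ascontiguousarray(unary_from_softmax(probs)))
    # Smoothness kernel: suppresses small isolated regions.
    d.addPairwiseGaussian(sxy=3, compat=3)
    # Appearance kernel: encourages label changes at image edges.
    d.addPairwiseBilateral(sxy=80, srgb=13,
                           rgbim=np.ascontiguousarray(rgb_image), compat=10)

    q = np.array(d.inference(n_iters))
    return np.argmax(q, axis=0).reshape(h, w)
```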

On your other question, "What is the boost from CRF that you are getting?": I actually described this above; maybe you lost me due to my poor English, sorry :)

Let me use an image to explain directly. As you can see, for this particular example, by applying CRF to our network prediction (white) we got a more rectangle-like mask (red). But there is still a long way to go, because the quality of CRF post-processing varies a lot, as I mentioned (it only works well on part of our test set). That's why our team wants to utilize some tree-based models (maybe lgbm) to help us pick the best result from multiple techniques (like a combination of the CRF-enabled output, the Mask R-CNN output and the unet output).

Finally, looking forward to your upcoming release. Thanks a lot!

Regards

jakubczakon commented 6 years ago

@ziyigogogo we merged the lgbm (and random forest) second-level models to master with #152