PengyiZhang / MIADeepSSL

This repository accompanies the paper "A Survey on Deep Learning of Small Sample in Biomedical Image Analysis".
https://arxiv.org/abs/1908.00473

Public datasets and Baselines #1

Open JunMa11 opened 5 years ago

JunMa11 commented 5 years ago

Dear @PengyiZhang ,

Thanks for sharing the great work.

Are there any benchmarks for SSL, e.g., a baseline and a specific SSL dataset? Moreover, how do you define "small sample" in your title? In other words, how much data can be regarded as a small sample for a task? I guess it may depend on the specific task; e.g., for liver CT segmentation, we can achieve a test-set Dice of 90+ with only 20 training cases. More details can be found in CHAOS.

There are lots of public challenges here. In your opinion, which ones belong to SSL?

Anyway, we really need some benchmarks in SSL.

I'm looking forward to your reply. Best, Jun

PengyiZhang commented 5 years ago

Dear @JunMa11 ,

Thank you for your interest and sharing.

We are working on building such baseline models, and we sincerely invite interested researchers to contribute to this open-source project.

As for "small sample", I suppose there isn't an absolute number to measure it; of course, it depends on the specific task. We survey MIADeepSSL techniques that are expected to effectively support the application of deep learning in clinical biomedical image analysis and to further improve analysis performance, especially when large-scale annotated samples are not available.

We are glad to continue our discussions on these issues.

JunMa11 commented 5 years ago

Dear @PengyiZhang ,

I'm looking forward to your baseline.

For popular medical image segmentation tasks, I've never seen an SSL technique (except data augmentation) beat U-Net variants on recent public challenge datasets when using the same training set (with the test-set labels hidden).
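To be clear, by data augmentation I just mean standard geometric and intensity transforms; a rough sketch of such a pipeline (assuming torchvision-style 2D transforms on slices; the exact parameters are illustrative):

```python
# Rough sketch of a standard 2D augmentation pipeline (illustrative parameters).
# Real medical pipelines often use 3D/intensity-specific augmentations instead.
import torchvision.transforms as T

augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(degrees=15),
    T.RandomResizedCrop(size=256, scale=(0.8, 1.0)),
    T.ColorJitter(brightness=0.2, contrast=0.2),  # crude intensity jitter
    T.ToTensor(),
])
# augmented = augment(pil_slice)  # applied on the fly during training
```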

If you find such work, please let me know.

In some sense, I think all the public medical segmentation challenges belong to SSL, because the training sets are really small compared with ImageNet.

PengyiZhang commented 5 years ago

Dear @JunMa11 ,

Thank you again for your interest and sharing.

In most cases, transfer learning techniques (one of our surveyed MIADeepSSL techniques) can help.
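For example, the most common form is fine-tuning an ImageNet-pretrained backbone on the small target dataset; a minimal sketch (assuming PyTorch/torchvision; the task and class count are illustrative):

```python
# Minimal transfer-learning sketch: reuse an ImageNet-pretrained backbone,
# replace the classifier head, and fine-tune on a small medical dataset.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(pretrained=True)       # start from ImageNet weights
for p in model.parameters():                   # optionally freeze the backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # new head, e.g. lesion vs. no lesion

# Train only the new head first; unfreeze deeper layers later if needed.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```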

In existing MIA (medical image analysis) challenges, the datasets and tasks are usually fixed and constrained, which means that participants cannot get involved in the dataset-building process. Active learning techniques can largely reduce the annotation cost and achieve comparable or superior results.
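As a toy illustration of what I mean, uncertainty-based sample selection can be as simple as the sketch below (hypothetical helper, assuming PyTorch and an unlabeled loader that yields (index, image) pairs):

```python
# Toy active-learning step: rank unlabeled images by predictive entropy and
# send the most uncertain ones to the annotator.
import torch
import torch.nn.functional as F

def select_for_annotation(model, unlabeled_loader, budget=50, device="cpu"):
    model.eval()
    scores, indices = [], []
    with torch.no_grad():
        for idx, images in unlabeled_loader:       # assumes (index, image) pairs
            probs = F.softmax(model(images.to(device)), dim=1)
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
            scores.append(entropy.cpu())
            indices.append(idx)
    scores = torch.cat(scores)
    indices = torch.cat(indices)
    top = scores.argsort(descending=True)[:budget]  # most uncertain first
    return indices[top].tolist()                    # ids to label next
```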

When you only have medical images with image-level annotations but need to realize lesion localization or even lesion segmentation, weakly supervised learning techniques can help.
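For instance, class activation maps (CAM) are one simple way to obtain coarse lesion localization from image-level labels only; a rough sketch (assuming PyTorch and a CNN classifier with a global-average-pooling head):

```python
# Rough CAM sketch: weight the last conv feature maps by the classifier weights
# of the target class to get a coarse localization map from image-level labels.
import torch
import torch.nn.functional as F

def class_activation_map(features, fc_weight, class_idx):
    """features: (C, H, W) last conv features for one image;
    fc_weight: (num_classes, C) weights of the final linear layer."""
    cam = torch.einsum("c,chw->hw", fc_weight[class_idx], features)
    cam = F.relu(cam)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    return cam  # upsample to image size and threshold for a rough lesion mask
```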

Anyway, we survey MIADeepSSL techniques that are expected to relieve the labeling pressure and reduce annotation cost.

I am glad to continue our discussions on these topics.

JunMa11 commented 5 years ago

Dear @PengyiZhang, thanks for your reply. Yes, I agree that transfer learning and weakly supervised learning techniques really help.

Regarding the 5th category, miscellaneous techniques: specifically, what I'm concerned about is whether the attention mechanism really helps in medical image segmentation tasks.

I've never seen the attention mechanism beat U-Net variants (without attention) in recent public segmentation challenges. The experiments in the pancreas segmentation paper in your review are not solid. Recently, the results of the MICCAI 2019 kidney segmentation challenge were released; some participants also used attention modules but did not get top results, and the same holds for MICCAI 2018 BraTS.
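(For concreteness, by attention module I mean something like the additive attention gate used in Attention U-Net; a minimal sketch, assuming PyTorch and that the gating signal has already been brought to the skip connection's resolution:)

```python
# Minimal additive attention gate (Attention U-Net style): gate the skip-connection
# features x with a gating signal g before concatenation in the decoder.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, in_ch, gate_ch, inter_ch):
        super().__init__()
        self.theta_x = nn.Conv2d(in_ch, inter_ch, kernel_size=1)
        self.phi_g = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)

    def forward(self, x, g):
        # x: skip features (N, in_ch, H, W); g: gating signal at the same resolution
        attn = torch.sigmoid(self.psi(torch.relu(self.theta_x(x) + self.phi_g(g))))
        return x * attn  # suppress irrelevant regions in the skip connection
```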

Anyway, it is really a great survey that reviews the SOTA methods which can relieve the labeling pressure and reduce annotation cost. I really enjoy reading your paper and discussing it with you.