voldemortX / DST-CBC

Implementation of our Pattern Recognition paper "DMT: Dynamic Mutual Training for Semi-Supervised Learning"
BSD 3-Clause "New" or "Revised" License

What's the meaning of splits? #7

Closed czb2133 closed 3 years ago

czb2133 commented 3 years ago

Thanks for your hard work!

I am new to this topic. Can you explain the meaning of splits in generate_splits.py, e.g. setting [2, 4, 8, 20, 29.75] for Cityscapes? I only know that it means the ratio of labeled to unlabeled data, and I really don't know why you chose those values. Furthermore, if I want to train on my own data, how should I set this variable according to the ratio of my labeled and unlabeled data?

Thank you for your help.

voldemortX commented 3 years ago

@czb2133 These ratios mainly follow prior work in this field, so that I can compare my method with it fairly. For the last ratio, however, I use numbers like 29.75 to keep at least 100 labels, since at the time I believed that fewer than 100 labels was impossible for semi-supervised segmentation. That has since been proved otherwise in the recent work reco.
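
As a rough illustration of the arithmetic, here is a minimal sketch. It assumes each split value acts as a divisor of the training-set size (so a split of 8 keeps about 1/8 of the images labeled) and that the count is rounded to the nearest integer. The helper `labeled_count` is hypothetical and not part of the repository; the exact rounding used by generate_splits.py may differ.

```python
def labeled_count(total_images: int, split: float) -> int:
    """Approximate number of labeled images when 1/split of the set is labeled.

    Assumption: the split value is a divisor of the dataset size, and the
    result is rounded to the nearest integer.
    """
    return round(total_images / split)


# Cityscapes has 2975 finely annotated training images.
CITYSCAPES_TRAIN = 2975

for split in [2, 4, 8, 20, 29.75]:
    print(f"split={split}: about {labeled_count(CITYSCAPES_TRAIN, split)} labeled images")
```

Under this assumption, the unusual value 29.75 corresponds to exactly 100 labeled images out of 2975, which is how a fractional ratio keeps the labeled set at 100 images.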

If you have your own data, there are mostly 2 cases.

  1. You have a fully annotated dataset and you want to artificially make it a semi-supervised benchmark. In this case you can set similar ratios as I did for Cityscapes/PASCAL VOC.
  2. You have an actual need for semi-supervised learning, i.e. you have N labeled samples and M unlabeled samples. In this case, you can just write the txt files yourself without using generate_splits.py. Examples of such files are provided here.
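
For the second case, writing the list files by hand might look like the sketch below. The file names `labeled.txt` and `unlabeled.txt`, the one-ID-per-line format, and the helper `write_split_files` are all assumptions made for illustration; the example files in the repository define the names and format that the training scripts actually expect.

```python
from pathlib import Path


def write_split_files(labeled_ids, unlabeled_ids, out_dir):
    """Write one image ID per line into two text files (hypothetical format)."""
    labeled = list(labeled_ids)
    unlabeled = list(unlabeled_ids)

    # A sample should not appear in both the labeled and the unlabeled list.
    overlap = set(labeled) & set(unlabeled)
    if overlap:
        raise ValueError(f"IDs present in both lists: {sorted(overlap)}")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    labeled_path = out / "labeled.txt"      # hypothetical file name
    unlabeled_path = out / "unlabeled.txt"  # hypothetical file name
    labeled_path.write_text("\n".join(labeled) + "\n")
    unlabeled_path.write_text("\n".join(unlabeled) + "\n")
    return labeled_path, unlabeled_path
```

Calling `write_split_files(my_labeled_ids, my_unlabeled_ids, "my_splits")` with N labeled and M unlabeled IDs produces the two lists directly, so no ratio has to be chosen at all.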

czb2133 commented 3 years ago

I got it! Thanks a lot!

voldemortX commented 3 years ago

Since the question was resolved a long time ago, this issue is now closed. Feel free to reopen if there are further questions.