how to generate tumor_train.txt

LXYTSOS commented 6 years ago

How do I generate tumor_train.txt if I want to use my own dataset?

yil8 commented 6 years ago

@LXYTSOS This is the tricky part. Usually it involves two steps. The first step is easier: it starts with obtaining the tissue mask of each tumor slide. Then I randomly sample coordinates based on the tumor tissue mask and determine each coordinate's label based on the annotation. But random sampling alone is not good enough; you have to include hard negative patches, and the second step is for this purpose. The code for the hard negative mining part is messy: it involves training a first model, finding the false tumor regions it predicts with high probability, and saving them as coordinates. If many people ask for this functionality, I'll spare some time to implement it.
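As a rough illustration of the first step described above, here is a minimal sketch, not the repository's actual code. It assumes the tissue mask and the annotation (tumor) mask are 2D boolean grids at the same low-resolution level, indexed as `mask[row][col]`, and that `downsample` is the factor between that level and level 0 (64 for level 6). The function names, the `seed` parameter, and the `slide_id,x,y` line format are all assumptions made for the example.

```python
import random


def sample_patch_coords(tissue_mask, tumor_mask, n_samples, downsample=64, seed=0):
    """Randomly pick tissue pixels and label each one from the annotation mask.

    Both masks are 2D grids (nested lists or arrays) indexed as mask[row][col].
    Returns a list of (x, y, label) tuples in level-0 pixel coordinates,
    where label is 1 inside the annotated tumor region and 0 elsewhere.
    """
    rng = random.Random(seed)
    # Collect every pixel that the tissue mask marks as tissue.
    tissue_pixels = [
        (r, c)
        for r, row in enumerate(tissue_mask)
        for c, is_tissue in enumerate(row)
        if is_tissue
    ]
    # Sample without replacement, capped at the number of tissue pixels.
    picks = rng.sample(tissue_pixels, min(n_samples, len(tissue_pixels)))
    samples = []
    for r, c in picks:
        # Map the centre of the low-resolution pixel back to level 0.
        x = c * downsample + downsample // 2
        y = r * downsample + downsample // 2
        samples.append((x, y, int(bool(tumor_mask[r][c]))))
    return samples


def format_lines(slide_id, samples, label):
    """Format samples with the given label as 'slide_id,x,y' lines (assumed format)."""
    return [f"{slide_id},{x},{y}" for x, y, lab in samples if lab == label]
```

Calling `format_lines("Tumor_001", samples, 1)` would give the tumor lines for one slide and `label=0` the normal ones; the slide name here is only a placeholder.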

yangyang117 commented 4 years ago

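Is the following understanding correct? 1. Apply a mask to the images that contain tumor to obtain the normal-tissue mask, cut that region into patches as training samples, and add patches from tumor-free images so that the samples are balanced. 2. Train a model, analyze the misclassified images, and train on the misclassified images a second time. In this process, do images at the tumor boundary need to be judged separately? At level 6, a patch above a certain threshold is judged as tumor and one below a certain threshold as no tumor; could this data interfere strongly with the model?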

yil8 commented 4 years ago

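@yangyang117 2. For the misclassified images, I did not additionally check whether they lie at the boundary, but I did find that, statistically, they are more likely to appear at the boundary.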
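The hard negative mining step discussed in this thread could be sketched roughly as follows. This is an assumption-laden illustration, not the author's code: it assumes the first model has already produced a tumor probability map at the same resolution as the annotation mask, both indexed as `grid[row][col]`. The name `mine_hard_negatives` and the `threshold`, `max_coords`, and `downsample` parameters are hypothetical.

```python
def mine_hard_negatives(prob_map, tumor_mask, threshold=0.5, max_coords=1000, downsample=64):
    """Return level-0 (x, y) centres of likely false positives.

    A pixel counts as a hard negative when the first model's tumor
    probability is at least `threshold` but the annotation mask says it
    is not tumor. The most confident mistakes come first.
    """
    candidates = []
    for r, row in enumerate(prob_map):
        for c, prob in enumerate(row):
            if prob >= threshold and not tumor_mask[r][c]:
                candidates.append((prob, r, c))
    # Highest predicted probability first (the sort is stable for ties).
    candidates.sort(key=lambda item: item[0], reverse=True)
    return [
        (c * downsample + downsample // 2, r * downsample + downsample // 2)
        for _, r, c in candidates[:max_coords]
    ]
```

The returned coordinates could then be appended to the normal-patch coordinate list before retraining. As noted above, no special handling of boundary patches is applied here either.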

baidu-research / NCRF