ncoudray / DeepPATH

Classification of Lung cancer slide images using deep-learning

Per_slide statistics #24

Closed PradeepMoturi closed 5 years ago

PradeepMoturi commented 5 years ago

We took 250 SVS images from TCGA and trained a 3-class classifier for 50000 steps, with validation. The loss reached 1.4.

After testing, when I ran the script for ROC curves, I got a file named out2_perSlideStats.txt in the output folder, with lines that look like this:


$ test_TCGA-66-2792-11A-01-TS1.fb255c48-b47f-45b1-9f04-5107b8c16e4e_0100 true_label: [1.0, 0.0, 0.0] Percent_Selected: 0.301299 0.002597 0.696104 Average_Probability: 0.306132 0.245624 0.448244 tiles#: 385.000000

$ test_TCGA-55-6975-11A-01-TS1.9bd5efc7-f279-4150-a410-19fb057f9df8_0100 true_label: [1.0, 0.0, 0.0] Percent_Selected: 0.343254 0.000000 0.656746 Average_Probability: 0.345403 0.228446 0.426151 tiles#: 504.000000

$ test_TCGA-67-3773-01A-01-BS1.2e8279d9-57a3-4343-8e8b-0fa7e500a531_0010 true_label: [0.0, 1.0, 0.0] Percent_Selected: 0.110599 0.000000 0.889401 Average_Probability: 0.162894 0.306063 0.531043 tiles#: 434.000000

Percent_Selected for LUAD is always 0 except for one (second) slide. What does Percent_Selected actually convey, and is it correct to get output like this? How is the final prediction for a slide calculated, and which program calculates it?

ncoudray commented 5 years ago

Percent_Selected is the percentage of tiles (written as a fraction between 0 and 1) for which the given class has the highest probability. A value of 0 means that, for this slide, no tile has its maximum probability assigned to class 2.
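To make the definition concrete, here is a minimal sketch of how such per-slide numbers could be derived from per-tile probabilities. It is not the DeepPATH code: the function name `per_slide_stats` and the toy tile values are made up for illustration, and it assumes Average_Probability is a plain mean over tiles.

```python
import numpy as np

def per_slide_stats(tile_probs):
    """Hypothetical helper: summarize the tiles of one slide.

    tile_probs: array-like of shape (n_tiles, n_classes), one row of
    class probabilities per tile.
    """
    probs = np.asarray(tile_probs, dtype=float)
    n_tiles, n_classes = probs.shape
    # Class with the highest probability on each tile.
    winners = probs.argmax(axis=1)
    # Fraction of tiles "won" by each class (assumed meaning of Percent_Selected).
    percent_selected = np.bincount(winners, minlength=n_classes) / n_tiles
    # Mean probability per class over all tiles (assumed meaning of Average_Probability).
    average_probability = probs.mean(axis=0)
    return percent_selected, average_probability

# Toy data, not taken from any real slide.
tiles = [
    [0.6, 0.3, 0.1],
    [0.2, 0.3, 0.5],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]
selected, average = per_slide_stats(tiles)
print(selected)
print(average)
```

In this toy case the second class never wins a tile, so its selected fraction is 0 even though its average probability is well above 0. That is the same pattern as in the lines quoted above.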

I'm not sure I understand your second question. out_filename_stats shows the probability assigned to each tile, as calculated by Inception.
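For anyone who wants to inspect the per-slide file programmatically, the sketch below parses one line in the layout quoted in this issue. The regular expression is inferred only from those quoted lines, the names `parse_slide_line` and `argmax` are hypothetical, and taking the class with the highest Average_Probability as the slide-level call is just one possible aggregation, not something this thread confirms.

```python
import re

# Pattern inferred from the lines quoted in this issue; the real file may differ.
LINE_RE = re.compile(
    r"^(?P<slide>\S+)\s+true_label:\s*\[(?P<label>[^\]]*)\]\s+"
    r"Percent_Selected:\s*(?P<selected>[\d.\s]+?)\s+"
    r"Average_Probability:\s*(?P<average>[\d.\s]+?)\s+"
    r"tiles#:\s*(?P<tiles>[\d.]+)"
)

def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)

def parse_slide_line(line):
    """Hypothetical parser for one line of out2_perSlideStats.txt."""
    # Drop a leading "$ " if the line was pasted with one.
    match = LINE_RE.match(line.strip().lstrip("$").strip())
    if match is None:
        raise ValueError("line does not match the assumed layout")
    return {
        "slide": match.group("slide"),
        "true_label": [float(x) for x in match.group("label").split(",")],
        "percent_selected": [float(x) for x in match.group("selected").split()],
        "average_probability": [float(x) for x in match.group("average").split()],
        "n_tiles": int(float(match.group("tiles"))),
    }

example = (
    "test_TCGA-55-6975-11A-01-TS1.9bd5efc7-f279-4150-a410-19fb057f9df8_0100 "
    "true_label: [1.0, 0.0, 0.0] "
    "Percent_Selected: 0.343254 0.000000 0.656746 "
    "Average_Probability: 0.345403 0.228446 0.426151 tiles#: 504.000000"
)
record = parse_slide_line(example)
# One possible slide-level call (an assumption): class with the highest average.
print(record["slide"], argmax(record["true_label"]), argmax(record["average_probability"]))
```

Comparing the two indices per slide is a quick way to see how often the averaged tile probabilities agree with the true label.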