binli123 / dsmil-wsi

DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image
MIT License

Wondering about the TCGA test data label #56

Closed by XiaoXueShengwangrui 1 year ago

XiaoXueShengwangrui commented 2 years ago

Example image from your repo: [image]

Line 3 of this image shows a slide from case TCGA-64-5778-01Z with label 0 (negative bag = normal case). However, the GDC Data Portal lists the sample type for this case as Primary Tumor. [image]

So, am I misunderstanding something? I look forward to your reply!

binli123 commented 1 year ago

Yes, there is. This dataset contains two subtypes of tumors. 0 is tumor type 0 (LUAD), and 1 is tumor type 1 (LUSC). They are both tumors. Please check the readme -> Feature vector csv files explanation -> 3.Labels.

binli123 commented 1 year ago
  3. Labels.

    For binary classifier, use 1 for positive bags and 0 for negative bags. Use --num_classes=1 at training.
    For multi-class classifier (N positive classes and one optional negative class), use 0~(N-1) for positive classes. If you have a negative class (not belonging to any one of the positive classes), use N for its label. Use --num_classes=N (N equals the number of positive classes) at training.
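
As a rough illustration of the labeling convention quoted above, here is a minimal Python sketch. The function name `assign_labels` and the `"negative"` key are hypothetical, chosen only for this example; they are not part of the dsmil-wsi code. Only the numbering rule and the resulting `--num_classes` value come from the README text.

```python
def assign_labels(positive_classes, has_negative=False):
    """Map class names to integer labels following the quoted README rule.

    Hypothetical helper for illustration only.
    Returns (label_map, num_classes), where num_classes is the value
    one would pass as --num_classes at training.
    """
    if not positive_classes:
        raise ValueError("at least one positive class is required")

    n = len(positive_classes)

    if n == 1:
        # Binary classifier: 1 for positive bags, 0 for negative bags.
        return {positive_classes[0]: 1, "negative": 0}, 1

    # Multi-class: positive classes are numbered 0 .. N-1.
    label_map = {name: idx for idx, name in enumerate(positive_classes)}
    if has_negative:
        # The optional negative class takes label N.
        label_map["negative"] = n
    return label_map, n


# The TCGA lung case from this thread: two tumor subtypes, no negative class.
print(assign_labels(["LUAD", "LUSC"]))
# ({'LUAD': 0, 'LUSC': 1}, 2)
```

Under this reading, a label of 0 in the TCGA csv means LUAD, not a normal slide, which would be consistent with the GDC Data Portal listing the sample as Primary Tumor.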

XiaoXueShengwangrui commented 1 year ago

OK, got it, thank you very much!