AlexNmSED closed this issue 1 year ago
Hi, no problem 👍 Because I cannot attach files to GitHub messages, here are 3 simple steps describing how I came up with the labels used for TCGA binary classification:
1. Start from `tcga_brca_subset.csv` (here).
2. Map the `oncotree_code` column via the following function:

        def map_otc_to_int(oncotree_code: str, missing_label: int = -1):
            # IDC = invasive ductal carcinoma -> class 0
            if oncotree_code == 'IDC':
                return 0
            # ILC = invasive lobular carcinoma -> class 1
            elif oncotree_code == 'ILC':
                return 1
            # any other code gets the placeholder label
            else:
                return missing_label

3. Doing so, you should end up with 837 `case_id` mapped to 875 `slide_id` and the following label counts:
   - `0`: 726
   - `1`: 149

Let me know if this helps.
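For anyone who wants to script the steps above, here is a minimal sketch using only the standard library. It rests on assumptions not confirmed in this thread: that `case_id`, `slide_id` and `oncotree_code` are columns of the same CSV, and that rows mapped to `missing_label` are simply dropped. The helper `build_labels` and the sample rows (IDs and the `OTHER` code) are made up for illustration.

```python
import csv
import io
from collections import Counter


def map_otc_to_int(oncotree_code: str, missing_label: int = -1) -> int:
    # Same mapping as in step 2: IDC -> 0, ILC -> 1, anything else -> missing_label
    if oncotree_code == 'IDC':
        return 0
    elif oncotree_code == 'ILC':
        return 1
    else:
        return missing_label


def build_labels(csv_file, missing_label: int = -1) -> dict:
    """Return {slide_id: label}, skipping rows without an IDC/ILC code.

    Hypothetical helper: dropping unlabeled rows is an assumption.
    """
    labels = {}
    for row in csv.DictReader(csv_file):
        label = map_otc_to_int(row['oncotree_code'], missing_label)
        if label != missing_label:
            labels[row['slide_id']] = label
    return labels


# Tiny made-up sample standing in for tcga_brca_subset.csv.
sample = io.StringIO(
    "case_id,slide_id,oncotree_code\n"
    "case_a,slide_a1,IDC\n"
    "case_a,slide_a2,IDC\n"
    "case_b,slide_b1,ILC\n"
    "case_c,slide_c1,OTHER\n"
)
labels = build_labels(sample)
label_counts = Counter(labels.values())
print(dict(label_counts))
```

To run it on the real file, replace `sample` with `open('tcga_brca_subset.csv', newline='')` and compare `label_counts` against the counts listed in step 3.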
Thank you for your help. It's very reader-friendly work that inspires me a lot.
Thanks for sharing, that's very kind. The information I found at TCGA somewhat conflicts with what the original HIPT repository mentions. Could you provide the label file used for training?