Example clinical table - GitHub issues

Rukhmini commented 1 year ago

Hello, thanks a lot for all your responses. I have one question about the clinical table that you have provided as an example. In the table "TCGA.csv", the vital status is similar to OS status, right? Generally, 1 means dead and 0 means alive, but in this table is the vital status coded the opposite way, so that 1 means alive and 0 means dead? Thanks again.

LiangJunhao-THU commented 1 year ago

Yes, thank you for your question. In the "TCGA.csv" file, 0 means dead and 1 means alive. Therefore we changed the dataloader to follow the regular OS status convention: O = (~self.data['vital_status'][pd_index].astype(bool)).astype(int) We use ~ to invert the values so that the OS status follows the regular coding (1 = dead, 0 = alive).
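A minimal sketch of that inversion, assuming a table with a `vital_status` column coded 1 = alive, 0 = dead. The toy values and the `OS_status` column name are made up for illustration and are not taken from TCGA.csv or the repository's dataloader:

```python
import pandas as pd

# Toy table; values are illustrative only, not taken from TCGA.csv.
clinical = pd.DataFrame({"vital_status": [1, 0, 1, 0, 0]})  # 1 = alive, 0 = dead

# Logical NOT on the boolean view flips the coding to the usual
# survival-analysis convention: 1 = event (death), 0 = censored (alive).
clinical["OS_status"] = (~clinical["vital_status"].astype(bool)).astype(int)

print(clinical["OS_status"].tolist())  # [0, 1, 0, 1, 1]
```

The cast to bool matters: applying `~` directly to integers is a bitwise NOT, which maps 1 to -2 and 0 to -1 instead of swapping them.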

You can also find more comprehensive survival information here: https://github.com/KatherLab/cancer-metadata/blob/main/tcga/outcome.xlsx This xlsx file includes more detailed information. If you use it, please remember to modify the dataloader accordingly :-)

Rukhmini commented 1 year ago

Thank you. Another question is that when the heatmap is cropped into a square one, do you consider the zero values (padded) outside the heatmap as well while training the Macronet?

LiangJunhao-THU commented 1 year ago

The square cut process aims to make the heatmap square for training while keeping the tissue area as large as possible within it. The code first finds the empty area in the heatmap to locate the tissue border, which gives us the bounding rectangle of the tissue. Then we pad the rectangle with zero values into a square whose side equals the rectangle's longest side. The resulting square heatmap, including the zero padding, is used as MacroNet training data.

https://github.com/Biooptics2021/PathFinder/blob/43da3deaabadd00f0338ef2de6d56dc4855da100/Prognosis/Data_prepare/cut_heatmap.py#L8
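The steps described above can be sketched as follows. This is not the code in cut_heatmap.py; it assumes a 2-D heatmap whose background pixels are exactly 0, and centring the tissue inside the square is a choice made only for this sketch. The function name `square_cut` is hypothetical:

```python
import numpy as np

def square_cut(heatmap: np.ndarray) -> np.ndarray:
    """Crop a 2-D heatmap to the tissue bounding box, then zero-pad to a square.

    Assumes background pixels are exactly 0; centring the crop inside the
    square is a choice made for this sketch.
    """
    rows = np.flatnonzero(heatmap.any(axis=1))  # rows containing tissue
    cols = np.flatnonzero(heatmap.any(axis=0))  # columns containing tissue
    if rows.size == 0:
        raise ValueError("heatmap contains no tissue")
    crop = heatmap[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    h, w = crop.shape
    side = max(h, w)              # side length of the output square
    top = (side - h) // 2
    left = (side - w) // 2
    square = np.zeros((side, side), dtype=heatmap.dtype)
    square[top:top + h, left:left + w] = crop
    return square

demo = np.zeros((6, 8))
demo[1:3, 2:7] = 0.5  # a 2 x 5 block of "tissue"
cut = square_cut(demo)
print(cut.shape)  # (5, 5)
```

In this toy case 15 of the 25 output pixels are padding zeros, and they are part of the array that would be passed to the network.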

Rukhmini commented 1 year ago

Thank you. I could successfully run the training of the MacroNet using the probability maps, but I wanted to know how you save the trained model in the "train_test.py" file. As I am using the code for another application, I want to save the model and do further analysis. Also, how do you use the saved model for the KM plot? I see that you have provided a pre-trained model, macronet.pth, which is used for your application; I want to save the same kind of model for my application. Also, in the Verification.ipynb file, how do you get the p-value in "TCGA_plot(TCGA,'TIN10',0.018925978874988)" and what does "TIN10" mean? TIA.

LiangJunhao-THU commented 1 year ago

Congratulations :-)

In the train_test.py file, the trained model is saved in a .pkl file together with other information: https://github.com/Biooptics2021/PathFinder/blob/43da3deaabadd00f0338ef2de6d56dc4855da100/Prognosis/train_test.py#L122 You can use pickle.load to load the .pkl file (note that pickle.load takes an open file object, e.g. pickle.load(open(saved_file_path, 'rb')), not the path itself). The result is a dictionary, and the key of the trained MacroNet is 'model_state_dict'. The corresponding value is the content of the .pth file you wanted, which can be used for further analysis. Alternatively, you can change the saving code in train_test.py to save only the .pth file. Then you can directly use trained_model.pth for analysis, just like in Discovery/attribution.ipynb.
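A small sketch of the loading step, assuming only what is stated above: the .pkl file holds a dictionary with a 'model_state_dict' key. The helper name, the file name and the stand-in dictionary are hypothetical; a real checkpoint would hold tensors and needs torch installed to unpickle:

```python
import os
import pickle
import tempfile

def load_state_dict_from_pkl(pkl_path):
    """Return the value stored under 'model_state_dict' in a pickled dict."""
    # pickle.load expects an open binary file object, not a path string.
    with open(pkl_path, "rb") as f:
        checkpoint = pickle.load(f)
    return checkpoint["model_state_dict"]

# Demo with a stand-in dictionary; a real checkpoint would hold tensors.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "fold_0.pkl")  # hypothetical file name
    with open(demo_path, "wb") as f:
        pickle.dump({"model_state_dict": {"fc.weight": [0.1, 0.2]}, "epoch": 5}, f)
    state_dict = load_state_dict_from_pkl(demo_path)

print(sorted(state_dict))  # ['fc.weight']
```

With a real checkpoint, torch.save(state_dict, "trained_model.pth") would write the weights to a .pth file, and model.load_state_dict(state_dict) would restore them into a network of the same architecture.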

The p-value is calculated by results = logrank_test(T[ix], T[~ix], E[ix], E[~ix], alpha=.99).p_value, where ix and ~ix select the two risk groups defined by the threshold you give.

TIN10 is just TND in the paper. The 10 means the cell_size used in the TND calculation is 10: TND = Co_loc(prob_map = file_path , cell_size = 10) Here I selected 10 as the cell_size. For more discussion about the cell_size, you can read the paper linked at https://github.com/TissueImageAnalytics/TILAb-Score

Rukhmini commented 1 year ago

Thank you again. How did you choose the best model to be saved as the .pth file, given that you performed 10-fold cross-validation? Also, when I try to load your pretrained model, it shows the error "ModuleNotFoundError: No module named 'networks'". And when I try to run attribution.ipynb using my model, it shows the error "AssertionError: Target not provided when necessary, cannot take gradient with respect to multiple outputs." on this line: Attrmap = saliency.attribute(input_tensor, target=None,abs=False).

LiangJunhao-THU commented 1 year ago

The model with the highest C-index was selected as the best model in each fold.
You may check whether the relevant files have been added to the path. Please keep the file structure of the Discovery folder. (Also, the name 'networks' may clash with names used in torch, so you can rename the file and update the import accordingly.)
You could check the outputs of your model. The MacroNet in this work is a regression model and has only one output (hazard). The model output differs slightly between Discovery and Prognosis. In Prognosis training it outputs both the hazards and the last feature (which is for multimodal fusion); in Discovery the model outputs only the hazards, so that attribution methods can run successfully (this change does not affect the trained model, it only changes the number of outputs). https://github.com/Biooptics2021/PathFinder/blob/43da3deaabadd00f0338ef2de6d56dc4855da100/Prognosis/Networks/Macro_networks.py#L277 https://github.com/Biooptics2021/PathFinder/blob/43da3deaabadd00f0338ef2de6d56dc4855da100/Discovery/networks.py#L314
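If editing the network definition is inconvenient, one possible alternative is to wrap a multi-output model so that it exposes only the hazard. The sketch below is an assumption, not the repository's approach: `TwoOutputNet` is a stand-in rather than the real MacroNet, and the position of the hazard in the returned tuple (`hazard_index`) must be checked against your own model:

```python
import torch
from torch import nn

class TwoOutputNet(nn.Module):
    """Stand-in for a network that returns (hazard, features); not the real MacroNet."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 3)
        self.head = nn.Linear(3, 1)

    def forward(self, x):
        features = self.encoder(x)
        return self.head(features), features

class HazardOnly(nn.Module):
    """Expose a single output so gradient-based attribution sees one value per sample."""
    def __init__(self, model, hazard_index=0):
        super().__init__()
        self.model = model
        self.hazard_index = hazard_index  # assumed position of the hazard in the tuple

    def forward(self, x):
        return self.model(x)[self.hazard_index]

wrapped = HazardOnly(TwoOutputNet()).eval()
out = wrapped(torch.randn(2, 4))
print(out.shape)  # torch.Size([2, 1])
```

The wrapped module returns one value per sample, which is the situation in which an attribution call with target=None is expected to work.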

Hope these suggestions are helpful.

Rukhmini commented 1 year ago

Thank you. I could load my trained model and run the attribution.ipynb file. How do you calculate "NEC" in "TCGA_plot(TCGA,'NEC',0.066076159595081)" for the verification.ipynb file? I don't see the code for that. Also, how do you do the multivariate analysis? Do you have any reference code for that as well? Thank you again.

LiangJunhao-THU commented 1 year ago

NEC = necrosis area / tissue area. You can write code for this based on the segmentation map of the WSI: NEC = number of pixels in the NEC class / (total number of input pixels - number of background pixels)
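A minimal sketch of that formula, assuming the segmentation map is an integer label array. The function name and the label ids (0 = background, 3 = necrosis) are hypothetical and must be replaced with the ids used by your segmentation model:

```python
import numpy as np

def necrosis_ratio(seg_map, nec_label, background_label):
    """NEC = necrosis pixels / (all pixels - background pixels)."""
    seg_map = np.asarray(seg_map)
    tissue_pixels = seg_map.size - np.count_nonzero(seg_map == background_label)
    if tissue_pixels == 0:
        return float("nan")  # no tissue in this map
    return np.count_nonzero(seg_map == nec_label) / tissue_pixels

# Toy 3 x 4 label map; the label ids (0 = background, 3 = necrosis) are hypothetical.
seg = np.array([[0, 0, 1, 1],
                [0, 3, 3, 1],
                [0, 2, 3, 1]])
nec = necrosis_ratio(seg, nec_label=3, background_label=0)
print(nec)  # 0.375
```

Here 3 necrosis pixels out of 8 tissue pixels (12 total minus 4 background) give 0.375.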

As for the multivariate analysis, you can directly use CoxPHFitter (from lifelines import CoxPHFitter) to get the analysis results: https://lifelines.readthedocs.io/en/latest/Survival%20Regression.html

Biooptics2021 / PathFinder

Example clinical table #6