Biooptics2021 / PathFinder

GNU General Public License v3.0

Example clinical table #6

Closed (Rukhmini closed 11 months ago)

Rukhmini commented 1 year ago

Hello, thanks a lot for all your responses. I have one question about the clinical table that you provided as an example. In the table "TCGA.csv", the vital status is similar to OS status, right? Generally, 1 means dead and 0 means alive, but in this table is the vital status the opposite, meaning 1 means alive and 0 means dead? Thanks again.

LiangJunhao-THU commented 1 year ago

Yes, thank you for your question. In the "TCGA.csv" file, 0 means dead and 1 means alive. Therefore, we changed the dataloader to match the regular OS status convention: O = (~self.data['vital_status'][pd_index].astype(bool)).astype(int) We use ~ to invert the values so that the OS status follows the regular convention (1 = dead, 0 = alive).
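To illustrate the inversion on its own, here is a minimal sketch that assumes a pandas DataFrame with a vital_status column coded as in "TCGA.csv" (1 = alive, 0 = dead). The toy values and the name os_event are made up for illustration; this is not the repository's dataloader.

```python
import pandas as pd

# Toy clinical table using the TCGA.csv coding: 1 = alive, 0 = dead.
clinical = pd.DataFrame({"vital_status": [1, 0, 1, 0]})

# Cast to bool, apply logical NOT with ~, then cast back to int.
# The result follows the usual OS convention: 1 = dead (event), 0 = alive (censored).
os_event = (~clinical["vital_status"].astype(bool)).astype(int)

print(os_event.tolist())
```

The cast to bool matters: applying ~ directly to an integer column would perform a bitwise NOT (1 becomes -2) rather than a logical one.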

You can also find more comprehensive survival information here: https://github.com/KatherLab/cancer-metadata/blob/main/tcga/outcome.xlsx This xlsx file includes more detailed information. If you use it, please remember to modify the dataloader accordingly :-)

Rukhmini commented 1 year ago

Thank you. Another question: when the heatmap is cropped into a square, do you also consider the zero (padded) values outside the heatmap while training the MacroNet?

LiangJunhao-THU commented 1 year ago

The square-cut process aims to make the heatmap square for training while keeping the tissue area as large as possible within it. The code first finds the empty area in the heatmap to locate the tissue border, which gives us the circumscribed rectangle of the tissue. Then we pad the rectangle with zero values into a square whose side equals the rectangle's longest side. The resulting square heatmap is used as MacroNet training data.

https://github.com/Biooptics2021/PathFinder/blob/43da3deaabadd00f0338ef2de6d56dc4855da100/Prognosis/Data_prepare/cut_heatmap.py#L8
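The steps described above can be sketched roughly as follows. This is an illustrative NumPy version, not the code in cut_heatmap.py: it assumes a 2D heatmap in which empty area is exactly 0, and it centers the tissue inside the padded square, which is a choice made here for the example. The function name square_cut is hypothetical.

```python
import numpy as np

def square_cut(heatmap, background=0):
    """Crop a 2D heatmap to the tissue bounding box, then zero-pad it to a square."""
    mask = heatmap != background
    if not mask.any():
        raise ValueError("heatmap contains no tissue")

    # Rows and columns that contain at least one tissue pixel.
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    crop = heatmap[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Pad with zeros to a square whose side is the longest side of the crop.
    h, w = crop.shape
    side = max(h, w)
    square = np.zeros((side, side), dtype=heatmap.dtype)
    top, left = (side - h) // 2, (side - w) // 2  # centering is an assumption
    square[top:top + h, left:left + w] = crop
    return square

# Toy heatmap: a 2 x 5 tissue block inside a 6 x 8 empty canvas.
toy = np.zeros((6, 8))
toy[2:4, 1:6] = 1.0
result = square_cut(toy)
print(result.shape)
```

In this sketch the padded zeros become part of the square input, so a network trained on the output sees them alongside the tissue values.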

Rukhmini commented 1 year ago

Thank you. I could successfully run the training of the MacroNet using the probability maps, but I wanted to know how you save the trained model in the "train_test.py" file. As I am using the code for another application, I want to save the model and do further analysis. Also, how do you use the saved model for the KM plot? I see that you have provided a pre-trained model, macronet.pth, which is used for your application; I want to save the same kind of model for my application. Also, in the Verification.ipynb file, how do you get the p-value in "TCGA_plot(TCGA,'TIN10',0.018925978874988)" and what does "TIN10" mean? TIA.

LiangJunhao-THU commented 1 year ago

Congratulations :-)


In train_test.py, the trained model is saved in a .pkl file along with other information: https://github.com/Biooptics2021/PathFinder/blob/43da3deaabadd00f0338ef2de6d56dc4855da100/Prognosis/train_test.py#L122 You can load the .pkl file with pickle.load (note that pickle.load expects an open file object, e.g. pickle.load(open(saved_file_path, 'rb')), rather than a path). The result is a dictionary, and the key of the trained MacroNet is 'model_state_dict'. The corresponding value is the content of the .pth file you wanted, which can be used for further analysis. Alternatively, you can change the saving code in train_test.py to save the .pth file only. Then you can use trained_model.pth directly for analysis, as in Discovery/attribution.ipynb.
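A minimal sketch of the loading step, assuming the .pkl file is a plain pickle of a dictionary with a 'model_state_dict' key as described above. The helper name load_state_dict and the stand-in checkpoint written to a temporary file are hypothetical, added only so the example runs without the real training output.

```python
import os
import pickle
import tempfile

def load_state_dict(pkl_path, key="model_state_dict"):
    """Open the .pkl file in binary mode and return the entry stored under `key`."""
    with open(pkl_path, "rb") as f:
        checkpoint = pickle.load(f)  # pickle.load needs a file object, not a path
    return checkpoint[key]

# Stand-in checkpoint (hypothetical contents) so the sketch is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_fold.pkl")
with open(demo_path, "wb") as f:
    pickle.dump({"model_state_dict": {"fc.weight": [0.1, 0.2]}, "fold": 0}, f)

state_dict = load_state_dict(demo_path)
print(sorted(state_dict))

# With PyTorch installed, the extracted weights could then be written out with
# torch.save(state_dict, "trained_model.pth").
```

If the real file was written with torch.save rather than pickle.dump, torch.load would be the matching loader; check the linked line in train_test.py to see which applies.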


The p-value is calculated as results = logrank_test(T[ix], T[~ix], E[ix], E[~ix], alpha=.99).p_value, where ix and ~ix select the two risk groups defined by the threshold you give.


TIN10 is just TND in the paper; the 10 means that the cell_size used for the TND calculation is 10: TND = Co_loc(prob_map = file_path , cell_size = 10) Here I selected 10 as the cell_size. For more discussion about cell_size, you can read the paper linked at https://github.com/TissueImageAnalytics/TILAb-Score

Rukhmini commented 1 year ago

Thank you again. How did you choose the best model to save as the .pth file, given that you performed 10-fold cross-validation? Also, when I try to load your pretrained model, it shows the error "ModuleNotFoundError: No module named 'networks'". And when I try to run attribution.ipynb using my model, it shows the error "AssertionError: Target not provided when necessary, cannot take gradient with respect to multiple outputs." on this line: Attrmap = saliency.attribute(input_tensor, target=None,abs=False).

LiangJunhao-THU commented 1 year ago

Hope these suggestions are helpful.

Rukhmini commented 1 year ago

Thank you. I could load my trained model and run the attribution.ipynb file. How do you calculate "NEC" in "TCGA_plot(TCGA,'NEC',0.066076159595081)" for the Verification.ipynb file? I don't see the code for that. Also, how do you do the multivariate analysis? Do you have any reference code for that as well? Thank you again.

LiangJunhao-THU commented 1 year ago

NEC = necrosis area / tissue area. You can write code based on the segmentation map of the WSI: NEC = number of necrosis-class pixels / (number of all input pixels - number of background pixels)
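A minimal sketch of that ratio, assuming the segmentation map is an integer label array. The label values (0 for background, 3 for necrosis) and the function name necrosis_ratio are placeholders; substitute the class indices your segmentation model actually uses.

```python
import numpy as np

def necrosis_ratio(seg_map, necrosis_label, background_label):
    """Return necrosis pixels divided by tissue (non-background) pixels."""
    seg_map = np.asarray(seg_map)
    tissue_pixels = seg_map.size - np.count_nonzero(seg_map == background_label)
    if tissue_pixels == 0:
        raise ValueError("segmentation map contains only background")
    necrosis_pixels = np.count_nonzero(seg_map == necrosis_label)
    return necrosis_pixels / tissue_pixels

# Toy 2 x 4 map with hypothetical labels: 0 = background, 3 = necrosis.
toy_map = np.array([[0, 0, 1, 2],
                    [0, 3, 3, 2]])
nec = necrosis_ratio(toy_map, necrosis_label=3, background_label=0)
print(nec)  # 2 necrosis pixels / 5 tissue pixels = 0.4
```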

As for multivariate analysis, you can directly use from lifelines import CoxPHFitter to get the analysis results: https://lifelines.readthedocs.io/en/latest/Survival%20Regression.html