HKU-MedAI / WSI-HGNN

[CVPR'23] Histopathology Whole Slide Image Analysis with Heterogeneous Graph Representation Learning

some problems of get_graph.py #12

Open hbuwls opened 2 months ago

hbuwls commented 2 months ago

Hello, I am a graduate student. After reading your paper, I was impressed by the quality of your article and code, and I would like to learn more about graph construction. However, I have run into some problems at the get_graph.py step. Are the ./data/biomedical_data/normal_list.txt, data/biomedical_data/normal_list_BRCA.txt, and data/clinical_data/staging.txt files used in get_graph.py lists of the sample names in the dataset? Also, what path does hovernet_data_root in configs/GraphConstruction/BRCA_HovernetKimia_graph_constructor.yml refer to? I am sorry for asking so many questions, but I am very eager to learn from your guidance. I hope you can spare some of your time to answer them. Thank you very much.

howardchanth commented 2 months ago

Hi, thanks for your appreciation of our work. The list files contain the labels we refer to when constructing the graph, so that we can perform five-fold CV with balanced classes. The hovernet root refers to the pre-trained HoverNet weights, which you can download from the original HoverNet repo.
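For readers following along, here is a minimal sketch of what "five-fold CV with balanced classes" can look like. It is not the repository's code: the function name balanced_folds, the dict input format, and the slide IDs are all assumptions made for illustration. It only shows the idea of dealing each class evenly across the folds.

```python
import random
from collections import defaultdict

def balanced_folds(labels, n_folds=5, seed=0):
    """Split slide IDs into n_folds folds with roughly equal class proportions.

    labels: dict mapping slide ID -> class label (hypothetical input format).
    Returns a list of n_folds lists of slide IDs.
    """
    by_class = defaultdict(list)
    for slide_id, label in labels.items():
        by_class[label].append(slide_id)

    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        ids = sorted(by_class[label])
        rng.shuffle(ids)
        # Deal the shuffled IDs of each class round-robin across the folds.
        for i, slide_id in enumerate(ids):
            folds[i % n_folds].append(slide_id)
    return folds

# Toy labels: 10 "tumor" and 5 "normal" slides (made-up IDs, not TCGA cases).
labels = {f"slide_{i:02d}": ("tumor" if i < 10 else "normal") for i in range(15)}
folds = balanced_folds(labels)
for k, fold in enumerate(folds):
    print(k, sorted(fold))
```

With this toy input every fold receives two tumor slides and one normal slide. Holding out one fold for testing and training on the rest gives one round of cross-validation.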

hbuwls commented 2 months ago

Thank you very much for your guidance. I will try my best. Thank you for taking the time to reply to me.

hbuwls commented 2 months ago

Thank you very much for your earlier help. I am sorry, but I still have some questions. I only know a little about the TCGA dataset, so I would like to ask: if I want to train the cancer classification task on the BRCA dataset, what should the root directory layout of the TCGA dataset look like? Because it is a classification task, the BRCA_trainval function is called, so I assume each line of normal_list_BRCA.txt has the format TCGA-C8-A1H1,normal. I do not know whether this is correct. In addition, what file does type_info_path in hovernet_config in BRCA_HovernetKimianet_graph_constructor.yml point to? I understand it to be a JSON file, but I do not know its specific contents. Finally, regarding your earlier answer: does hovernet_data_root in graph_constructor refer to the path of the HoverNet weights? I ask because the hovernet_config section below it also contains HoverNet settings. Can I understand it as the root folder for training HoverNet? I am very sorry to bother you again. I love and appreciate your work and look forward to your reply. I wish you a happy life and smooth work!

howardchanth commented 2 months ago

Hi, sorry for the late reply. The files in ./data give examples of the normal_list. Specifically, each line is in the format TCGA-C8-A1H1, {label_info}. You can set it to TCGA-C8-A1H1,normal, but you then need to handle this format when loading the data in data.py. Please refer to #10 and #9 for more information. It seems that type_info.json is deprecated and no longer in use. Please ignore it for now and let me know if there is any further problem. Yes, hovernet_data_root: "./data/hovernet_json" points to the directory storing the weights of HoverNet. Thanks again for using our work, and please let us know if there is any further issue.
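As a rough illustration of the list format described above, the sketch below parses lines of the form case ID, comma, label into a dictionary. The function name parse_label_list is hypothetical and this is not the loader in data.py. The second case ID is made up to show that a space after the comma is tolerated.

```python
def parse_label_list(lines):
    """Parse lines of the form `<case_id>,<label_info>` into a dict."""
    labels = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first comma only, so label_info may itself contain commas.
        case_id, label = line.split(",", 1)
        labels[case_id.strip()] = label.strip()
    return labels

example = [
    "TCGA-C8-A1H1,normal\n",
    "\n",
    "TCGA-XX-0000, tumor\n",  # made-up ID, with a space after the comma
]
parsed = parse_label_list(example)
print(parsed)
```

If the label text differs from what the data-loading code expects, the mapping from label strings to class indices would need to be adjusted there as well.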

hbuwls commented 1 month ago

Hello, I am sorry to ask more questions. I would like to learn more about the differences between GraphConstruction, configs, and BRCA, COAD, and ESCA. In addition, I would like to know the difference between HovernetEfficient and HovernetKimia in GraphConstruction. As mentioned in the paper, HovernetKimia refers to the two networks HoverNet and KimiaNet. Is HovernetEfficient then just a kind of HoverNet network? I am very sorry to bother you again, and I am very grateful for your earlier help. I hope your team gets better and better.

howardchanth commented 1 month ago

Hi. HovernetEfficient is another of our experiments, using EfficientNet-B4. Since the pretrained EfficientNet was trained on natural images, we found that its performance (on encoding patch features) is not as good as KimiaNet's. Sorry, I don't understand your first question. Could you make it a bit clearer which difference you want to learn about? Thanks.

hbuwls commented 3 weeks ago

First of all, thank you very much for your reply. I would also like to apologize for my unclear wording, which wasted your time. The question I wanted to ask last time was really simple: could you walk me through the feature_dim, radius, n_channel, and verbose parameters in the configs\GraphConstruction\BRCA_HovernetKimia_graph_constructor.yml file? I am very sorry that I did not express myself clearly last time, and thank you for your continued advice.

howardchanth commented 3 weeks ago

feature_dim is the dimension of each patch feature: the feature encoder compresses each patch into a vector of this dimension, and these vectors serve as the graph features. radius is the number of neighbours for KNN, which is called in the following code.

[image: screenshot of the code that calls KNN]

n_channel is the number of channels, which is 3 for colour images and 1 for black-and-white images. verbose is a boolean that determines whether debugging messages are printed.
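To make the role of radius concrete, here is a small sketch of building KNN edges over patch positions. It is an illustration only and not the repository's graph constructor: the function knn_edges and the coordinates are made up, and a real pipeline would also attach a feature vector of length feature_dim to each node.

```python
import math

def knn_edges(coords, k):
    """Return directed edges (i, j) linking each node i to its k nearest neighbours."""
    edges = []
    for i, p in enumerate(coords):
        # Distances from node i to every other node, closest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(coords) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges

# Four made-up patch centroids (x, y): three close together, one far away.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
radius = 2  # plays the role of `radius` in the config: neighbours per node
edges = knn_edges(coords, radius)
print(edges)
```

Each node gets exactly radius outgoing edges, so the far-away patch is still connected to its two closest patches even though they are distant. This brute-force version compares every pair of nodes, which is fine for a toy example but slow for slides with many patches.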

hbuwls commented 1 week ago

Good evening, and thank you for your reply. Over the past few days I tried to replicate your whole experimental process with the Camelyon16 dataset, and I can now construct the graph successfully. After reading your paper and the project, I tried to run train.py, which raised a few more questions. First, I would like to ask about these entries in BRCA/HEAT4_kimia_classification_v2.yml:

train_path: "./data/BRCA_kimia_lv0/5fold_balanced/fold_4/train.txt"
eval_path: "./data/BRCA_kimia_lv0/5fold_balanced/fold_4/test.txt"
valid_path: "./data/BRCA_kimia_lv0/5fold_balanced/fold_4/val.txt"

These three paths are text files that store the training, validation, and test sets, but if I use Camelyon16 these files do not exist. In addition, the first time I successfully built the graph, the checkpoints/HEAT4_BRCA_Kimia_lv0_balanced_cls_f4 files were generated, but they were not generated during subsequent rebuilds. Finally, I would like to ask whether it is OK to reproduce the experiments on the Camelyon16 dataset. If not, what is the difference between Camelyon16 and other datasets such as TCGA-BRCA for the reproduction experiment? Thank you very much for your generous advice.