len_img - Githubissues

Rokinluohhh commented 6 months ago

Can I ask the author whether the len_img column in dataset_csv comes from script/merge_img_wsi.py, after merging every 200 patches? And what is the train_5x_list.pickle file? A reply would be greatly appreciated!

jingweizhang-xyz commented 6 months ago

Hi, len_img is the number of merged images for a WSI, where each merged image holds up to 200 patches. E.g., if a WSI contains 900 patches, we first crop it into 900 patches and then merge every 200 patches into a single image. This results in 5 images (four with 200 patches and one with the remaining 100), so len_img = 5 in this case.
"train_5x_list.pickle" stores the patches of the training dataset. In our previous projects we used pickle files to distinguish between the train, valid and test splits, but for this project we use the provided csv files for simplicity. The train/valid/test separation can be found in the csv files, so you do not need the pickle files.
The merging process was purely for data loading efficiency and is not actually necessary. You can follow the naming examples provided in the readme and set len_img = 900 if you want, so that each patch is stored as its own image.
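
As a rough illustration of the arithmetic above, here is a minimal sketch. It assumes len_img is the patch count divided by the merge size, rounded up, which matches the 900-patch example; the helper name `count_merged_images` is hypothetical and not part of the repository.

```python
import math

# Merge size mentioned in this thread: every 200 patches become one image.
PATCHES_PER_IMAGE = 200


def count_merged_images(num_patches: int, patches_per_image: int = PATCHES_PER_IMAGE) -> int:
    """Hypothetical helper: number of merged images needed for one WSI.

    Assumes a final, partially filled image is still counted (round up).
    """
    if num_patches < 0:
        raise ValueError("num_patches must be non-negative")
    if patches_per_image <= 0:
        raise ValueError("patches_per_image must be positive")
    return math.ceil(num_patches / patches_per_image)


# 900 patches merged 200 at a time: four full images plus one with 100 patches.
print(count_merged_images(900))  # 5

# Without merging (one patch per image), len_img equals the patch count.
print(count_merged_images(900, patches_per_image=1))  # 900
```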

Rokinluohhh commented 6 months ago

Thanks for the reply! So I don't need to run script/merge_img_wsi.py, just start with extract_fit.py?

jingweizhang-xyz commented 6 months ago

Yes, you don't need to run “merge_img_wsi.py”. If you find that training is very slow while GPU utilization is far below 100%, reading patches is likely the bottleneck, and you may want to come back and use or rewrite this script. You should start with "main.py". "extract_fit.py" is only used for comparison with some baselines and is not part of our method. You can check the examples in the Training part of the readme.
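
One rough way to check whether reading patches is the bottleneck is to time how long the loop waits for each batch versus how long the training step takes. The sketch below is a generic illustration, not code from the repository: `data_wait_fraction`, `slow_loader` and `fast_step` are hypothetical names, and the loader and step are simulated with `time.sleep`.

```python
import time


def data_wait_fraction(batches, train_step):
    """Return the fraction of wall-clock time spent waiting for batches.

    `batches` is any iterable of batches and `train_step` is a callable
    that processes one batch. Both are placeholders for your own code.
    """
    wait_time = 0.0
    step_time = 0.0
    iterator = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent here is data loading
        except StopIteration:
            break
        t1 = time.perf_counter()
        train_step(batch)  # time spent here is compute
        t2 = time.perf_counter()
        wait_time += t1 - t0
        step_time += t2 - t1
    total = wait_time + step_time
    return wait_time / total if total > 0 else 0.0


def slow_loader(num_batches=5, delay=0.02):
    """Simulated loader that takes `delay` seconds to produce each batch."""
    for i in range(num_batches):
        time.sleep(delay)
        yield i


def fast_step(batch):
    """Simulated training step that is much faster than the loader."""
    time.sleep(0.001)


# A value close to 1.0 suggests the loop is mostly waiting on data.
print(f"fraction of time waiting on data: {data_wait_fraction(slow_loader(), fast_step):.2f}")
```

If the fraction is high, merging patches as in merge_img_wsi.py (or otherwise speeding up loading) may help. Note that on a GPU the step may return before the computation has finished, so the step should synchronize (for example with `torch.cuda.synchronize()` in PyTorch) for the timing to be meaningful.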

Rokinluohhh commented 6 months ago

OK, thanks very much!

cvlab-stonybrook / PromptMIL

len_img #2