Luffy03 / VoCo

[CVPR 2024] VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis
Apache License 2.0

What changes should be made to test only one data set: BTCV? #9

Closed mzccccccc closed 1 month ago

mzccccccc commented 4 months ago

Hi, thank you for this great work! I have a few questions about reproducing the results: if I only want to use the BTCV data set, what should I change, and where in the code should I set the data set path to the location where I downloaded it?

Luffy03 commented 4 months ago

Hi, many thanks for your attention to our work, and sorry for my late reply. If you want to use a new data set split, you should write a new JSON file like https://github.com/Luffy03/VoCo/blob/main/Finetune/BTCV/dataset/dataset_0.json. If you want to change the data path, change the "data_dir" argument at https://github.com/Luffy03/VoCo/blob/94ed426bec328b7b9b5ddcf25b43fc14f27672ab/Finetune/BTCV/main.py#L52
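
For reference, here is a minimal sketch of what such a datalist JSON might contain, assuming the decathlon-style format that MONAI's `load_decathlon_datalist` expects; the file names and split below are illustrative, not the actual BTCV split:

```python
import json

# Hypothetical BTCV-style datalist; image/label paths are relative to the data_dir argument.
datalist = {
    "training": [
        {"image": "imagesTr/img0001.nii.gz", "label": "labelsTr/label0001.nii.gz"},
        {"image": "imagesTr/img0002.nii.gz", "label": "labelsTr/label0002.nii.gz"},
    ],
    "validation": [
        {"image": "imagesTr/img0035.nii.gz", "label": "labelsTr/label0035.nii.gz"},
    ],
}

with open("dataset_0.json", "w") as f:
    json.dump(datalist, f, indent=2)
```

You would then point "data_dir" at the folder containing imagesTr/ and labelsTr/, and the JSON-list argument in main.py at this file.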

mzccccccc commented 4 months ago

Dear author, thanks for your reply, I still have some questions:

  1. If I have downloaded the pre-trained model VoCo_10k.pt, do I still need to run voco_train.py?
  2. I found the "Load Pre-trained weight" code in the README. I should run that code first with the VoCo_10k.pt path added, and then run the code in Finetune, right?
Luffy03 commented 4 months ago

  1. Yes. If you have the pre-trained checkpoint, you don't need to run voco_train.py again.
  2. Yes, your understanding is correct. We will upload more advanced checkpoints to https://www.dropbox.com/home/VoCo soon.
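
For context, loading a pre-trained checkpoint into the fine-tuning model usually follows the pattern below. This is only a sketch under the assumption that VoCo_10k.pt stores a plain state_dict and that a Swin UNETR-style backbone with 14 BTCV output classes is used; check the README's "Load Pre-trained weight" snippet for the exact keys and argument names in this repo:

```python
import torch
from monai.networks.nets import SwinUNETR  # backbone assumed here for illustration

model = SwinUNETR(img_size=(96, 96, 96), in_channels=1, out_channels=14, feature_size=48)

# Load the self-supervised checkpoint; strict=False tolerates missing decoder/head keys.
checkpoint = torch.load("VoCo_10k.pt", map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
```
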
mzccccccc commented 4 months ago

Thank you for your reply. Sorry to bother you again. I have a few more questions:

  1. In the "Load Pre-trained weight" code, what should --roi_x, --roi_y, and --roi_z be set to?
  2. Regarding "If you don't have enough storage, you can change it back in 'utils/data_utils.py'": how can I change it? Should I replace PersistentDataset with Dataset?

Luffy03 commented 4 months ago

  1. Commonly, we set roi_x, roi_y, roi_z to [96, 96, 96]. Depending on your GPU memory, you can set 64, 128, or more (the values need to be multiples of 32, and at least 64).
  2. Yes, you are right. 'CacheDataset' and 'SmartCacheDataset' can also be faster, but 'PersistentDataset' is the best choice if you have enough storage. You can refer to https://docs.monai.io/en/stable/data.html#dataset for details.
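
To illustrate the trade-off, here is a minimal sketch of the swap in utils/data_utils.py, assuming the standard MONAI dataset classes; `datalist` and `train_transform` stand in for the lists and transforms already built in that file, and note that plain Dataset takes neither cache_dir nor pickle_protocol:

```python
from monai.data import Dataset, CacheDataset, PersistentDataset

# Caches pre-processed samples on disk; fastest after the first epoch, but needs storage.
train_ds = PersistentDataset(data=datalist, transform=train_transform, cache_dir="./cache")

# In-memory cache; fast, but limited by RAM.
# train_ds = CacheDataset(data=datalist, transform=train_transform, cache_rate=1.0)

# No caching at all; smallest footprint, transforms re-run every epoch.
# train_ds = Dataset(data=datalist, transform=train_transform)
```
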
mzccccccc commented 4 months ago

Thank you very much for your reply. Sorry to bother you again. When I run val.py in BTCV, it first shows no attribute 'cache_dir'; after I add a directory for it and run it again, it shows: KeyError: 'image_meta_dict'. It seems the dictionary does not contain an 'image_meta_dict' entry. What should I do?

Luffy03 commented 4 months ago

Sorry, it is a bug that I forgot to fix: https://github.com/Luffy03/VoCo/blob/8ccc7a4bc545b5ddbfff26ac8076418aa5ae76c8/Finetune/BTCV/utils/data_utils.py#L102 If we use caching, we should not use 'ToTensord', since it will transform the data from a dict to a tensor. Would you please delete 'ToTensord' from train_transform and val_transform and try again?
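
As an illustration of that change, a sketch of what the validation transforms could look like with the tensor conversion removed; the exact transform list and parameters in data_utils.py may differ, the ones below are just standard MONAI dictionary transforms:

```python
from monai import transforms

val_transform = transforms.Compose(
    [
        transforms.LoadImaged(keys=["image", "label"]),
        transforms.EnsureChannelFirstd(keys=["image", "label"]),
        transforms.ScaleIntensityRanged(
            keys=["image"], a_min=-175, a_max=250, b_min=0.0, b_max=1.0, clip=True
        ),
        transforms.CropForegroundd(keys=["image", "label"], source_key="image"),
        # transforms.ToTensord(keys=["image", "label"]),  # removed, as suggested above
    ]
)
```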

mzccccccc commented 4 months ago

It still shows: KeyError: 'image_meta_dict'. Are there any special requirements for the 'cache_dir' path setting? I set a random path for it.

Luffy03 commented 4 months ago

'cache_dir' is a local path where the processed data is stored. After revising https://github.com/Luffy03/VoCo/blob/8ccc7a4bc545b5ddbfff26ac8076418aa5ae76c8/Finetune/BTCV/utils/data_utils.py#L102, you need to delete the data in this path and re-cache it. You can change 'PersistentDataset' to 'Dataset' if you don't want to cache.

mzccccccc commented 4 months ago

I deleted the data in the 'cache_dir' path and re-ran val.py, and it still shows: KeyError: 'image_meta_dict'. Am I missing any steps? I trained with VoCo_10k.pt via main.py in BTCV, and ran the generated model.pt with val.py. PersistentDataset was not changed.

Luffy03 commented 4 months ago

Are there any problems when using 'Dataset'? Would you please share your 'data_utils.py'?

mzccccccc commented 4 months ago

data_utils.zip val.zip — I downloaded the BTCV dataset from Hugging Face. No problems occur when running trainer.py. [screenshot: vovo1]

mzccccccc commented 4 months ago

Thanks a lot. Sorry, I can only upload zip files.

Luffy03 commented 4 months ago

Seems no problem with the code. What's the version of your MONAI? Have you installed the packages according to https://github.com/Luffy03/VoCo/blob/main/requirements.txt? Would you please try to use 'Dataset' instead of 'PersistentDataset' and see whether there is still a problem?

mzccccccc commented 4 months ago

I printed batch_data in val.py, and the data in the MetaTensor is all 0, which seems wrong? Maybe there is a mistake in my training? [screenshots: voco2, voco3]

My MONAI version is 1.3.1. When I replace 'PersistentDataset', I get: TypeError: __init__() got an unexpected keyword argument 'pickle_protocol'

Luffy03 commented 4 months ago

Please refer to the usage of 'Dataset' first; I have provided the link https://docs.monai.io/en/stable/data.html#dataset. https://github.com/Luffy03/VoCo/blob/8ccc7a4bc545b5ddbfff26ac8076418aa5ae76c8/Finetune/BTCV/utils/data_utils.py#L146 (please delete the pickle_protocol and cache_dir arguments).

The MetaTensor can be zero, since we apply scale-intensity: https://github.com/Luffy03/VoCo/blob/8ccc7a4bc545b5ddbfff26ac8076418aa5ae76c8/Finetune/BTCV/utils/data_utils.py#L80. Values less than a_min (like background) become zero.

mzccccccc commented 4 months ago

Thanks a lot, and thank you for your patient responses.

Luffy03 commented 4 months ago

No problem, feel free to raise any issues, and thanks a lot for pointing out our bugs.

mzccccccc commented 4 months ago

I've replaced 'PersistentDataset' with 'Dataset', and it still shows: KeyError: 'image_meta_dict'. But when I do the following:

for idx, batch_data in enumerate(val_loader):
    print(f"batch_data keys: {batch_data.keys()}")

the output is: batch_data keys: dict_keys(['image', 'label', 'foreground_start_coord', 'foreground_end_coord']). There is no 'image_meta_dict' in the result. That's not right, is it?

Luffy03 commented 4 months ago

It seems to be due to the MONAI version; 'val.py' was written for an older version of MONAI. I will look into it and fix the bug. The cause is https://github.com/Luffy03/VoCo/blob/c0b663de47250201daaf08fd3e21c911a69e41d6/Finetune/BTCV/val.py#L124 You can delete this line, since we don't need img_name for validation. I believe it will work.
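
For reference, in recent MONAI versions the per-sample metadata lives on the MetaTensor itself rather than in a separate 'image_meta_dict' entry, so if the file name is still wanted it can usually be recovered as below; this is a sketch assuming the default LoadImaged behaviour:

```python
import os

for batch_data in val_loader:
    image = batch_data["image"]  # a MetaTensor in MONAI >= 0.9
    # filename_or_obj is stored per item in the batch metadata
    img_name = os.path.basename(image.meta["filename_or_obj"][0])
    print(img_name)
```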

mzccccccc commented 4 months ago

Thanks a lot, it works. I have one more question, about segmentation image visualization. I use Matplotlib for visualization, like this: [screenshot: 112]

My background is blank, unlike the images shown in your paper, where the background is the real input image: [screenshot: 113]

Do you have any suggestions for this?

Luffy03 commented 4 months ago

I merge the image with the label using the Image package, as follows: [screenshot] Maybe it is not the best choice.
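
As an alternative, a simple overlay can also be done directly in Matplotlib by drawing the CT slice in grayscale and the segmentation on top with transparency. A minimal sketch, where ct_slice and seg_slice are assumed to be 2D NumPy arrays taken from the volume and the prediction:

```python
import numpy as np
import matplotlib.pyplot as plt

def show_overlay(ct_slice: np.ndarray, seg_slice: np.ndarray) -> None:
    """Draw the CT slice in grayscale with the labels on top, keeping the background visible."""
    masked_seg = np.ma.masked_where(seg_slice == 0, seg_slice)  # hide background label 0
    plt.imshow(ct_slice, cmap="gray")
    plt.imshow(masked_seg, cmap="jet", alpha=0.5, interpolation="none")
    plt.axis("off")
    plt.show()
```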

mzccccccc commented 4 months ago

Thank you very much.

mzccccccc commented 4 months ago

Dear author, I would like to seek your guidance on this: when designing different downstream tasks in Finetune, such as BTCV and MM-WHS, both of which are segmentation tasks, what is the design difference? They seem to use the same segmentation model and loss function.

Luffy03 commented 4 months ago

The pre-processing (scale intensity, spacing) can be different. I split them into different projects for flexibility.

mzccccccc commented 4 months ago

Do you have any experience or suggestions on how to set up the pre-processing for datasets of different body parts?

Luffy03 commented 4 months ago

Good question! We are still working on it. It is really difficult to find the best setting, and it is important for the performance. Maybe nnU-Net is a better solution due to its data-fingerprint setting. Currently we set the scale-intensity range to [-175, 250] for abdomen and head & neck, [0, 1700] for heart, and [-500, 1000] for chest. Spacing and size should depend on the specific dataset.
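
As an illustration, these ranges map directly onto MONAI's ScaleIntensityRanged. A sketch of how the per-region windows mentioned above could be plugged in; the dictionary of ranges is only a restatement of the values given here:

```python
from monai.transforms import ScaleIntensityRanged

# Intensity windows quoted above, per body region.
INTENSITY_RANGES = {
    "abdomen": (-175, 250),
    "head_neck": (-175, 250),
    "heart": (0, 1700),
    "chest": (-500, 1000),
}

def make_intensity_transform(region: str) -> ScaleIntensityRanged:
    a_min, a_max = INTENSITY_RANGES[region]
    return ScaleIntensityRanged(
        keys=["image"], a_min=a_min, a_max=a_max, b_min=0.0, b_max=1.0, clip=True
    )
```
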

mzccccccc commented 4 months ago

Thanks a lot. That would be very helpful!

Luffy03 commented 1 week ago

Dear researchers, our work is now available at Large-Scale-Medical, if you are still interested in this topic. Thank you very much for your attention to our work, it does encourage me a lot!