Closed guohahah closed 7 months ago
Hello @guohahah, thank you very much for your interest!
The `data_folder` variable should point to the directory containing the preprocessed images, not to the metadata CSV file. First, please download the validation volumes using the provided script (download_only_valid_data.py) available here: https://github.com/sezginerr/example_download_script.
To access the data from Hugging Face, you must agree to the terms and conditions (I believe you have already done this, since you downloaded the CSV files), obtain a personal token from your Hugging Face settings, and then set this token within the script to start the download. This will retrieve the validation dataset for you.
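For reference, the authenticated download the script performs amounts to sending your token as a bearer token with each file request. A minimal standard-library sketch, assuming a hypothetical repo id and file path (check both against the dataset page before use):

```python
import os
import urllib.request

# Personal access token from your Hugging Face settings; here read from the
# environment so it is not hard-coded.
token = os.environ.get("HF_TOKEN", "hf_your_token_here")

# Hypothetical file URL -- the dataset repo id and file layout are assumptions.
url = ("https://huggingface.co/datasets/ibrahimethemhamamci/CT-RATE/"
       "resolve/main/dataset_metadata_validation_metadata.csv")

# Gated datasets require the token in an Authorization header.
req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
# urllib.request.urlopen(req)  # uncomment to actually fetch the file
```

In practice the provided download script (or the `huggingface_hub` library) handles this for you; the sketch only shows where the token goes.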
Subsequently, execute the preprocessing script found here: https://github.com/ibrahimethemhamamci/CT-CLIP/tree/main/data_preprocess. The usage instructions for these scripts are detailed in the provided link.
After preprocessing, update the `data_folder` variable to point to the directory (or a symbolic link) containing the preprocessed volumes.
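Before running the zero-shot script, it is worth checking that the folder actually contains preprocessed volumes — an empty or wrong path is what produces the `num_samples=0` error. A minimal sketch, assuming a hypothetical folder name and that the preprocessing step emits `.npz` files (check what the preprocessing script actually writes):

```python
from pathlib import Path

# Hypothetical location of the preprocessed validation volumes -- adjust it.
data_folder = Path("./preprocessed_valid")

# The zero-shot DataLoader needs image volumes, not the metadata CSVs; if this
# folder is missing or empty, the dataset has zero samples and the run fails.
volumes = sorted(data_folder.rglob("*.npz")) if data_folder.is_dir() else []
print(f"{len(volumes)} preprocessed volumes under {data_folder}")
```

If this prints zero, fix the path (or the symbolic link) before launching run_zero_shot.py.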
I hope these instructions are clear. Feel free to reach out if you have any further questions or require assistance.
Thank you very much for your detailed reply, I'll give it a try!
Hi @guohahah, I am closing the issue for now. You can reopen it if you have further questions.
Hello! I want to try your zero-shot model on my own data, but I ran into a problem when running run_zero_shot.py. With the pretrained .pt file set up, I set

```
data_folder = '/dataset_metadata_validation_metadata.csv'
reports_file = "dataset_radiology_text_reports_validation_reports.csv"
labels = "dataset_multi_abnormality_labels_valid_predicted_labels.csv"
```

where these three CSV files were downloaded from your Hugging Face dataset. However, the dataloader in zero_shot.py cannot read the data correctly and throws the error below:

```
$ CUDA_VISIBLE_DEVICES=2 python run_zero_shot.py
/.conda/envs/ct_clip/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/.conda/envs/ct_clip/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "run_zero_shot.py", line 43, in <module>
    inference = CTClipInference(
  File "/CT-CLIP-main/scripts/zero_shot.py", line 179, in __init__
    self.dl = DataLoader(
  File "/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 351, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
  File "/.local/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 107, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
```

How can I solve this problem? I'm not sure whether any of my settings are wrong, and if I want to use the model directly to diagnose new CT cases, is running run_zero_shot.py as I'm doing now all that is needed?