Open jackhu-bme opened 1 month ago
It looks like accessionNo is the index number of patients. I am currently trying to work around the bug by converting the volume name to accessionNo.
I referenced your code
/home/***/baselines/CT-CLIP/scripts/data.py, line77
accession_number = nii_file.split("/")[-1]
It seems reasonable to recover the missing accessionNo from the nii file name.
If this does not work or is wrong, please tell me. Either way, this looks like a bug that should be fixed, since it will cause trouble for others using this dataset and repo.
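A minimal sketch of the workaround I have in mind, not the repo's actual code. It assumes the CSV has a column holding the volume file name; the column name `VolumeName` is my assumption, and the sample rows and the example path are made up for illustration.

```python
import csv
import io


def load_accession_text(csv_source, volume_col="VolumeName"):
    """Map each report's key to its (findings, impressions) pair.

    Uses 'AccessionNo' when that column exists; otherwise falls back to the
    file name stored in `volume_col` (an assumed column name).
    """
    accession_to_text = {}
    for row in csv.DictReader(csv_source):
        # Fall back to the last path component of the volume name.
        key = row.get("AccessionNo") or row[volume_col].split("/")[-1]
        accession_to_text[key] = (row["Findings_EN"], row["Impressions_EN"])
    return accession_to_text


# Tiny in-memory CSV with made-up values, standing in for train_reports.csv.
sample_csv = io.StringIO(
    "VolumeName,Findings_EN,Impressions_EN\n"
    "train_1_a_1.nii.gz,No nodules.,Normal study.\n"
)
mapping = load_accession_text(sample_csv)

# Same derivation as data.py line 77: the key is the last path component.
nii_file = "/data/train/train_1/train_1_a/train_1_a_1.nii.gz"
accession_number = nii_file.split("/")[-1]
print(mapping[accession_number])
```

With a real file, `csv_source` would be the opened CSV, for example `open(path, newline="")`. This only helps if the keys derived from the CSV match the file names the dataset class looks up, so any extension change during pre-processing would need the same conversion on both sides.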
Currently I am trying to reproduce your train-from-scratch CLIP results.
I have pre-processed the volume data downloaded from the Hugging Face repo (using the provided data processing scripts) and downloaded all the CSV files. However, when I run scripts/run_train.py, I get an error about a missing value:
  File "/home/***/baselines/CT-CLIP/scripts/CTCLIPTrainer.py", line 188, in __init__
    self.ds = CTReportDataset(data_folder=data_train, csv_file=reports_file_train)
  File "/home/***/baselines/CT-CLIP/scripts/data.py", line 43, in __init__
    self.accession_to_text = self.load_accession_text(csv_file)
  File "/home/***/baselines/CT-CLIP/scripts/data.py", line 66, in load_accession_text
    accession_to_text[row['AccessionNo']] = row["Findings_EN"],row['Impressions_EN']
(*** stands in for my username in the paths above.)
I have double checked the files provided in the huggingface repo, including
So my guess is that train_reports.csv, downloaded from the Hugging Face repo at "https://huggingface.co/datasets/ibrahimhamamci/CT-RATE/tree/main/dataset/radiology_text_reports", is missing the "AccessionNo" column.
Is this guess true? Or did I miss anything?
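One way to check this guess is to read only the header row of the CSV and compare it with the columns that data.py accesses. This is a small sketch using the standard library; the in-memory header below is made up and stands in for the real train_reports.csv.

```python
import csv
import io

# Columns that load_accession_text reads, per the traceback above.
REQUIRED_COLUMNS = ("AccessionNo", "Findings_EN", "Impressions_EN")


def missing_columns(csv_source, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(csv_source), [])
    return [name for name in required if name not in header]


# In-memory stand-in for train_reports.csv; this header is illustrative only.
sample_csv = io.StringIO("VolumeName,Findings_EN,Impressions_EN\n")
print(missing_columns(sample_csv))
```

For the real file, pass `open(path, newline="")` instead of the `io.StringIO` object; an empty result would mean all three columns are present.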
Thanks a lot for your time, and really thankful for your open source code and datasets.