Closed: DeepLearningCoder closed this issue 2 years ago
Try specifying --split all.
I think the issue is that the eval script by default looks for the "test" split in the csv file you provide. For an external test set, where you need to evaluate on every case in the csv file, specifying --split all will use the whole dataset.
Now I'm getting an error saying it can't find image1.pt, but I don't have anything named image1.pt in the dataset csv. Here is my output:
Traceback (most recent call last):
File "eval.py", line 135, in <module>
model, patient_results, test_error, auc, df = eval(split_dataset, args, ckpt_paths[ckpt_idx])
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\utils\eval_utils.py", line 53, in eval
patient_results, test_error, auc, df, _ = summary(model, loader, args)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\utils\eval_utils.py", line 70, in summary
for batch_idx, (data, label) in enumerate(loader):
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\dataloader.py", line 652, in __next__
data = self._next_data()
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\dataloader.py", line 1347, in _next_data
return self._process_data(data)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\dataloader.py", line 1373, in _process_data
data.reraise()
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\_utils.py", line 461, in reraise
raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\_utils\worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\datasets\dataset_generic.py", line 340, in __getitem__
features = torch.load(full_path)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\serialization.py", line 699, in load
with _open_file_like(f, 'rb') as opened_file:
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\user\\PycharmProjects\\melanoma_pathology_clam\\slide images\\Features\\pt_files\\image1.pt'
But do you have something named image1? It's looking for the corresponding features for each image in your csv file (the .pt file).
I don't have any file named image1, and it isn't listed in any csv file I've used. I'm not sure why it's looking for an image1.
Are you sure? Is "dataset_csv/test.csv" perhaps not the right file then?
Yes, I've looked through the entire test.csv. I could email you my test.csv file if that would be helpful.
Sure, if you email it to me I can take a quick look.
Thanks! Sorry, I just noticed that I was working on a file called test.csv.csv. My new computer still has file extensions hidden, so the script was actually reading a different file named test.csv. Would you be able to help me with one more error? It says it can't find one of the images, but I'm sure it exists this time. I hit this same error when I tried to manually put this slide in the eval column of the split file while training with main.py.
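For anyone else hitting the hidden-extension trap: you can print the real, full filenames from Python instead of trusting Explorer. A minimal sketch; "dataset_csv" is a placeholder for wherever your csv files live.

```python
import os

# Windows Explorer hides known extensions by default, so "test.csv.csv"
# displays as "test.csv". Listing the directory from Python shows the
# real names on disk.
csv_dir = "dataset_csv"  # <- change to your csv folder
if os.path.isdir(csv_dir):
    for name in os.listdir(csv_dir):
        print(repr(name))  # repr() makes stray extensions/characters visible
```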
Traceback (most recent call last):
File "eval.py", line 135, in <module>
model, patient_results, test_error, auc, df = eval(split_dataset, args, ckpt_paths[ckpt_idx])
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\utils\eval_utils.py", line 53, in eval
patient_results, test_error, auc, df, _ = summary(model, loader, args)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\utils\eval_utils.py", line 70, in summary
for batch_idx, (data, label) in enumerate(loader):
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\dataloader.py", line 652, in __next__
data = self._next_data()
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\dataloader.py", line 1347, in _next_data
return self._process_data(data)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\dataloader.py", line 1373, in _process_data
data.reraise()
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\_utils.py", line 461, in reraise
raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\_utils\worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\datasets\dataset_generic.py", line 340, in __getitem__
features = torch.load(full_path)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\serialization.py", line 699, in load
with _open_file_like(f, 'rb') as opened_file:
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "C:\Users\user\PycharmProjects\melanoma_pathology_clam\melanoma_pathology_clam\lib\site-packages\torch\serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\user\\PycharmProjects\\melanoma_pathology_clam\\slide images\\Features\\pt_files\\YMTA3671_31779.pt'
I mean, it's basically saying "C:\Users\user\PycharmProjects\melanoma_pathology_clam\slide images\Features\pt_files\YMTA3671_31779.pt" does not exist. Can you confirm? I don't think there's much I can do to help if the errors are due to filename parsing/locating files; to me it seems the code is behaving as expected...
Otherwise I'm happy to look into things if you suspect any strange behavior in the code.
Thank you, I understand. I just confirmed that the file does exist. Do you think it could have anything to do with the slashes being read as double slashes? If not, maybe I'll just try reprocessing the images.
Yeah, could be. I'm not familiar with how file paths on Windows are interpreted. If it helps: https://github.com/mahmoodlab/CLAM/blob/1a92ef234411b44ec9ba27551307aea1143b5b4e/datasets/dataset_generic.py#L339 is where the file path is built and used to find the .pt file. You could try running that exact line outside the dataset class/eval script to see whether you can correctly locate the .pt file in question, and modify the code/filenames accordingly if necessary.
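A quick way to test that line in isolation; a sketch, where data_dir and slide_id are stand-ins for your actual values and the join mirrors how dataset_generic.py builds the path. Note that the doubled backslashes in the traceback are just Python's string escaping for a single backslash, not a real path problem.

```python
import os

# Stand-ins for your actual values.
data_dir = r"C:\Users\user\PycharmProjects\melanoma_pathology_clam\slide images\Features"
slide_id = "YMTA3671_31779"

# Mirrors the path construction used to locate the .pt feature file.
full_path = os.path.join(data_dir, "pt_files", "{}.pt".format(slide_id))
print(full_path)
# The "\\" shown in the FileNotFoundError is repr escaping: each pair is
# one real backslash on disk.
print(os.path.isfile(full_path))  # False means the OS genuinely can't see it
```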
I was able to train my models successfully with main.py, but now I'm trying to use eval.py on an external dataset. I'm getting this error:
UnboundLocalError: local variable 'data' referenced before assignment
Is there something I'm entering wrong in the terminal command or the eval.py script?
Here is what I put for eval.py lines 73-82:
Here are my inputs and outputs:
python eval.py --drop_out --k 10 --models_exp_code task_1_tumor_vs_normal_CLAM_80_s1 --save_exp_code task_1_tumor_vs_normal_CLAM_80_s1_cv --task task_1_tumor_vs_normal --model_type clam_sb --results_dir results --data_root_dir "C:\Users\user\PycharmProjects\melanoma_pathology_clam\slide images"
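This would be consistent with the --split all answer above: if the eval script looks for a "test" split that is empty in an external csv, the DataLoader yields no batches, and any variable assigned only inside the loop is never bound. A minimal sketch with hypothetical names, not the actual eval.py code:

```python
# If the loader is empty, the loop body never runs, so "data" is never
# assigned and touching it afterwards raises UnboundLocalError.
def summarize(loader):
    for batch_idx, (data, label) in enumerate(loader):
        pass  # per-batch work would go here
    return data  # raises UnboundLocalError when loader was empty

try:
    summarize([])  # empty "test" split -> no batches
except UnboundLocalError as e:
    print("reproduced:", e)
```

So before digging into the command, it's worth checking that the split you're evaluating actually contains cases (e.g. via --split all for an external set).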