Abe404 / root_painter

RootPainter: Deep Learning Segmentation of Biological Images with Corrective Annotation
https://nph.onlinelibrary.wiley.com/doi/full/10.1111/nph.18387

Complex input image filenames will crash the server when training #87

Closed thiagomaf closed 1 year ago

thiagomaf commented 1 year ago

Complex input image filenames will crash the server when training.

Filenames such as EXP-006_20230131_20_2[96dpi]_{x}_{y}.jpg crash the server immediately after it starts training the network. The same image set renamed to A_{x}_{y}.jpg runs and trains fine.

To Reproduce: Created a training dataset with the filename pattern "EXP-006_20230131_20_2[96dpi]_{x}_{y}.jpg".

I have followed the Colab tutorial here without any changes - https://colab.research.google.com/drive/104narYAvTBt-X4QEDrBSOZm_DRaAKHtA?usp=sharing

Got the following error:

execute_instruction start_training
Traceback (most recent call last):
  File "/content/drive/MyDrive/root_painter_src/trainer/main.py", line 40, in <module>
    trainer.main_loop()
  File "/content/drive/MyDrive/root_painter_src/trainer/trainer.py", line 113, in main_loop
    self.train_one_epoch()
  File "/content/drive/MyDrive/root_painter_src/trainer/trainer.py", line 244, in train_one_epoch
    for step, (photo_tiles,
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.8/dist-packages/torch/_utils.py", line 543, in reraise
    raise exception
Exception: Caught Exception in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/drive/MyDrive/root_painter_src/trainer/datasets.py", line 106, in __getitem__
    image, annot, fname = load_train_image_and_annot(self.dataset_dir,
  File "/content/drive/MyDrive/root_painter_src/trainer/im_utils.py", line 95, in load_train_image_and_annot
    raise Exception(f'Could not load photo {latest_im_path}, {latest_error}')
Exception: Could not load photo None, list index out of range

Abe404 commented 1 year ago

Thanks for bringing this to my attention and providing details!

The problem was that I was using glob to search for file paths, and certain characters (such as [) have a special meaning in glob patterns. To allow arbitrary file names that may include these special characters, I have altered the code to use glob.escape.
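A minimal sketch of the behaviour described above, using only Python's standard library. This is not the actual RootPainter code: the helper `find_matches`, the `.*` suffix in the pattern, and the sample file name (the reported pattern with 0 substituted for {x} and {y}) are assumptions made for illustration. Without escaping, `[96dpi]` is read as a character class that matches a single character, so the search returns an empty list; indexing into an empty list would raise the same "list index out of range" message seen in the traceback.

```python
import glob
import os
import tempfile


def find_matches(directory, fname_stem, escape):
    """Hypothetical helper: list files in `directory` named `fname_stem`.<ext>."""
    # glob.escape wraps the special characters *, ? and [ in brackets
    # so that they are matched literally instead of as wildcards.
    stem = glob.escape(fname_stem) if escape else fname_stem
    pattern = os.path.join(glob.escape(directory), stem + '.*')
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as tmp_dir:
    # Sample file name modelled on the one reported in this issue.
    fname = 'EXP-006_20230131_20_2[96dpi]_0_0.jpg'
    open(os.path.join(tmp_dir, fname), 'w').close()
    stem = os.path.splitext(fname)[0]

    # Unescaped: "[96dpi]" means "one of the characters 9, 6, d, p, i".
    unescaped = find_matches(tmp_dir, stem, escape=False)
    # Escaped: "[" is matched literally, so the file is found.
    escaped = find_matches(tmp_dir, stem, escape=True)

    unescaped_names = [os.path.basename(p) for p in unescaped]
    escaped_names = [os.path.basename(p) for p in escaped]

print(unescaped_names)  # []
print(escaped_names)    # ['EXP-006_20230131_20_2[96dpi]_0_0.jpg']
```

Only the opening bracket needs escaping: `glob.escape` turns `[` into `[[]`, after which the closing `]` has no special meaning on its own.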

The relevant changes are included in the following commits:
https://github.com/Abe404/root_painter/commit/d2f881418890b41dbac02791c26082ecc724d026
https://github.com/Abe404/root_painter/commit/13e53e32f5832dd8175787bdf3b39099dd1dc07a

This should now have resolved the problem you were experiencing. Can you pull the latest server code (a fresh clone also works) and let me know whether everything is working with your files now?

Kind regards, Abraham