BMIRDS / deepslide

Code for the Nature Scientific Reports paper "Pathologist-level classification of histologic patterns on resected lung adenocarcinoma slides with deep neural networks." A sliding window framework for classification of high resolution whole-slide images, often microscopy or histopathology images.
https://www.nature.com/articles/s41598-019-40041-7
GNU General Public License v3.0

About running code from 1st one, I got a problem. #30

Closed mofatuzi closed 4 years ago

mofatuzi commented 4 years ago

Hi, I am trying to go through the code and run it on my WSI images. When I pass my WSI folder path and run 1_split.py like this: python code/1_split.py --all_wsi deepslide-master\HE_datasets --val_wsi_per_class 10 --test_wsi_per_class 20, it always prepends wsi_train to the image paths and fails with "FileNotFoundError: [Errno 2] No such file or directory:". What should I do? I am new to coding. Could you give me specific instructions (with an example) for running the code? Thanks!

JosephDiPalma commented 4 years ago

@mofatuzi Can you send me the configuration print out that should have been displayed before this error?

mofatuzi commented 4 years ago

> @mofatuzi Can you send me the configuration print out that should have been displayed before this error?

```
# gd @ RMBP in ~/Documents/pyfile on git:master x [21:20:53]
$ python he/deepslide-master/code/1_split.py --all_wsi /Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets --val_wsi_per_class 10 --test_wsi_per_class 20
############### CONFIGURATION ###############
all_wsi: /Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets
val_wsi_per_class: 10
test_wsi_per_class: 20
keep_orig_copy: True
num_workers: 8
patch_size: 224
wsi_train: wsi_train
wsi_val: wsi_val
wsi_test: wsi_test
labels_train: labels_train.csv
labels_val: labels_val.csv
labels_test: labels_test.csv
train_folder: train_folder
patches_eval_train: patches_eval_train
patches_eval_val: patches_eval_val
patches_eval_test: patches_eval_test
num_train_per_class: 80000
type_histopath: True
purple_threshold: 100
purple_scale_size: 15
slide_overlap: 3
gen_val_patches_overlap_factor: 1.5
image_ext: jpg
by_folder: True
color_jitter_brightness: 0.5
color_jitter_contrast: 0.5
color_jitter_saturation: 0.5
color_jitter_hue: 0.2
num_epochs: 20
num_layers: 18
learning_rate: 0.001
batch_size: 16
weight_decay: 0.0001
learning_rate_decay: 0.85
resume_checkpoint: False
save_interval: 1
checkpoints_folder: checkpoints
checkpoint_file: xyz.pt
pretrain: False
log_folder: logs
auto_select: True
preds_train: preds_train
preds_val: preds_val
preds_test: preds_test
inference_train: inference_train
inference_val: inference_val
inference_test: inference_test
vis_train: vis_train
vis_val: vis_val
vis_test: vis_test
device: cpu
classes: ['high_cj', 'low_cj', 'wsi_test', 'wsi_train', 'wsi_val']
num_classes: 5
train_patches: train_folder/train
val_patches: train_folder/val
path_mean: [0.9690558910369873, 0.9547340869903564, 0.9746440649032593]
path_std: [0.09160317480564117, 0.12522628903388977, 0.06992998719215393]
resume_checkpoint_path: checkpoints/xyz.pt
log_csv: logs/log_822020_212932.csv
eval_model: checkpoints/xyz.pt
threshold_search: (0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
colors: ('red', 'white', 'blue', 'green', 'purple', 'orange', 'black', 'pink', 'yellow')
#####################################################

+++++ Running 1_split.py +++++
class high_cj #train=18 #val=10 #test=20
Traceback (most recent call last):
  File "he/deepslide-master/code/1_split.py", line 15, in <module>
    wsi_val=config.args.wsi_val)
  File "/Users/gd/Documents/pyfile/he/deepslide-master/code/utils_split.py", line 110, in split
    move_set(folder=wsi_train, image_files=train_images, ops=head))
  File "/Users/gd/Documents/pyfile/he/deepslide-master/code/utils_split.py", line 81, in move_set
    dst=folder.joinpath(remove_topdir(filepath=image_file)))
  File "/Users/gd/opt/anaconda3/lib/python3.7/shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: 'wsi_train/Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets/high_cj/10.jpg'
```

Thanks for your code! It may help me identify pulmonary fibrosis tissue, but I don't know how to use it yet.

JosephDiPalma commented 4 years ago

@mofatuzi The problem is that you stored the data in the same directory as the splits. Organize your directories as follows:

├── all_wsi
│   ├── high_cj
│   ├── low_cj
├── wsi_train
│   ├── high_cj
│   ├── low_cj
├── wsi_val
│   ├── high_cj
│   ├── low_cj
├── wsi_test
│   ├── high_cj
│   ├── low_cj
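
To see why the nested layout fails, here is a minimal sketch of how the bad destination path arises. It assumes, based on the traceback above, that remove_topdir() strips only the first component of each source path before it is joined onto the split folder; the re-implementation below is only illustrative, not the actual deepslide code.

```python
from pathlib import Path

def remove_topdir(filepath: str) -> Path:
    # Illustrative guess at the behavior suggested by the traceback:
    # drop only the first component of the path.
    return Path(*Path(filepath).parts[1:])

# Absolute --all_wsi path, as in the failing run above.
image_file = "/Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets/high_cj/10.jpg"
print(Path("wsi_train").joinpath(remove_topdir(filepath=image_file)))
# -> wsi_train/Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets/high_cj/10.jpg
#    None of those intermediate directories exist, so shutil.copyfile() fails
#    with FileNotFoundError.

# Short relative path with the layout shown above.
image_file = "all_wsi/high_cj/10.jpg"
print(Path("wsi_train").joinpath(remove_topdir(filepath=image_file)))
# -> wsi_train/high_cj/10.jpg, which is the destination the split expects.
```
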
mofatuzi commented 4 years ago

@JosephDiPalma I thought that the folders wsi_train, wsi_val, and wsi_test are created automatically by the code, is that right? Right now the WSI images are grouped into the folders high_cj and low_cj under all_wsi. Do you mean I should manually create an empty wsi_train folder (with high_cj and low_cj subfolders inside it), an empty wsi_val folder, and so on?
Or should I pass the paths of the wsi_train, wsi_val, and wsi_test folders as parameters, like this: $ python he/deepslide-master/code/1_split.py --all_wsi /Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets/all_wsi --wsi_train /Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets/wsi_train --wsi_val /Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets/wsi_val --wsi_test /Users/gd/Documents/pyfile/he/deepslide-master/HE_datasets/wsi_test --val_wsi_per_class 10 --test_wsi_per_class 20

mofatuzi commented 4 years ago

I tried passing the paths of the manually created folders, but I still get an error. I have switched to a different computer, so the paths are different now.

```
# gd @ gd-c8 in ~/git-learn/deepslide on git:master x [9:35:21] C:2
$ python try/code/1_split.py --all_wsi try/all_wsi --wsi_train try/wsi_train --wsi_val try/wsi_val --wsi_test try/wsi_test --val_wsi_per_class 10 --test_wsi_per_class 20
############### CONFIGURATION ###############
all_wsi: try/all_wsi
val_wsi_per_class: 10
test_wsi_per_class: 20
keep_orig_copy: True
num_workers: 8
patch_size: 224
wsi_train: try/wsi_train
wsi_val: try/wsi_val
wsi_test: try/wsi_test
labels_train: labels_train.csv
labels_val: labels_val.csv
labels_test: labels_test.csv
train_folder: train_folder
patches_eval_train: patches_eval_train
patches_eval_val: patches_eval_val
patches_eval_test: patches_eval_test
num_train_per_class: 80000
type_histopath: True
purple_threshold: 100
purple_scale_size: 15
slide_overlap: 3
gen_val_patches_overlap_factor: 1.5
image_ext: jpg
by_folder: True
color_jitter_brightness: 0.5
color_jitter_contrast: 0.5
color_jitter_saturation: 0.5
color_jitter_hue: 0.2
num_epochs: 20
num_layers: 18
learning_rate: 0.001
batch_size: 16
weight_decay: 0.0001
learning_rate_decay: 0.85
resume_checkpoint: False
save_interval: 1
checkpoints_folder: checkpoints
checkpoint_file: xyz.pt
pretrain: False
log_folder: logs
auto_select: True
preds_train: preds_train
preds_val: preds_val
preds_test: preds_test
inference_train: inference_train
inference_val: inference_val
inference_test: inference_test
vis_train: vis_train
vis_val: vis_val
vis_test: vis_test
device: cuda:0
classes: ['a', 'n']
num_classes: 2
train_patches: train_folder/train
val_patches: train_folder/val
path_mean: [0.9690560102462769, 0.9547340869903564, 0.9746439456939697]
path_std: [0.09160252660512924, 0.12522628903388977, 0.0699312686920166]
resume_checkpoint_path: checkpoints/xyz.pt
log_csv: logs/log_832020_9399.csv
eval_model: checkpoints/xyz.pt
threshold_search: (0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
colors: ('red', 'white', 'blue', 'green', 'purple', 'orange', 'black', 'pink', 'yellow')
#####################################################

+++++ Running 1_split.py +++++
class a #train=18 #val=10 #test=20
Traceback (most recent call last):
  File "try/code/1_split.py", line 15, in <module>
    wsi_val=config.args.wsi_val)
  File "/home/gd/git-learn/deepslide/try/code/utils_split.py", line 112, in split
    move_set(folder=wsi_train, image_files=train_images, ops=head))
  File "/home/gd/git-learn/deepslide/try/code/utils_split.py", line 83, in move_set
    dst=(remove_topdir(filepath=image_file))).joinpath(folder)
  File "/home/gd/anaconda3/lib/python3.7/shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: 'all_wsi/a/0010.jpg'
```

The directory structure is shown in the attached screenshots.

JosephDiPalma commented 4 years ago

@mofatuzi I organized my directories as follows:

├── all_wsi
│   ├── a
│   ├── n
├── code
│  ***

Now if you run python code/1_split.py the dataset should be split as desired.
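
If it helps, here is a small optional sanity check (run from the same top-level directory) that counts how many slides ended up in each split for each class; the folder names and the jpg extension are the defaults from the configuration printout above.

```python
from pathlib import Path

# Default split folder names from the configuration printout above.
for split in ("wsi_train", "wsi_val", "wsi_test"):
    split_dir = Path(split)
    if not split_dir.is_dir():
        print(f"{split}: folder not found")
        continue
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        # Count the slides copied into this split/class folder.
        n_images = len(list(class_dir.glob("*.jpg")))
        print(f"{split}/{class_dir.name}: {n_images} slides")
```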

mofatuzi commented 4 years ago

@JosephDiPalma Thanks a lot! It works. I will try to run the code on my data. After running all the steps, which step should I run to predict on another, similar dataset using the same model?

JosephDiPalma commented 4 years ago

@mofatuzi To predict on another dataset, perform the following steps:

  1. The other dataset should have the same classes; otherwise, using the same model will not work.
  2. Run python code/1_split.py and python code/2_process_patches.py to put the new dataset into the correct format (a rough helper for arranging the new slides is sketched after this list).
  3. Lastly, run python code/7_final_test.py to view the final results on the new data. You could also run the 5th and 6th stage files if needed.
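
For step 2, here is a rough sketch of one way to arrange the new slides into the same all_wsi/<class> layout used earlier in this thread, assuming the new images are already grouped by class on disk; new_dataset is a hypothetical folder name, and the class names must match the ones the model was trained on.

```python
import shutil
from pathlib import Path

# Hypothetical location of the new slides, already grouped by class,
# e.g. new_dataset/a/*.jpg and new_dataset/n/*.jpg. Adjust to your data.
new_dataset = Path("new_dataset")
all_wsi = Path("all_wsi")  # layout expected by 1_split.py (see tree above)

for class_dir in sorted(p for p in new_dataset.iterdir() if p.is_dir()):
    dest = all_wsi / class_dir.name
    dest.mkdir(parents=True, exist_ok=True)
    for image in class_dir.glob("*.jpg"):
        # Copy each slide into all_wsi/<class>/ so the split and patching
        # scripts can pick it up.
        shutil.copy2(image, dest / image.name)

print("Copied new slides into", all_wsi)
```
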
mofatuzi commented 4 years ago

@JosephDiPalma There is another question.

The code has already run on my data. My data has two classes, split by animal: pulmonary fibrosis tissue from virus-infected animals and normal tissue from controls. The pulmonary fibrosis tissue can be identified by the code, but the normal lung tissue cannot be recognized; the visualization images only show some dots on the abnormal areas of the normal tissue. So do the H&E images need to be annotated? I have whole images of histologic sections, but each image contains several kinds of patterns, such as fibrosis, blood vessels, and other pathological changes. Abnormal tissue contains large areas of pathological change, while normal tissue contains only a few changes. How should I classify the images? Should I just roughly separate them into different classes by the main pathological pattern in each image and name each class folder after that pattern? This is my abnormal tissue class after running the code (attached image 0012_predictions). This is my normal tissue class after running the code (attached image 0027_predictions): normal patterns are not marked with colored dots; the green dots indicate abnormal parts.

Does the code only find abnormal patterns? How does the code know that something is normal tissue?

JosephDiPalma commented 4 years ago

@mofatuzi The code can find both abnormal and normal classes. The original dataset used in the paper had annotations for each class. It would likely help if you annotated your slides, but we've also used this model successfully without annotations.

mofatuzi commented 4 years ago

@JosephDiPalma I am still confused about the classes. How should I separate all the WSI images into different classes? Every WSI contains several kinds of diseased tissue, just like your WSIs contain lepidic, acinar, and papillary patterns in one slide, so which class subfolder (named by disease/annotation) should I put each image in? Or does the annotation used for the folder name not matter, so that the only thing I need to do is separate all the WSIs into 5 classes (5 subfolders) if there are 5 kinds of diseased tissue, regardless of which WSI goes in which class? Is that what is called "unsupervised learning"? You mentioned "If each WSI can have multiple types of lesions/labels, you may need to annotate bounding boxes around these." How do I annotate bounding boxes around these, and before which step should I draw them? Can the boxes be recognized by the code?