wentaozhu / DeepLung

WACV18 paper "DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification"
Apache License 2.0

Unable to match the pretrained performance in the folds. #123

Closed PrateekMunjal closed 4 years ago

PrateekMunjal commented 4 years ago

Hi, thanks for sharing your work!

I have been working with the pre-trained models you shared for 3D Res and 3D DPN. I started with the ResNet models, and so far I am not able to match the per-fold performance reported in the supplementary section of your paper. Below I list the steps I took to get your codebase running, followed by the performance I obtained. Kindly let me know if I made a mistake at any step.

  1. Made a run_test.sh for testing the pretrained models.
    
    #!/bin/bash
    set -e

    python prepare.py

    # lr & weight decay values taken from paper experimental section -- for testing ignore them

    cd detector
    maxeps=150

    model_name=res18

    f=3
    i=6

    echo "process $i epoch"
    echo "loading resmodel/res18fd${f}00$i.ckpt"
    echo ${PWD}

    if [ $i -lt 10 ]; then
        CUDA_VISIBLE_DEVICES=5,6 python main.py --lr 0.01 --wd 1e-4 --model $model_name -b 32 --resume resmodel/res18fd${f}00$i.ckpt --test 1 --save-dir resmodel/res18fd$f/ --config config_training$f
    elif [ $i -lt 100 ]; then
        CUDA_VISIBLE_DEVICES=5,6 python main.py --lr 0.001 --wd 1e-4 --model $model_name -b 32 --resume resmodel/res18fd${f}0$i.ckpt --test 1 --save-dir resmodel/res18fd$f/ --config config_training$f
    elif [ $i -lt 1000 ]; then
        CUDA_VISIBLE_DEVICES=5,6 python main.py --lr 0.0001 --wd 1e-4 --model $model_name -b 32 --resume resmodel/res18fd${f}$i.ckpt --test 1 --save-dir resmodel/res18fd$f/ --config config_training$f
    else
        echo "Unhandled case"
    fi

    echo "i: $i"
    if [ ! -d "results/resmodel/res18fd$f/val$i/" ]; then
        echo "Creating directory.."
        mkdir -p results/resmodel/res18fd$f/val$i/
    fi
    mv results/resmodel/res18fd$f/bbox/*.npy results/resmodel/res18fd$f/val$i/
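
The three branches of the script differ in how the epoch number is zero-padded in the checkpoint name (and in `--lr`, which the post says can be ignored for testing). As a minimal sketch, the naming rule can be written in one line of Python. The helper name `checkpoint_path` and its default arguments are hypothetical and only mirror the paths used in the script above.

```python
def checkpoint_path(fold, epoch, model_dir="resmodel", prefix="res18fd"):
    """Build a checkpoint path like resmodel/res18fd3006.ckpt (fold 3, epoch 6).

    Hypothetical helper: mirrors the if/elif chain in run_test.sh, which pads
    the epoch number to three digits and rejects anything from 1000 upward.
    """
    if not 0 <= epoch < 1000:
        raise ValueError("Unhandled case: epoch must be in [0, 1000)")
    return f"{model_dir}/{prefix}{fold}{epoch:03d}.ckpt"


# Fold 3, epoch 6 is the combination hard-coded in the script above.
example = checkpoint_path(3, 6)
```

The same helper covers the two- and three-digit epochs from the post, such as epoch 49 for fold 1 or epoch 143 for fold 4.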

2. Running this for each fold, we obtain the _lbb and _pbb files for each study (".mhd" file) in the corresponding fold.
![image](https://user-images.githubusercontent.com/7415240/88799224-2ea4e400-d1b7-11ea-837c-473613a1af01.png)

3. Now, following your comments in [105](https://github.com/wentaozhu/DeepLung/issues/105) and [42](https://github.com/wentaozhu/DeepLung/issues/42#issuecomment-435445688), I created 10 sub-annotation files (using the following script) from the original annotations.csv provided in evaluationScript/annotations/annotations.csv.

    import csv
    import os
    from glob import glob

    all_exclude_fpath = '/raid/shadab/prateek/dgx30/prateek/DeepLung/evaluationScript/annotations/annotations_excluded.csv'
    original_annots_fpath = '/raid/shadab/prateek/dgx30/prateek/DeepLung/luna16/CSVFILES/annotations.csv'

    ex_csv_rdr = csv.reader(open(all_exclude_fpath), delimiter=',')
    orig_csv_rdr = csv.reader(open(original_annots_fpath), delimiter=',')

    all_excluded_annots = None
    all_original_annotations = None

    # get all excluded annotations
    for row in ex_csv_rdr:
        if all_excluded_annots is None:
            all_excluded_annots = [row[0]]
        else:
            if row[0] not in all_excluded_annots:
                all_excluded_annots.append(row[0])

    fold = 0
    for fold in range(10):
        subset_fpath = '/raid/shadab/prateek/dgx30/prateek/DeepLung/luna16_preprocess/subset'+str(fold)
        subset_fnames = sorted(glob(os.path.join(subset_fpath, '*_clean.npy')))
        # extract seriesuid and get rid of full path and suffix
        subset_fnames = [os.path.basename(f).rsplit('_clean.npy')[0] for f in subset_fnames]
        # print(subset_fnames)

        # get the annotations for only this fold
        # (delimiter='\n' makes writerow() emit one list item per line)
        fold_annotations_fname = 'annotations'+str(fold)+'.csv'
        fold_annotations = []
        fold_annots_writer = csv.writer(open(os.path.join(subset_fpath, fold_annotations_fname), 'w'), delimiter='\n')

        fold_sids_fname = 'seriesids'+str(fold)+'.csv'
        fold_sids = []
        fold_sids_writer = csv.writer(open(os.path.join(subset_fpath, fold_sids_fname), 'w'), delimiter='\n')

        # re-read the original annotations & select the rows whose seriesuid is in this fold
        orig_csv_rdr = csv.reader(open(original_annots_fpath), delimiter=',')
        for row in orig_csv_rdr:
            curr_sid = row[0]
            if curr_sid in subset_fnames:
                fold_annotations.append((",").join(row))
                if curr_sid not in fold_sids:
                    fold_sids.append(curr_sid)

        fold_sids_writer.writerow(fold_sids)
        print('Written '+os.path.join(subset_fpath, fold_sids_fname))
        fold_annots_writer.writerow(fold_annotations)
        print('Written '+os.path.join(subset_fpath, fold_annotations_fname))
        print('')
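
The core of the script above is a filter: keep only the annotation rows whose seriesuid belongs to the current fold. A minimal, self-contained sketch of that step on toy data follows. The function names `rows_for_fold` and `to_csv_text`, and the `uid_a`/`uid_b` identifiers, are made up for illustration, and the sketch writes rows with `writerows` instead of the `delimiter='\n'` trick used in the post.

```python
import csv
import io


def rows_for_fold(annotation_rows, fold_series_ids):
    """Keep only the rows whose first column (seriesuid) is in this fold."""
    wanted = set(fold_series_ids)  # set lookup instead of scanning a list
    return [row for row in annotation_rows if row and row[0] in wanted]


def to_csv_text(rows):
    """Serialise rows as CSV text, one annotation per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Toy stand-in for annotations.csv; the series IDs are invented.
all_rows = [
    ["seriesuid", "coordX", "coordY", "coordZ", "diameter_mm"],
    ["uid_a", "1.0", "2.0", "3.0", "4.5"],
    ["uid_b", "5.0", "6.0", "7.0", "8.5"],
    ["uid_a", "9.0", "1.0", "2.0", "3.5"],
]
fold_rows = rows_for_fold(all_rows, ["uid_a"])
fold_csv = to_csv_text(fold_rows)
```

As in the original script, the header row is dropped because `seriesuid` is not a series ID in any fold.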
4. Next, I set the correct paths in frocwrtdetpepchluna16.py (snapshot added for verification). [Side note: I also uncommented the getcsv function call to create CSV files for each detp.]
![image](https://user-images.githubusercontent.com/7415240/88800606-7dec1400-d1b9-11ea-813f-0f607fe3da3e.png)

5. After this, I manually tracked the output of frocwrtdetpepchluna16.py to select the best detp value for each fold. The original code compares multiple models and then outputs the best epoch and detp, but because I am using the pretrained models there is nothing to compare across models.

6. Once I know each fold's best annotation file (e.g. detp=-2 is best for fold 0 and detp=-1.5 is best for fold 1), I use a script to combine the results into one big annotation file similar to the one you provided [here](https://github.com/wentaozhu/DeepLung/blob/master/evaluationScript/annotations/3DRes18FasterR-CNN.csv).

    import csv
    import os

    # To combine best-csv files from each fold
    startline = ["seriesuid,coordX,coordY,coordZ,probability"]

    par_fpath = '/raid/shadab/prateek/dgx30/prateek/DeepLung/evaluationScript/annotations'
    fname = 'my3DRes18_annotations.csv'
    # delimiter='\n' makes writerow() emit one list item per line
    annot_writer = csv.writer(open(os.path.join(par_fpath, fname), 'w'), delimiter='\n')
    annot_content = []
    annot_writer.writerow(startline)

    # detps = [-1.5, -1.5, -1.5, -1.5, -2, -2, -1.5, -1.5, -1.5, -1.5]
    epoch_nums = [8,49,20,6,143,97,28,24,14,20]
    folds = [i for i in range(10)]
    detps = [-2 for i in range(10)]

    fold_annot_fpath = '/raid/shadab/prateek/dgx30/prateek/DeepLung/detector/results/resmodel/res18fd' #0/val8'

    for fold,ep,detp in zip(folds, epoch_nums, detps):
        temp_annot_fpath = fold_annot_fpath + str(fold) + '/val'+str(ep)
        temp_annot_path = os.path.join(temp_annot_fpath, 'predanno'+str(detp)+'.csv')
        temp_csv_reader = csv.reader(open(temp_annot_path, 'r'))
        # skip the header row of each per-fold file
        firstLine = True
        for row in temp_csv_reader:
            if firstLine:
                firstLine = False
                continue
            annot_content.append(",".join(row))

    annot_writer.writerow(annot_content)
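
Because the per-fold file name is built with `str(detp)`, the exact Python value of each threshold matters: `str(-2)` gives `-2` while `str(-2.0)` gives `-2.0`, so the two would point at different files. The sketch below builds the ten paths with a per-fold threshold instead of a single -2. The helper `prediction_csv_path` and the relative base path are hypothetical, and the per-fold thresholds are the commented-out list from the script above, used here purely as an example.

```python
import os


def prediction_csv_path(base, fold, epoch, detp):
    """Path of the per-fold prediction CSV, following the naming in the script above."""
    return os.path.join(base + str(fold), "val" + str(epoch), "predanno" + str(detp) + ".csv")


# Values copied from the post; the relative base path is an assumption.
base = "results/resmodel/res18fd"
epoch_nums = [8, 49, 20, 6, 143, 97, 28, 24, 14, 20]
per_fold_detps = [-1.5, -1.5, -1.5, -1.5, -2, -2, -1.5, -1.5, -1.5, -1.5]

paths = [
    prediction_csv_path(base, fold, ep, detp)
    for fold, (ep, detp) in enumerate(zip(epoch_nums, per_fold_detps))
]
```

Checking that every path in `paths` exists before merging is a cheap way to catch a threshold that was written as `-2.0` in one script and `-2` in another.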


7. Finally, I ran noduleCADEvaluationLUNA16.py after pointing "results_filename" to the combined file from step 6, and got the following result:
![froc_performance](https://user-images.githubusercontent.com/7415240/88801873-5b5afa80-d1bb-11ea-8194-db15c7fc47b4.png)

Quantitatively, the performance on each fold is as follows:

| Fold | My performance | Paper reported |
|:----------:|:-------------:|:------:|
| 0 |  85.841 | 86.1 |
| 1 |  84.821 | 85.38 |
| 2 |  **75.78** | 79.02 |
| 3 |  77.43 | 78.63 |
| 4 |  86.942 | 87.95 |
| 5 |  82.672 | 83.6 |
| 6 |  **84.415** | 89.59 |
| 7 |  87.516 | 87 |
| 8 |  87.046 | 88.86 |
| 9 |  80.494 | 80.41 |
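
To make the gaps in the table explicit, the following short sketch subtracts the paper-reported value from the reproduced value for each fold and ranks the folds by shortfall. The numbers are copied from the table; the variable names are only illustrative.

```python
# Per-fold values copied from the table above (index = fold number).
reproduced = [85.841, 84.821, 75.78, 77.43, 86.942, 82.672, 84.415, 87.516, 87.046, 80.494]
paper = [86.1, 85.38, 79.02, 78.63, 87.95, 83.6, 89.59, 87, 88.86, 80.41]

# Negative gap means the reproduced number is below the paper's.
gaps = {fold: round(r - p, 3) for fold, (r, p) in enumerate(zip(reproduced, paper))}

# Folds ordered from largest shortfall to smallest.
ranked = sorted(gaps, key=gaps.get)
worst_two = ranked[:2]

for fold in ranked:
    print(f"fold {fold}: {gaps[fold]:+.3f}")
```

Ranking this way singles out fold 6 and fold 2 as the two largest shortfalls, which matches the folds highlighted in bold in the table.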

Kindly let me know your comments. It would be especially helpful if you could help me verify the performance for fold 2 and fold 6.

wentaozhu commented 4 years ago

    # detps = [-1.5, -1.5, -1.5, -1.5, -2, -2, -1.5, -1.5, -1.5, -1.5]
    epoch_nums = [8,49,20,6,143,97,28,24,14,20]
    folds = [i for i in range(10)]
    detps = [-2 for i in range(10)]

It seems you use a detection probability of -2 for all the folds. Maybe you can try changing it to see whether that is the issue.

PrateekMunjal commented 4 years ago

Thanks for your quick reply!

I don't think that is the issue. Even when I narrow things down to matching the performance for only fold 2 and fold 6, which show the largest difference, I get the following.

For fold 2, image

For fold 6, image

So changing the value from -2 won't make any difference.

Also, to be clear, the fact that the two cases above show exactly the same performance does not mean the result is the same every time. For example, see the following, where -2 is the best. image

Other than detps = [-2 for i in range(10)], do you see any other step that I misunderstood or did wrong?

wentaozhu commented 4 years ago

How about trying more detection probability values, e.g. [-3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2]? Your performance on the other folds is better than mine, so the procedure should be correct. The difference is probably due to the detection probability hyper-parameter.
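
A sweep over the suggested thresholds could be sketched as follows. This is not the repository's evaluation code: it assumes each candidate carries a raw detector score that is kept when it is greater than detp, and `score_fn` is a hypothetical stand-in for whatever metric the real evaluation produces (here a toy function so the example runs on its own).

```python
# Thresholds suggested in the reply above.
DETPS = [-3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2]


def keep_candidates(candidates, detp):
    """Keep candidates whose score exceeds detp (assumed filtering rule).

    Each candidate is a tuple (score, x, y, z).
    """
    return [c for c in candidates if c[0] > detp]


def sweep(candidates, score_fn, detps=DETPS):
    """Score every threshold; ties are resolved in favour of the lowest threshold."""
    results = {d: score_fn(keep_candidates(candidates, d)) for d in detps}
    best = max(sorted(results), key=lambda d: results[d])
    return best, results


# Toy candidates and a toy metric that prefers exactly two surviving detections.
toy_candidates = [(-2.7, 0, 0, 0), (-1.8, 1, 1, 1), (-0.2, 2, 2, 2), (1.1, 3, 3, 3)]
toy_metric = lambda kept: -abs(len(kept) - 2)

best_detp, all_scores = sweep(toy_candidates, toy_metric)
```

Running the sweep separately for each fold, and in particular for folds 2 and 6, would show directly whether any threshold in the wider range closes the gap to the reported numbers.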