cchen-cc / SFDA-DPL

[MICCAI'21] Source-Free Domain Adaptive Fundus Image Segmentation with Denoised Pseudo-Labeling

The experiment results in the paper can't be reproduced! #1

Open Leesoon1984 opened 3 years ago

Leesoon1984 commented 3 years ago

```
  0%|          | 0/2 [00:00<?, ?it/s]
cup: 0.8066  disc: 0.9631  avg: 0.8848    cup: 13.1680  disc: 4.2408  avg: 8.7044
best cup: 0.8066  best disc: 0.9631  best avg: 0.8848  best cup: 13.1680  best disc: 4.2408  best avg: 8.7044
 50%|█████     | 1/2 [00:17<00:17, 17.49s/it]
cup: 0.8178  disc: 0.9654  avg: 0.8916    cup: 12.3657  disc: 4.0131  avg: 8.1894
best cup: 0.8178  best disc: 0.9654  best avg: 0.8916  best cup: 12.3657  best disc: 4.0131  best avg: 8.1894
100%|██████████| 2/2 [00:34<00:00, 17.42s/it]
```

cchen-cc commented 3 years ago

Did you use the source model we uploaded?

Leesoon1984 commented 3 years ago

> Did you use the source model we uploaded?

==> Loading DeepLab model file: ./logs/source/source_model.pth.tar

source_model.pth.tar is downloaded from here

cchen-cc commented 3 years ago

How about the results for the other test dataset?

Leesoon1984 commented 3 years ago

> How about the results for the other test dataset?

```
==> Loading DeepLab model file: ./logs/source/source_model.pth.tar
  0%|          | 0/2 [00:00<?, ?it/s]
cup: 0.7558  disc: 0.9048  avg: 0.8303    cup: 10.8793  disc: 9.1118  avg: 9.9955
best cup: 0.7558  best disc: 0.9048  best avg: 0.8303  best cup: 10.8793  best disc: 9.1118  best avg: 9.9955
 50%|█████     | 1/2 [00:32<00:32, 32.57s/it]
cup: 0.7666  disc: 0.9027  avg: 0.8346    cup: 10.4864  disc: 9.9553  avg: 10.2209
best cup: 0.7666  best disc: 0.9027  best avg: 0.8346  best cup: 10.4864  best disc: 9.9553  best avg: 10.2209
100%|██████████| 2/2 [01:04<00:00, 32.13s/it]
```

These are the results on the RIM-ONE-r3 dataset. It seems that the optic cup segmentation results are lower.

cchen-cc commented 3 years ago

Strange, will have a check.

caijinyue commented 3 years ago

> How about the results for the other test dataset?
>
> ==> Loading DeepLab model file: ./logs/source/source_model.pth.tar
> cup: 0.7558 disc: 0.9048 avg: 0.8303 / cup: 10.8793 disc: 9.1118 avg: 9.9955
> cup: 0.7666 disc: 0.9027 avg: 0.8346 / cup: 10.4864 disc: 9.9553 avg: 10.2209
>
> These are the results on the RIM-ONE-r3 dataset. It seems that the optic cup segmentation results are lower.

It may be caused by the different versions of torch and cuda.

dataset: RIM-ONE-r3, pytorch: 0.4.1 (the same as the readme), cuda: 9.0 (the same as the readme)

[screenshot: results with torch 0.4.1]

dataset: RIM-ONE-r3, pytorch: 1.7.1, cuda: 10.2

[screenshot: results with torch 1.7.1]

cchen-cc commented 3 years ago

@Leesoon1984 Did you use higher pytorch and cuda version?

Leesoon1984 commented 3 years ago

> @Leesoon1984 Did you use higher pytorch and cuda version?

pytorch: 1.9.0, cuda: 11.1

I trained the source model from scratch and used it to generate pseudo labels.

```
==> Loading DeepLab model file: ./logs/Domain2/20210925_174225.317580/checkpoint_170.pth.tar
  0%|          | 0/2 [00:00<?, ?it/s]
cup: 0.7717  disc: 0.9595  avg: 0.8656    cup: 15.5761  disc: 4.5604  avg: 10.0683
best cup: 0.7717  best disc: 0.9595  best avg: 0.8656  best cup: 15.5761  best disc: 4.5604  best avg: 10.0683
 50%|█████     | 1/2 [00:28<00:28, 28.76s/it]
cup: 0.8189  disc: 0.9666  avg: 0.8927    cup: 12.3946  disc: 3.7113  avg: 8.0529
best cup: 0.8189  best disc: 0.9666  best avg: 0.8927  best cup: 12.3946  best disc: 3.7113  best avg: 8.0529
100%|██████████| 2/2 [00:46<00:00, 23.25s/it]
```

cchen-cc commented 3 years ago

@Leesoon1984 The results difference seems to be caused by the different PyTorch version. I'm not sure why a higher PyTorch version leads to a performance decrease on the optic cup segmentation (the results for the optic disc are comparable). If you try version 0.4.1, which is the version we used to obtain the results in the paper, the results should be reproducible.

Leesoon1984 commented 3 years ago

> @Leesoon1984 The results difference seems to be caused by the different PyTorch version. I'm not sure why a higher PyTorch version leads to a performance decrease on the optic cup segmentation (the results for the optic disc are comparable). If you try version 0.4.1, which is the version we used to obtain the results in the paper, the results should be reproducible.

In BEAL's test.py, the predictions are post-processed differently for the RIM-ONE-r3 and Drishti-GS datasets: `prediction = postprocessing(prediction.data.cpu()[0], dataset=args.dataset)`

```python
import numpy as np
import scipy.signal
from skimage import morphology
# get_largest_fillhole is a helper defined in BEAL's utils

def postprocessing(prediction, threshold=0.75, dataset='G'):
    if dataset[0] == 'D':
        prediction = prediction.numpy()
        prediction_copy = np.copy(prediction)
        disc_mask = prediction[1]
        cup_mask = prediction[0]
        disc_mask = (disc_mask > 0.5)  # return binary mask
        cup_mask = (cup_mask > 0.1)  # return binary mask
        disc_mask = disc_mask.astype(np.uint8)
        cup_mask = cup_mask.astype(np.uint8)
        for i in range(5):
            disc_mask = scipy.signal.medfilt2d(disc_mask, 7)
            cup_mask = scipy.signal.medfilt2d(cup_mask, 7)
        disc_mask = morphology.binary_erosion(disc_mask, morphology.diamond(7)).astype(np.uint8)  # return 0,1
        cup_mask = morphology.binary_erosion(cup_mask, morphology.diamond(7)).astype(np.uint8)  # return 0,1
        disc_mask = get_largest_fillhole(disc_mask).astype(np.uint8)  # return 0,1
        cup_mask = get_largest_fillhole(cup_mask).astype(np.uint8)
        prediction_copy[0] = cup_mask
        prediction_copy[1] = disc_mask
        return prediction_copy
    else:
        prediction = prediction.numpy()
        prediction = (prediction > threshold)  # return binary mask
        prediction = prediction.astype(np.uint8)
        prediction_copy = np.copy(prediction)
        disc_mask = prediction[1]
        cup_mask = prediction[0]
        for i in range(5):
            #     disc_mask = scipy.signal.medfilt2d(disc_mask, 7)
            #     cup_mask = scipy.signal.medfilt2d(cup_mask, 7)
            # disc_mask = morphology.binary_erosion(disc_mask, morphology.diamond(7)).astype(np.uint8)  # return 0,1
            # cup_mask = morphology.binary_erosion(cup_mask, morphology.diamond(7)).astype(np.uint8)  # return 0,1
            disc_mask = get_largest_fillhole(disc_mask).astype(np.uint8)  # return 0,1
            cup_mask = get_largest_fillhole(cup_mask).astype(np.uint8)
        prediction_copy[0] = cup_mask
        prediction_copy[1] = disc_mask
        return prediction_copy
```

In this repo's code, `prediction[prediction > 0.75] = 1; prediction[prediction <= 0.75] = 0` is applied to both the RIM-ONE-r3 and Drishti-GS datasets. Can you explain the differences from BEAL's test.py, or consider providing your test.py code?
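For comparison, the uniform thresholding described above can be sketched as follows (the function name is illustrative; in the repo this is applied inline rather than through a helper):

```python
import numpy as np

def binarize_uniform(prediction, threshold=0.75):
    """Uniform binarization for both datasets: every probability above
    `threshold` becomes 1, the rest 0. `prediction` is assumed to be a
    (2, H, W) cup/disc probability map."""
    out = np.copy(prediction)
    out[prediction > threshold] = 1
    out[prediction <= threshold] = 0
    return out.astype(np.uint8)
```

This contrasts with BEAL's `postprocessing` above, whose `dataset[0] == 'D'` branch uses separate disc/cup thresholds (0.5 and 0.1) plus median filtering and erosion.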

cchen-cc commented 3 years ago

The evaluation code is included in train_target.py; we did not use a separate evaluation script. We use the same prediction process for both datasets instead of adjusting the threshold per dataset.

SakurajimaMaiii commented 3 years ago

I got this result on the RIM-ONE-r3 dataset (pytorch 1.8.1, python 3.8, cuda 10.2): cup dice: 0.7751, disc dice: 0.9053, cup hd: 9.9344, disc hd: 9.9522. Just for reference.

Brandy0k commented 2 years ago

I got this result when using domain 2 as the target domain: best cup: 0.7611, best disc: 0.9120, best avg: 0.8365; best cup: 10.8140, best disc: 8.3340, best avg: 9.5740.

Hazusa commented 2 years ago

> I got this result when using domain 2 as the target domain: best cup: 0.7611, best disc: 0.9120, best avg: 0.8365; best cup: 10.8140, best disc: 8.3340, best avg: 9.5740.

Did you use the author's source model?

Brandy0k commented 2 years ago

> > I got this result when using domain 2 as the target domain: best cup: 0.7611, best disc: 0.9120, best avg: 0.8365; best cup: 10.8140, best disc: 8.3340, best avg: 9.5740.
>
> Did you use the author's source model?

I did. I did not train from scratch.

Hazusa commented 2 years ago

> > > I got this result when using domain 2 as the target domain: best cup: 0.7611, best disc: 0.9120, best avg: 0.8365; best cup: 10.8140, best disc: 8.3340, best avg: 9.5740.
> >
> > Did you use the author's source model?
>
> I did. I did not train from scratch.

The following error occurred when I used the file "source_model.pth.tar":

```
==> Loading DeepLab model file: ./logs/source/source_model.pth.tar
  0%|          | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:/MyDownloads/SFDA-DPL-main/train_target.py", line 211, in
    val_cup_dice /= datanum_cnt
ZeroDivisionError: float division by zero
```

Does this mean the model file is wrong?

Brandy0k commented 2 years ago

> The following error occurred when I used the file "source_model.pth.tar":
>
> ==> Loading DeepLab model file: ./logs/source/source_model.pth.tar
> ZeroDivisionError: float division by zero
>
> Does this mean the model file is wrong?

I met the same problem. I guess your dataset path is wrong, so the model does not receive any input images. You can debug it.
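The failure mode described here can be made explicit with a small guard (a sketch; `average_dice` is a hypothetical helper, whereas the repo divides inline at train_target.py line 211):

```python
def average_dice(dice_sum, datanum_cnt):
    """Guard the final Dice averaging against an empty dataloader.
    If no validation samples were read, `datanum_cnt` stays 0 and a
    plain division raises ZeroDivisionError, the symptom seen above;
    failing with a clear message points at the real cause instead."""
    if datanum_cnt == 0:
        raise RuntimeError(
            "No validation samples were loaded; check the dataset path "
            "passed to the dataloader.")
    return dice_sum / datanum_cnt
```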

Hazusa commented 2 years ago

@Brandy0k May I ask which directory your dataset is located in? I tried to change the dataset path and got a RuntimeError.

Brandy0k commented 2 years ago

@Hazusa The dataset path is "/data/***/SFDA-VESSEL/datasets" on Linux. I think you should first check whether the dataloader can read images. If the dataloader reads images correctly, this problem should not happen.
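The suggestion above can be turned into a quick pre-flight check (a sketch; the exact folder layout depends on how you unpacked the datasets, so `img_dir` is a placeholder you would point at the image folder the dataloader reads):

```python
import os

def check_image_dir(img_dir):
    """Fail fast if the folder the dataloader will read from is missing
    or contains no images, instead of surfacing later as an empty
    loader and a ZeroDivisionError during evaluation."""
    if not os.path.isdir(img_dir):
        raise FileNotFoundError(f"Missing directory: {img_dir}")
    images = [f for f in os.listdir(img_dir)
              if f.lower().endswith((".png", ".jpg", ".jpeg", ".tif"))]
    if not images:
        raise RuntimeError(f"No images found under {img_dir}")
    return len(images)
```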

Hazusa commented 2 years ago

> @Hazusa The dataset path is "/data/***/SFDA-VESSEL/datasets" on Linux. I think you should first check whether the dataloader can read images. If the dataloader reads images correctly, this problem should not happen.

Thanks for your reply, I will try it.