ZhendongWang6 / DIRE

[ICCV 2023] Official implementation of the paper: "DIRE for Diffusion-Generated Image Detection"
306 stars, 24 forks

The training set data includes the test set data. #22

Open Teddy12155555 opened 11 months ago

Teddy12155555 commented 11 months ago

I am reproducing this paper and ran into a problem during testing. I followed the steps below and noticed something odd. Is there anything wrong with my steps?

My steps

  1. From the OneDrive location datasets/DiffsuionForensics/dire/train/lsun_bedroom, I downloaded the real DIRE images (real.tar.gz) and unpacked them as the training set. (There are 40,000 images in the real class; I am trying to retrain the model.)

  2. From the OneDrive location datasets/DiffsuionForensics/dire/test/lsun_bedroom/lsun_bedroom, I downloaded the real DIRE images (real.tar.gz) and unpacked them as the test set. (There are 1,000 images in the real class.)

  3. I found that some images in the test set are also in the training set. For example, 022f8c89734486038ba814b5d2b8259cd58695a5.jpg, 022f12f989ed6bc972ed921903cd2e2f996a2a85.jpg, 022f19bf68b72ce134f60d246fd48808b1bc5187.jpg, and others appear in both the training set and the test set. (I don't know exactly how many images overlap, but I found quite a few, about 620, in the 'real' class.)

  4. I get the same results as those in the paper:

    'test_model:lsun_adm' model testing on...
    lsun_adm_data:
    ACC: 1.00000
    AP: 1.00000
    R_ACC: 1.00000
    F_ACC: 1.00000

    But I am curious: if the test set is already included in the training set, has the model already seen those images during training, and is that why it performs so well at test time?
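The overlap described in step 3 can be checked with a short script. This is a minimal sketch, assuming both archives were unpacked into local directories. The function names are hypothetical and not part of the DIRE repository. It reports images that share a file name and images whose bytes are identical, since a duplicate could have been renamed.

```python
import hashlib
from pathlib import Path

# Assumption: the images are stored with these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(root):
    """Map file name -> path for every image found under root (recursive).

    If the same file name occurs in several subdirectories, only one
    path is kept, which is enough for a quick overlap check.
    """
    root = Path(root)
    return {
        p.name: p
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_overlap(train_dir, test_dir):
    """Return (same_name, same_content): sorted lists of test file names.

    same_name:    test images whose file name also exists in train_dir.
    same_content: test images whose bytes match some image in train_dir.
    """
    train = list_images(train_dir)
    test = list_images(test_dir)

    same_name = sorted(set(train) & set(test))

    train_hashes = {file_digest(p) for p in train.values()}
    same_content = sorted(
        name for name, p in test.items() if file_digest(p) in train_hashes
    )
    return same_name, same_content
```

Calling `find_overlap("train/real", "test/real")` (placeholder paths) returns the two lists; their lengths give the number of overlapping images. Matching on bytes only catches exact copies, so an image that was re-encoded or resized would not be detected this way.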

ZhendongWang6 commented 11 months ago

Thank you for bringing this to our attention, and I appreciate your diligence in checking the real images in the training and test sets. It appears that there was an oversight in the version of the real test set for LSUN Bedroom that we initially provided.

I have taken immediate action to rectify the situation. The correct versions of the images, along with their corresponding reconstructions and DIRE representations, have now been re-uploaded. We have checked that the updated test set aligns with the results reported in our paper.

Your assistance has been invaluable in identifying this discrepancy, and I sincerely apologize for any inconvenience it may have caused. If you encounter any further issues or have additional feedback, please don't hesitate to let us know.

Thank you once again for your understanding and assistance.

Teddy12155555 commented 11 months ago

Thank you for your reply. Could you also explain Issue #11? I have the same problem with image quality; the model seems to be very sensitive to compression.

dong12003 commented 7 months ago

Hello, do you still have the dataset for this project? The dataset link is now broken. If you have it, could you please send me a copy? Thank you very much.