tim-learn / SHOT

Code released for our ICML 2020 paper "Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation"
MIT License

Cannot reproduce source-only score of Office-31 #21

Closed · hankyul2 closed this issue 2 years ago

hankyul2 commented 2 years ago

Hi, thank you for sharing your work.

I tried to reproduce the Office-31 source-only scores using your code, but I could not get the same results as reported in Table 3 of result.md.

I cloned your repo and ran it (following run.sh). My results:

| SEED | A → D | A → W | D → A | D → W | W → A | W → D | AVG.  |
|------|-------|-------|-------|-------|-------|-------|-------|
| 2019 | 79.72 | 74.84 | 57.97 | 94.59 | 62.83 | 98.59 |       |
| 2020 | 79.72 | 72.70 | 59.03 | 95.22 | 61.87 | 97.99 |       |
| 2021 | 80.52 | 76.98 | 59.46 | 93.58 | 63.72 | 98.80 |       |
| Avg. | 79.99 | 74.84 | 58.82 | 94.46 | 62.81 | 98.49 | 78.24 |

Table 3 of result.md:

| SEED | A → D | A → W | D → A | D → W | W → A | W → D | AVG. |
|------|-------|-------|-------|-------|-------|-------|------|
| 2019 | 79.9  | 77.5  | 58.9  | 95.0  | 64.6  | 98.4  |      |
| 2020 | 81.5  | 75.8  | 61.6  | 96.0  | 63.3  | 99.0  |      |
| 2021 | 80.9  | 77.5  | 60.2  | 94.8  | 62.9  | 98.8  |      |
| Avg. | 80.8  | 76.9  | 60.3  | 95.3  | 63.6  | 98.7  | 79.3 |

Here is what I did

  1. Clone the repo.

  2. Download Office-31, place it under the object/data folder, and create amazon_list.txt, dslr_list.txt, and webcam_list.txt in the same folder using the code below (a quick sanity check for the generated list files is sketched after this list).

    import glob
    import os
    import random

    def load_data_list(root):
        # Enumerate the class folders under root and assign an integer label to each.
        class_list = glob.glob(os.path.join(root, '*'))
        data_list = []
        for label, class_path in enumerate(class_list):
            # One line per image: "<image_path> <label>"
            data_list.extend([f'{path} {label}\n' for path in glob.glob(os.path.join(class_path, '*'))])
        random.shuffle(data_list)
        return data_list

    def save_to_txt(save_path, data_list):
        with open(save_path, 'wt') as f:
            f.writelines(data_list)

    for dataset in ['amazon', 'dslr', 'webcam']:
        data_list = load_data_list(f'data/office/{dataset}/images')
        print(dataset, len(data_list))
        save_to_txt(f'data/office/{dataset}_list.txt', data_list)
  3. Train the model with the command below. I tried the following seeds: 2019, 2020, 2021.

    python3 image_source.py --trte val --da uda --output ckps/source/ --gpu_id 0 --dset office --max_epoch 100 --s 0
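
For reference, a quick sanity check of the list files generated in step 2 could look like the snippet below. It is a minimal sketch, not part of the repo, and only assumes the "<image_path> <label>" line format produced by the generator above:

    # Hypothetical sanity check (not part of the SHOT repo): each Office-31 list file
    # should cover 31 classes, with one "<image_path> <label>" entry per image.
    for dataset in ['amazon', 'dslr', 'webcam']:
        with open(f'data/office/{dataset}_list.txt') as f:
            lines = f.readlines()
        labels = {int(line.rsplit(' ', 1)[1]) for line in lines}
        print(dataset, len(lines), 'images,', len(labels), 'classes')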

If anyone can spot any problems, I would really appreciate it.

tim-learn commented 2 years ago

Hi, I think the reproduced results are normal. If you really want to obtain similar results, you can try using the same environment as ours, e.g., the same versions of the Python packages.
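
For example, a quick way to compare environments is to print the versions of the main packages on both machines; this is a minimal sketch (not from the repo), assuming a standard PyTorch setup:

    # Hypothetical version check to compare environments (not part of the SHOT repo).
    import sys
    import numpy
    import torch
    import torchvision

    print('python     ', sys.version.split()[0])
    print('numpy      ', numpy.__version__)
    print('torch      ', torch.__version__)
    print('torchvision', torchvision.__version__)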

hankyul2 commented 2 years ago

Thank you for your reply. I understand your answer.