Hy2MK / CGCD

MIT License
datasets #2

Open jiaolifengmi opened 3 months ago

jiaolifengmi commented 3 months ago

The repository has no code for splitting the datasets.

Hy2MK commented 3 months ago

Thank you for the feedback. Could you provide a bit more detail on the specific issues or areas you mentioned? Understanding the specifics will help me address your concerns more effectively.

jiaolifengmi commented 3 months ago

Run:

python train.py \
    --model resnet18 \
    --dataset cub \
    --alpha 32 \
    --mrg 0.1 \
    --lr 1e-4 \
    --warm 5 \
    --epochs 60 \
    --batch_size 120 \

The error ('datasets/CUB200/train_o' does not exist) appears, so may I ask how the train_o folder is created?

Hy2MK commented 3 months ago

The 'train_o' and 'valid_o' folders are used for supervised learning in the initial step. As shown in Table 2, the classes range from 0 to 160, and 80% of the samples are included.

jiaolifengmi commented 3 months ago

Could you please provide the code to generate these files? Or do users need to write the code to generate them themselves?

Hy2MK commented 3 months ago

I'm very sorry, but due to company rules, I cannot share any additional code updates. I would still like to explain how to do it in as much detail as I can. Using 'os.walk(...)', you can get the lists of folders and files. You can then select 80% of the samples (files) using 'np.random.shuffle(...)' and copy those files to a new folder named 'train_o' using 'shutil.copyfile(...)'.
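The steps described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the function name `split_dataset`, its parameters, and the one-subfolder-per-class layout are hypothetical; sending the remaining 20% of each class to a second folder (for example 'valid_o') is a guess at the intended layout; and the standard-library `random.Random(seed).shuffle` stands in for 'np.random.shuffle(...)' so the sketch needs no third-party packages. It does not filter classes, so restricting the split to the initial classes from Table 2 would be an additional step.

```python
import os
import random
import shutil


def split_dataset(src_root, train_dir, valid_dir, train_frac=0.8, seed=0):
    """Copy a random train_frac of the files in each folder under src_root
    into train_dir and the rest into valid_dir, keeping the subfolder names.

    Hypothetical helper: assumes one subfolder per class under src_root and
    that train_dir and valid_dir are located outside src_root.
    Returns {relative_folder: (n_train, n_valid)}.
    """
    rng = random.Random(seed)  # stdlib stand-in for np.random.shuffle
    counts = {}
    for dirpath, dirnames, filenames in os.walk(src_root):
        dirnames.sort()  # fix the traversal order so a seed gives the same split
        if not filenames:
            continue
        rel = os.path.relpath(dirpath, src_root)
        files = sorted(filenames)  # sort before shuffling for reproducibility
        rng.shuffle(files)
        n_train = int(len(files) * train_frac)
        for i, name in enumerate(files):
            dst_root = train_dir if i < n_train else valid_dir
            dst_dir = os.path.join(dst_root, rel)
            os.makedirs(dst_dir, exist_ok=True)
            shutil.copyfile(os.path.join(dirpath, name),
                            os.path.join(dst_dir, name))
        counts[rel] = (n_train, len(files) - n_train)
    return counts
```

A call might look like `split_dataset("datasets/CUB200/images", "datasets/CUB200/train_o", "datasets/CUB200/valid_o")`, where the source path "datasets/CUB200/images" is only a placeholder for wherever the original images are stored. Passing a different `seed` produces a different random split.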

Hy2MK commented 3 months ago

Did you implement code for splitting the dataset?

jiaolifengmi commented 3 months ago

Thank you very much for your feedback. I haven't implemented the code for dataset splitting yet. One problem is that splitting the dataset involves a choice of seed. If you do not provide the code that generates the dataset split, there is no way to fully reproduce the experimental results reported in your paper. I would like to know if there is a solution for this on your side.

Hy2MK commented 3 months ago

The reported results are not based on a single run but are the mean over three runs. This means that we did not use a single fixed split of the dataset. Therefore, the seed is not crucial.