IshiKura-a / ModelGPT

Regarding image classifier baseline model tuning #3

Closed by sorobedio 2 months ago

sorobedio commented 2 months ago

Hello,

Sorry to bother you again. When you mentioned that the model is fine-tuned on DSLR and Amazon photos, did you mean that all the images from both datasets were combined into one folder, with images from the same classes grouped together, and then fine-tuned or LoRA-tuned on this mixture? I am referring to the baseline. I tried running the bs_img_cls.sh script but couldn't see where it learns from both sets. I would like to train the image classifier baseline.

Thank you.

IshiKura-a commented 2 months ago

Hi, thank you for your question!

The file arrangement of the Office-31 dataset is unchanged. For training on the DSLR and Amazon domains, please see main_img_cls.py, which takes two args: domain and zero_shot_domain. domain lists all the domains whose data we need, while zero_shot_domain lists the domains whose data is used only for testing. Accordingly, the logic at line 104 leaves zero_shot_domain data untrained. I put it here:

        train_args = TrainingArguments(do_train=(task_name not in args.zero_shot_domain),
                                       do_eval=(task_name not in args.zero_shot_domain),
                                       do_test=True,
                                       n_epochs=0,
                                       rank=rank)

Back to the question: to reproduce the results of the CV task, you only need to download the dataset, customize backbone, output_dir and dataset_dir, and run main_img_cls.py.
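To make the domain / zero_shot_domain distinction concrete, here is a minimal sketch of the flag logic described above. It is not the repository's code: `DomainPlan` and `plan_domains` are hypothetical names standing in for the project's own TrainingArguments, and the domain lists in the example are only illustrative.

```python
from dataclasses import dataclass


@dataclass
class DomainPlan:
    """Hypothetical stand-in for the repository's TrainingArguments."""
    do_train: bool
    do_eval: bool
    do_test: bool = True  # every listed domain is tested


def plan_domains(domain, zero_shot_domain):
    """Map each domain name to its train/eval/test flags.

    Domains listed in zero_shot_domain are skipped for training and
    evaluation, so they are only ever seen at test time.
    """
    held_out = set(zero_shot_domain)
    return {
        name: DomainPlan(do_train=name not in held_out,
                         do_eval=name not in held_out)
        for name in domain
    }


# Illustrative values only: three Office-31 domains, one held out.
plans = plan_domains(["amazon", "dslr", "webcam"], ["webcam"])
for name, plan in plans.items():
    print(name, plan)
```

With these example values, amazon and dslr are trained, evaluated and tested, while webcam is tested only.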

sorobedio commented 2 months ago

Thank you. I was talking about img_cls.py in the baseline folder. I want to reproduce the fine-tune method by running bash scripts/bs_img_cls.sh, but I could not get 100% on DSLR, so I wonder whether I am running it properly. I am only interested in the Finetune baseline for the time being.

I set domain to match the dataset (for example, amazon) in img_cls.py, commented out the if args.load_ckpt: section, and then ran it. With seed 2025 I got the following.

For amazon:

        Acc = 0.6643109321594238 Acc@3 = 0.8127208352088928 Acc@5 = 0.8692579865455627 435.5647270679474

For DSLR:

        Acc = 0.7843137383460999 Acc@3 = 0.960784375667572 Acc@5 = 0.960784375667572 645.

Both are far below 100%. I wonder how to get 100%.

IshiKura-a commented 2 months ago

All the baselines are trained and tested on a single sub-task (for example, only DSLR or only Amazon).

The result is sensitive to the seed choice. Please split the data and train the baseline with seed 2024.
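As a rough illustration of why the seed matters, the sketch below shows a seeded train/test split. It is only an assumption about how such a split could work: `split_indices`, the 20% test fraction and the sample count are hypothetical, and the repository's actual splitting code is not shown in this thread.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=2024):
    """Shuffle sample indices with a fixed seed and cut off a test set.

    Hypothetical helper: the fraction and the shuffling scheme are
    assumptions, not the repository's implementation.
    """
    rng = random.Random(seed)  # private generator, no global state
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


# The same seed always reproduces the same split ...
first = split_indices(100, seed=2024)
again = split_indices(100, seed=2024)
# ... while a different seed puts different samples in the test set.
other = split_indices(100, seed=2025)

print(first == again)
print(len(first[0]), len(first[1]))
```

Because a different seed changes which samples land in the test set, accuracy measured on a small test set can shift noticeably from one seed to another.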

sorobedio commented 2 months ago

OK, thank you.