apple / ml-cvnets

CVNets: A library for training computer vision networks
https://apple.github.io/ml-cvnets

Does the batch sampler make the model learn the same weights in every training repetition? #49

Closed YHYeooooong closed 2 years ago

YHYeooooong commented 2 years ago

Hi! Thanks for sharing this great work!

I have a question about the sampler.

I am running some training experiments with variable_batch_sampler and the batch sampler. I want to get the average ACC over 5 training runs (the same setting repeated 5 times). I expected the best validation ACCs to be similar, but not identical, across the repeated runs with both samplers. However, when I use the batch sampler, the best val ACC is exactly the same in every repeated run. Is that expected?

I'm working with this yaml file:

0707_mobilevits_real_defualt_lr0.0001_cosine_advanced_multiscale.docx

and this shell script:

for iter in '1' '2' '3' '4' '5'
do

    CUDA_VISIBLE_DEVICES=3 cvnets-train --common.config-file ./config/classification/CBIS-DDSM_2c_womulti/0707_mobilevits_real_defualt_lr0.0001_cosine_advanced_multiscale.yaml --common.results-loc ./results/2class_iters_wo_multi/iter$iter --model.classification.finetune-pretrained-model --model.classification.n-pretrained-classes 1000 --model.classification.pretrained ./weights/mobilevit_s.pt
done

When I repeated training 5 times, the best val ACC (the same value, 72.5467) appeared at epoch 268 in every run. comparing_iterations.xlsx

Is this result expected?

I also modified some code to track the training information. The modified code is attached here:

code.zip

sacmehta commented 2 years ago

Hi, if you are looking to measure run-to-run variation, then you should change the random seed. Right now, you are using the default seed of 0.
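To illustrate why a fixed seed can make repeated runs identical, here is a minimal sketch. It is not the cvnets sampler: `epoch_batches` is a hypothetical helper that only assumes the batch order is produced by a shuffle whose randomness is fully determined by the seed and the epoch number.

```python
import random


def epoch_batches(seed, epoch, n_samples=10, batch_size=4):
    """Return the batches for one epoch under a seeded shuffle.

    Hypothetical helper for illustration only, not cvnets code.
    """
    # All randomness comes from (seed, epoch), so the order is reproducible.
    rng = random.Random(f"{seed}-{epoch}")
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Split the shuffled indices into consecutive batches.
    return [indices[i:i + batch_size] for i in range(0, n_samples, batch_size)]


# Two "runs" with the same seed see exactly the same batches in every epoch.
run_a = [epoch_batches(seed=0, epoch=e) for e in range(3)]
run_b = [epoch_batches(seed=0, epoch=e) for e in range(3)]
# A run with a different seed is free to see a different order.
run_c = [epoch_batches(seed=1, epoch=e) for e in range(3)]

print(run_a == run_b)  # True: same seed, same batch order
print(run_a == run_c)  # compare against a run with a different seed
```

If the data order, weight initialization, and augmentation are all driven by the same seed, identical metrics across repeated runs are the expected outcome, not a bug.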

YHYeooooong commented 2 years ago

Oh no...

Thank you for the fast reply!

So do I have to delete these lines in the sampler code? https://github.com/apple/ml-cvnets/blob/84d992f413e52c0468f86d23196efd9dad885e6f/data/sampler/batch_sampler.py#L54 https://github.com/apple/ml-cvnets/blob/5f5dbc74e3fff47b07cf6b1c4c06941423bc08bc/data/sampler/batch_sampler.py#L157

And do I also have to set the random seed in the yaml file to a different value for each run? random_seed_in_yaml

Did I understand correctly?

YHYeooooong commented 2 years ago

And I have a small additional question! When I used multi_scale_sampler with the same settings, the best val ACCs were not the same across runs, even though I also used the same random seed value (default 0). Is there any reason why the model produces different best val ACCs in that case?

sacmehta commented 2 years ago

You only need to set the random seed in the config file. No change in the code is needed.
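As a rough sketch of running the five repetitions with a different seed each time, the snippet below builds one command line per run. The flags `--common.config-file` and `--common.results-loc` are taken from the script earlier in this thread. The flag name `--common.seed` is an assumption and should be checked against the cvnets options; the config path is a placeholder, and `build_commands` is a hypothetical helper.

```python
import shlex

# Base command; the config path is a placeholder, not a real file.
BASE_CMD = [
    "cvnets-train",
    "--common.config-file", "./config/example.yaml",
]


def build_commands(seeds, results_root="./results"):
    """Build one training command per seed (hypothetical helper)."""
    commands = []
    for run, seed in enumerate(seeds, start=1):
        commands.append(BASE_CMD + [
            "--common.results-loc", f"{results_root}/iter{run}",
            # ASSUMPTION: the seed is exposed under this flag name.
            "--common.seed", str(seed),
        ])
    return commands


# Print the commands for inspection instead of launching training.
for cmd in build_commands([0, 1, 2, 3, 4]):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to confirm that each run gets its own seed and results directory before any training starts.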

As for the multi-scale sampler, it depends on several factors, including dataset size, optimization updates, run-to-run variance, etc.

Try training with the batch sampler and the multi-scale sampler for the same number of iterations and see if you observe any performance gains.

YHYeooooong commented 2 years ago

Thank you for your kind response! I will change the seed value!