Closed: PetervanLunteren closed this issue 10 months ago.
I just realised that I pulled the latest image (`zaandahl/mewc-train`) instead of version 1.0 as the documentation says (`zaandahl/mewc-train:v1.0`). Could this have an effect?

BTW, unrelated, but I believe there is a typo in the documentation: the tag for version 1.0 is actually `1.0`, not `v1.0`.
Hi Peter,
I haven't tested with a very high number of samples, and I'm not sure why the accuracy drops. It might be worth gradually increasing the number of samples from 10000 to see where the problem occurs. The learning rate looks like it changes drastically at the stage where accuracy is lost, so it might be worth testing different schedules for magnitudes and/or dropouts.
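If it helps, instead of a linear sweep you could bisect between the known-good 10000 and the failing 35000. This is only a rough sketch: the docker invocation and the stability check are placeholders, not exact mewc-train usage.

```python
# Bisect between the known-good and failing N_SAMPLES values.
# The docker invocation and the "stable" check below are placeholders:
# swap in however you actually launch mewc-train and read its logs.
import subprocess

def run_is_stable(n_samples: int) -> bool:
    """Launch one training run with N_SAMPLES set; return True if accuracy held."""
    result = subprocess.run(
        ["docker", "run", "--rm",
         "--env", f"N_SAMPLES={n_samples}",  # assumed env-var override
         "zaandahl/mewc-train:1.0"],
        capture_output=True, text=True,
    )
    return "accuracy collapsed" not in result.stdout  # placeholder log check

lo, hi = 10_000, 35_000   # known good, known bad
while hi - lo > 2_500:    # stop once the bracket is tight enough
    mid = (lo + hi) // 2
    if run_is_stable(mid):
        lo = mid          # still stable: the break is above mid
    else:
        hi = mid          # unstable: the break is at or below mid
print(f"accuracy drop first appears between N_SAMPLES={lo} and N_SAMPLES={hi}")
```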
The `N_SAMPLES` parameter samples with replacement up to that value, so sparse classes are oversampled.
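Conceptually it behaves like this minimal pandas sketch (illustrative only, not the actual mewc-train code):

```python
# Draw N_SAMPLES rows per class WITH replacement, so a class smaller than
# N_SAMPLES gets duplicated (oversampled) and a larger class gets capped.
import pandas as pd

# toy stand-in for an imbalanced dataset
df = pd.DataFrame({
    "filepath": [f"img_{i}.jpg" for i in range(6)],
    "label": ["cat"] * 5 + ["dog"],
})

n_samples = 4  # stand-in for N_SAMPLES
balanced = pd.concat(
    group.sample(n=n_samples, replace=True, random_state=0)
    for _, group in df.groupby("label")
).reset_index(drop=True)

print(balanced["label"].value_counts())  # cat: 4, dog: 4
```

So small classes contribute duplicated rows, while large classes are capped at `N_SAMPLES`.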
I'll update the documentation with the correct version. The GitHub tag is `v1.0.x` but DockerHub uses `1.0.x`, so I probably got the two confused. :)
Cheers, Zach
Hi Zach,
It did the same thing with `N_SAMPLES=15000`, just a bit further down. I'll try some different approaches. Thanks!
Cheers,
Peter
First of all: thanks for building this awesome repo! This is really helpful. I am running a training with the default settings, except for `N_SAMPLES=35000`. At some point during stage 1/3, the accuracy drops from 0.94 to 0.03 in one epoch. Any idea what is going on? See the console output below.

When running a training on your example dataset with `N_SAMPLES=4000`, it worked perfectly. It also worked well when I trained on my own dataset with `N_SAMPLES=10000`.

I have a dataset of about 900,000 images in total, ranging from classes with a few hundred images to classes with more than 100,000 images. Hence, I'd prefer not to downsample my dataset too much... Or am I misinterpreting `N_SAMPLES`? I assumed that it upsamples small classes and downsamples large classes. Or do I need to adjust other default values too if I'm training with a large value for `N_SAMPLES`?

Thanks in advance :)
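PS: for context, this is roughly how I counted the class sizes (a sketch assuming a folder-per-class layout; the path and extensions are specific to my setup):

```python
# Count images per class folder to see how imbalanced the dataset is.
from pathlib import Path

dataset = Path("/data/train")  # assumed location of the class folders
counts = {
    cls.name: sum(1 for f in cls.iterdir()
                  if f.suffix.lower() in {".jpg", ".jpeg", ".png"})
    for cls in dataset.iterdir() if cls.is_dir()
}
for name, n in sorted(counts.items(), key=lambda kv: kv[1]):
    print(f"{name}: {n}")
print(f"total: {sum(counts.values())}")
```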