zakajd opened this issue 3 years ago
IT WORKS! Values are for validation on the whole training set
logs/resnet101_384_arcface_fp16_gem_light
Acc@1 0.59785, CMC@10 0.80060, mAP@10 0.60978, target 0.60381
LB: 0.373254
logs/genet_normal_384_arcface_fp16_gem_light
Val: Acc@1 0.81258, CMC@10 0.92955, mAP@10 0.80109, target 0.80683
LB: 0.562599
logs/genet_normal_384_cosface_fp16_gem_light LB: 0.505418
logs/genet_normal_512_cosface_fp16_gem_light LB: 0.511652 (model didn't finish training!!!). Re-submit later.
logs/genet_normal_512_arcface_fp16_gem_hard_1
Just embeddings: Val: Acc@1 0.85475, mAP@10 0.85169, target 0.85322. LB: 0.571094
With DBA: Val: Acc@1 0.85338, mAP@10 0.86104, Target 0.85721. LB: 0.566041
With aQE: 0.570069
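For reference, the validation metrics in these logs can be computed from a ranked relevance matrix roughly like this (a minimal numpy sketch; the truncated-AP definition is one common variant and is an assumption, but note that `target` in the logs matches the mean of Acc@1 and mAP@10, e.g. (0.59785 + 0.60978) / 2 ≈ 0.60381):

```python
import numpy as np

def retrieval_metrics(relevance, k=10):
    """Compute Acc@1 and mAP@k from a binary relevance matrix.

    relevance: (n_queries, n_retrieved) array with 1 where the retrieved
    image shares the query's class, ordered by descending similarity.
    """
    rel = np.asarray(relevance)[:, :k]
    acc1 = rel[:, 0].mean()
    # precision at each rank 1..k, per query
    precisions = np.cumsum(rel, axis=1) / np.arange(1, k + 1)
    hits = rel.sum(axis=1)
    # truncated AP: average precision over the relevant items found in top-k
    ap = np.where(hits > 0,
                  (precisions * rel).sum(axis=1) / np.maximum(hits, 1),
                  0.0)
    return acc1, ap.mean()
```

The `target` reported above would then be `(acc1 + map_k) / 2`.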
Looking at images from the test set, there are 2 common scenarios:
All retrieved images are correct, which means the query is easy to match due to low variation in shape and scale.
The result is complete garbage, with no relevant images at all.
Another important observation: some test images are similar to ones from the train dataset, so this can be exploited at test time to improve quality.
After a big rework and fixing 2 important bugs, the results became even better!
genet_small model trained 3 epochs on 512 images:
Val: Acc@1 0.56402, mAP@10 0.56598, Target 0.5650011025777197, mAP@R 0.53351 Reality: 0.282339
genet_small model trained 4 epochs on 384 images:
Val: Acc@1 0.76944, mAP@10 0.75992, Target 0.76468, mAP@R 0.72855 during predict. Reality: 0.455
Val: Acc@1 0.77862, mAP@10 0.77543, Target 0.77702, mAP@R 0.75033 with DBA and sum. Reality: 0.466035
Val: Acc@1 0.77106, mAP@10 0.77394, Target 0.77250, mAP@R 0.75368 with DBA and aQE and sum. Reality: 0.466021
Takeaway: DBA really helps, and with better embeddings it should help even more.
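For context, DBA (database-side augmentation) can be sketched like this: each gallery embedding is replaced by the re-normalized sum of itself and its top-k nearest neighbors (the "sum" combination mentioned above). The value of `k` and the unweighted sum are assumptions here:

```python
import numpy as np

def dba(emb, k=2):
    """Database augmentation: replace each L2-normalized embedding with
    the re-normalized sum of itself and its top-k nearest neighbors."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = emb @ emb.T
    np.fill_diagonal(sims, -np.inf)           # exclude self from neighbors
    nn = np.argsort(-sims, axis=1)[:, :k]     # indices of top-k neighbors
    aug = emb + emb[nn].sum(axis=1)           # plain "sum" combination
    return aug / np.linalg.norm(aug, axis=1, keepdims=True)
```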
[09-19 02:10] - Val: Acc@1 0.68424, mAP@10 0.67938, Target 0.68181, mAP@R 0.64531 after train Val: Acc@1 0.68424, mAP@10 0.67963, Target 0.68194, mAP@R 0.64531 Reality: 0.382307
[09-19 06:06] Val: Acc@1 0.87609, mAP@10 0.85634, Target 0.86621, mAP@R 0.83475 Reality: 0.562475
Val: Acc@1 0.87554, mAP@10 0.87444, Target 0.87499, mAP@R 0.86127 with DBA and aQE 0.578106
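aQE (alpha query expansion) is the query-side counterpart: each query is expanded with its top-k gallery neighbors, weighted by similarity raised to the power alpha. A minimal sketch; `k`, `alpha`, and the clipping of negative similarities are assumptions:

```python
import numpy as np

def alpha_qe(query, gallery, k=3, alpha=3.0):
    """Alpha query expansion: expand each query embedding with its top-k
    gallery neighbors, weighted by similarity**alpha."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = q @ g.T
    nn = np.argsort(-sims, axis=1)[:, :k]                # top-k gallery ids
    w = np.take_along_axis(sims, nn, axis=1)
    w = np.clip(w, 0.0, None) ** alpha                   # non-negative weights
    expanded = q + (w[..., None] * g[nn]).sum(axis=1)
    return expanded / np.linalg.norm(expanded, axis=1, keepdims=True)
```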
Answer: bad labels! There are 408 direct duplicates. There are also whole classes repeated multiple times, e.g. DIGIX000030 and DIGIX00001390 are both the same floor scales. And some of them are in train, some in test.
Update: found 21 pairs of classes that contained duplicate images and merged them. Also deleted one image of each duplicate pair using md5 hashes. Did it help? No :( Metrics still skyrocket starting from the very first epoch, and validation loss still increases from the very beginning.
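The md5-based dedup can be done roughly like this (a sketch; the `*.jpg` pattern and flat directory layout are assumptions about the dataset):

```python
import hashlib
from pathlib import Path

def find_md5_duplicates(root, pattern="*.jpg"):
    """Group files under `root` by the md5 of their raw bytes and return
    only the groups with more than one file, i.e. exact byte duplicates."""
    groups = {}
    for path in sorted(Path(root).rglob(pattern)):
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        groups.setdefault(digest, []).append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Keeping the first path of each group and deleting the rest removes exact duplicates, but won't catch re-encoded or resized copies.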
Idea: show the most similar image from another class for every training class. Reviewing those pairs may reveal even more duplicates.
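A quick way to surface such candidates, given precomputed embeddings (a sketch; cosine similarity is assumed as the metric):

```python
import numpy as np

def cross_class_nearest(emb, labels):
    """For each image, find the most similar image from a *different* class.
    High-similarity pairs are candidates for duplicate/merged labels."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = emb @ emb.T
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    sims[same] = -np.inf                       # mask own class (incl. self)
    idx = sims.argmax(axis=1)                  # best cross-class neighbor
    return idx, sims[np.arange(len(emb)), idx]
```

Sorting all images by the returned similarity, descending, gives a review queue with the most suspicious pairs first.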
(for @bonlime)
Goal: for each query, find the matching image from train, if one exists. Then we can use all images of the same class to compare against the query results (not sure how exactly yet).
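A possible sketch of the query-to-train lookup, with an assumed similarity cutoff below which a query is treated as having no train match (the threshold value would need tuning):

```python
import numpy as np

def match_queries_to_train(q_emb, t_emb, t_labels, thresh=0.9):
    """For each query, return the class label of its nearest train image
    if the cosine similarity exceeds `thresh`, else -1 (no match)."""
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    t = t_emb / np.linalg.norm(t_emb, axis=1, keepdims=True)
    sims = q @ t.T
    nn = sims.argmax(axis=1)                       # nearest train image
    best = sims[np.arange(len(q)), nn]
    labels = np.asarray(t_labels)[nn]
    return np.where(best >= thresh, labels, -1)
```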
genet_normal_384_light_arcface80_20_em Val: Acc@1 0.93915, mAP@10 0.9357, Target 0.9374, mAP@R 0.9266. LB: 0.655
genet_normal_384_light_arcface80_20 Val: Acc@1 0.9206, mAP@10 0.9176, Target 0.9191, mAP@R 0.9060
Webinar about image retrieval: color schemes other than RGB can boost performance. Poor contrast degrades detection quality. Histogram equalization can make objects more visible; local equalization > global.
Gradient computation with standard methods: greyscale, RGB,
Feature detection under reflections, shadows, and 3D structure.
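Global histogram equalization, one of the techniques mentioned, can be sketched in plain numpy; the local variant the webinar prefers (e.g. CLAHE) applies the same idea per tile with clipping, and libraries like OpenCV ship both:

```python
import numpy as np

def equalize_hist(img):
    """Global histogram equalization for a non-constant uint8 greyscale
    image: stretch the CDF of pixel intensities to cover 0..255."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()                 # first occupied bin
    lut = (cdf - cdf_min) * 255 / (cdf[-1] - cdf_min)
    return lut.clip(0, 255).astype(np.uint8)[img]
```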
Started training baseline models:
Results aren't good; the loss is decreasing very slowly. For now only the loss is tracked, metrics aren't measured yet.
There may be a bug in the code, so I'll first try to learn a plain classification task with CrossEntropy loss and see if it's possible to learn anything at all. [UPDATE WITH RESULTS]
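A minimal sanity check of exactly this kind: a linear softmax classifier trained with cross-entropy on toy separable data. Pure numpy, so it's independent of the real pipeline; all sizes, the data generator, and the learning rate are arbitrary choices for illustration. If even this loss doesn't drop, the training loop itself is broken:

```python
import numpy as np

def ce_sanity_check(n=200, d=16, classes=4, steps=200, lr=0.1, seed=0):
    """Gradient descent on softmax cross-entropy for a linear model.
    Returns the per-step training loss; it should decrease steadily."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, classes, n)
    X = rng.normal(size=(n, d))
    X[np.arange(n), y] += 2.0                 # shift coord y[i]: separable-ish
    Y = np.eye(classes)[y]                    # one-hot targets
    W = np.zeros((d, classes))
    losses = []
    for _ in range(steps):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        losses.append(-np.log(p[np.arange(n), y]).mean())
        W -= lr * X.T @ (p - Y) / n           # CE gradient for linear model
    return losses
```

With zero weights the first loss is exactly log(num_classes); anything that stays near that value after a few hundred steps points at a bug.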