poor box width regression on text detection

travisCxy commented 1 week ago

Hello, thank you for your code. I am training a YOLOv9 model for document image layout detection. I get a good mAP on my validation set, but text detection sometimes produces poor width regression. Can you help me? (Attached image: 0cdd69db56714fbc89b8845eb3f6e11f_sm_yolov9)

ankandrew commented 5 days ago

Some questions:

Did you try an input resolution other than 640, e.g. a lower one such as 416?
How big (# samples) is your training data?
Which model are you using, and is it pre-trained on COCO (weights provided by the repo)?

Also, double check that mixup augmentation is not ruining your training. Check whether the augmented samples look the way you expect. Below is a script I use to visualize the augmentation:

https://github.com/ankandrew/yolov9/blob/8fecc650bebf7348a6372f43b668b344de070129/visualize_augmentation.py
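
For readers unfamiliar with mixup, here is a minimal sketch of the idea, not the code from the linked script or from the yolov9 repository. The function name `mixup`, the toy grayscale "pages", and the fixed blend ratio are assumptions made for illustration; real pipelines usually sample the ratio randomly. It shows that two images are averaged pixel by pixel while the boxes of both are kept, which on document pages can leave faint, overlapping text inside every box.

```python
def mixup(image_a, boxes_a, image_b, boxes_b, ratio):
    """Blend two same-sized grayscale images and keep the boxes of both.

    Hypothetical helper for illustration only; `ratio` is the weight of image_a.
    """
    blended = [
        [ratio * pa + (1.0 - ratio) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
    # Labels are simply concatenated: every box from both images is still a target.
    return blended, boxes_a + boxes_b


# Toy 2x4 "pages": 0 = ink, 255 = white paper. Boxes are (x1, y1, x2, y2).
page_a = [[255, 0, 0, 255], [255, 255, 255, 255]]
page_b = [[255, 255, 255, 255], [0, 0, 0, 0]]
boxes_a = [(1, 0, 3, 1)]  # short word on the first row of page_a
boxes_b = [(0, 1, 4, 2)]  # full-width line on the second row of page_b

mixed, mixed_boxes = mixup(page_a, boxes_a, page_b, boxes_b, ratio=0.5)
for row in mixed:
    print(row)
print(mixed_boxes)
```

In the blended result the ink pixels of both pages are only half as dark as in the originals, yet both boxes remain as training targets, so the edges the model has to localize are weaker than in a clean page.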

travisCxy commented 3 days ago

@ankandrew hello

I am using a larger input size, 1024, to train my model, because the original document images are all high resolution.
I have 44,000 training samples, which I think is enough to train the model.
I am using yolov9-e and loading the weights pre-trained on COCO.

I checked my augmentation and you are right, I had not disabled mixup augmentation. I checked the augmentation using your script, then disabled mosaic and copy_paste, and I will train once more with the current settings. By the way, I have been reading the loss computation code. The bbox loss mainly focuses on IoU, and I doubt that an IoU loss is helpful for accurate bbox regression. So I changed the loss to an L1 loss, but I got a worse result. Do you have any idea why?
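
To make the IoU versus L1 comparison concrete, here is a small sketch using plain IoU and a plain L1 distance on pixel coordinates. It is not the loss code from the yolov9 repository; the helper names `iou_xyxy` and `l1_xyxy` and the example boxes are assumptions chosen for illustration. It compares a long, thin text-line box against a prediction that is 10% too narrow and one that is 10% too short.

```python
def iou_xyxy(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def l1_xyxy(a, b):
    """Sum of absolute coordinate differences, in pixels."""
    return sum(abs(p - q) for p, q in zip(a, b))


gt = (0.0, 0.0, 400.0, 20.0)          # long, thin text line
too_narrow = (0.0, 0.0, 360.0, 20.0)  # width is 10% too small
too_short = (0.0, 0.0, 400.0, 18.0)   # height is 10% too small

for name, pred in [("too_narrow", too_narrow), ("too_short", too_short)]:
    print(name, "IoU:", iou_xyxy(gt, pred), "L1:", l1_xyxy(gt, pred))
```

Both predictions have the same IoU (0.9), because IoU measures relative overlap, while the L1 distance is 40 pixels for the width error and 2 pixels for the height error. An unnormalized L1 loss therefore has a very different scale and weighting from an IoU loss, which may matter when swapping one for the other without retuning the loss weights. This is an observation about the two measures, not a diagnosis of the result reported above.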

WongKinYiu / yolov9

poor box width regression on text detection #518