mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0

Training advice #445

Closed by lamaeldo 1 year ago

lamaeldo commented 1 year ago

Hello @mittagessen, I have some questions regarding segmentation training:

- First, which metrics would you advise watching? I read #262 and #263, but I really struggle to make sense of the metrics. In all my attempts, accuracy and mean_accuracy skyrocket to values above 0.99 within a few epochs, whereas freq_iu rises gradually and mean_iu lags behind. Is mean_iu, then, the best metric to focus on?
- On the topic of the LR scheduler, do you have any information on which one usually performs best?
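For anyone else puzzling over the same four numbers, here is a minimal sketch of how these metrics are commonly defined in the semantic segmentation literature, computed from a pixel-level confusion matrix. It assumes kraken's accuracy, mean_accuracy, mean_iu and freq_iu follow those standard definitions, which this thread does not confirm. The function name and the toy confusion matrix are made up for illustration.

```python
import numpy as np

def segmentation_metrics(conf):
    """Standard pixel metrics from a confusion matrix.

    conf[i, j] = number of pixels whose true class is i and predicted class is j.
    These are the textbook definitions, assumed (not verified) to match kraken's.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)            # correctly labelled pixels per class
    gt = conf.sum(axis=1)         # true pixels per class
    pred = conf.sum(axis=0)       # predicted pixels per class
    total = conf.sum()
    present = gt > 0              # ignore classes absent from the ground truth
    iu = tp[present] / (gt[present] + pred[present] - tp[present])
    return {
        "accuracy": tp.sum() / total,
        "mean_accuracy": float(np.mean(tp[present] / gt[present])),
        "mean_iu": float(iu.mean()),
        "freq_iu": float((gt[present] / total * iu).sum()),
    }

# Hypothetical example: 9900 background pixels, 100 baseline pixels.
example = np.array([[9850, 50],
                    [40, 60]])
metrics = segmentation_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case the background class dominates, so accuracy and freq_iu sit close to 1 even though the rare class has an IoU of only 0.4. mean_iu weights every class equally, which is why it tends to lag when thin classes such as baselines are still poorly segmented.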

mittagessen commented 1 year ago
colibrisson commented 1 year ago

I usually fine-tune from the default model for a fixed number of epochs (50 for reasonably sized datasets) with a cosine schedule.

Do you think segmentation keeps improving even after the weighted IoU curve flattens out?
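To make the "fixed number of epochs with a cosine schedule" idea concrete, here is a rough sketch of the usual cosine annealing formula over 50 epochs. It is not kraken's own scheduler code; the function name, the starting learning rate of 1e-4 and the floor of 0 are placeholder assumptions picked for illustration only.

```python
import math

def cosine_lr(epoch, total_epochs=50, lr_max=1e-4, lr_min=0.0):
    """Cosine-annealed learning rate for a given epoch.

    lr_max and lr_min are hypothetical values, not kraken defaults.
    """
    # Fraction of training completed, clamped to 1 once past the last epoch.
    progress = min(epoch, total_epochs) / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

# Learning rate at the start of each epoch, plus the final value.
schedule = [cosine_lr(e) for e in range(51)]
for e in (0, 10, 25, 40, 50):
    print(f"epoch {e:2d}: lr = {schedule[e]:.2e}")
```

The rate starts at lr_max, decays slowly at first, fastest around the midpoint, and flattens out towards lr_min at the end. Because the curve depends on the total epoch count, the schedule only makes sense when the number of epochs is fixed in advance rather than decided by early stopping.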

lamaeldo commented 1 year ago

Thanks for the advice!

PonteIneptique commented 1 year ago

May I recommend turning this into a page in the docs?

mittagessen commented 1 year ago

@PonteIneptique @colibrisson @rohanchn If any of you feel like integrating all of this into the training notes, I'd gladly merge a pull request. I probably won't find the time in the next few days.