johnlockejrr opened this issue 1 month ago
Should I try to train it as center line rather than top line, which is the norm for Hebrew?
I don't think it will help. Use the API to improve the polygons by calculating the average line distance and extrapolating from there.
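The suggestion above could be sketched roughly as follows. This is only an illustration of the idea, not the actual kraken API: the function names, the data layout (baselines and polygons as lists of `(x, y)` points in image coordinates, y growing downward), and the 25% expansion factor are all assumptions.

```python
# Hypothetical sketch: compute the average vertical distance between
# consecutive baselines, then push each line polygon's bottom edge down
# by a fraction of that distance so diacritics below the line are kept.
# Function names and data layout are assumptions, not the kraken API.

def average_line_distance(baselines):
    """Mean vertical gap between consecutive baselines.

    Each baseline is a list of (x, y) points; we reduce it to its mean y
    and measure the gaps between successive lines, top to bottom.
    """
    ys = sorted(sum(y for _, y in bl) / len(bl) for bl in baselines)
    gaps = [b - a for a, b in zip(ys, ys[1:])]
    return sum(gaps) / len(gaps) if gaps else 0.0

def expand_polygon_down(polygon, delta):
    """Move the polygon's bottom-edge points down by delta pixels.

    A point counts as bottom-edge if its y (image coordinates, y grows
    downward) exceeds the polygon's mean y.
    """
    mid = sum(y for _, y in polygon) / len(polygon)
    return [(x, y + delta if y > mid else y) for x, y in polygon]

# Toy example: three baselines roughly 100 px apart and one line polygon.
baselines = [
    [(0, 100), (500, 102)],
    [(0, 200), (500, 201)],
    [(0, 300), (500, 299)],
]
polygons = [[(0, 80), (500, 80), (500, 110), (0, 110)]]

# Extrapolate the bottom boundary by 25% of the average inter-line gap.
delta = 0.25 * average_line_distance(baselines)
fixed = [expand_polygon_down(p, delta) for p in polygons]
```

In a real pipeline the baselines and boundary polygons would come from the segmenter's output, and the expansion factor would need tuning so adjacent lines do not overlap.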
It's not a problem with the dataset but with the model output. Use the API to do what? The model itself should perform better.
Maybe you are aware of a Hebrew segmentation model that can properly handle nikkud and cantillation?
I am trying to train a segmentation model for a modern printed Judaeo-Arabic dataset. The problem I face is that with the trained model I mainly lose the vowel signs below the line. What can be done? I have tried both training from scratch and finetuning.
Manual segmentation as ground truth:
Segmentation with the newly trained model (small dataset; preliminary):