Open prabhakar-sivanesan opened 3 years ago
@prabhakar-sivanesan: Is it detecting all the entities in your custom dataset? How many data samples did you pass to the model to get better results?
@ninjakx I was training for only 5 entities and I used about 70 samples with a 70/30 split. I was able to get better results with that.
@prabhakar-sivanesan Hi Prabhakar, would you let me know which annotation tool you used for preparing the custom dataset?
Hi, firstly thanks for the model; it worked very well on my custom dataset. But I have two questions about preparing the TSV data for training.
1) When three words belong to one entity, do all three words have to be annotated separately in the TSV file, or do they have to be combined into one?
For example, this is the data:
In the shipping address column, Kothuri Sai Kiran is a name. My OCR model returns these three words separately as Kothuri, Sai and Kiran. So while preparing the TSV file, can I annotate them as three different rows like this,
18,1009,490,1198,490,1198,553,1009,553,Kothuri,name
19,1206,495,1501,495,1501,552,1206,552,Sai,name
20,1619,501,1707,501,1707,560,1619,560,Kiran,name
or do all three words have to be combined into one row like this,
18,1009,490,1707,501,1707,560,1009,553,Kothuri Sai Kiran,name
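To make the second option concrete, here is a minimal Python sketch of how the combined row could be built from the three word-level rows. It assumes the eight numbers in each row are the four box corners in the order top-left, top-right, bottom-right, bottom-left, and that the text itself contains no commas. `merge_rows` is a hypothetical helper written for this example, not part of the model's code.

```python
def merge_rows(rows):
    """Merge word-level rows (ordered left to right) into one row.

    Assumed row layout: index,x1,y1,x2,y2,x3,y3,x4,y4,text,label
    with corners ordered top-left, top-right, bottom-right, bottom-left.
    """
    parsed = [[field.strip() for field in row.split(",")] for row in rows]
    first, last = parsed[0], parsed[-1]

    index = first[0]            # keep the index of the first word
    top_left = first[1:3]       # left edge comes from the first word
    top_right = last[3:5]       # right edge comes from the last word
    bottom_right = last[5:7]
    bottom_left = first[7:9]
    text = " ".join(p[9] for p in parsed)
    label = first[10]           # all words are assumed to share one label

    fields = [index, *top_left, *top_right, *bottom_right, *bottom_left, text, label]
    return ",".join(fields)


rows = [
    "18,1009,490,1198,490,1198,553,1009,553,Kothuri,name",
    "19,1206,495,1501,495,1501,552,1206,552,Sai,name",
    "20,1619,501,1707,501,1707,560,1619,560,Kiran,name",
]
merged = merge_rows(rows)
print(merged)
```

With the three rows above this gives the same fields as the combined row. Whether the model expects word-level or merged rows is still the open question here.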
2) If you look at the billing address column, I have the same name, Kothuri Sai Kiran. Is it possible to tag this name with the same entity "name"? In a nutshell, can I have multiple OCR results tagged with one entity in a single image file?
Looking forward to your response.