zzzDavid / ICDAR-2019-SROIE

ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction
MIT License

Performance for higher number of classes in classification task 3 #12

Closed NISH1001 closed 4 years ago

NISH1001 commented 4 years ago

Hi, I ran the bare-bones code and, as mentioned, it gave good results. However, when I tried it with my own dataset (in the same data format the code expects), I could not get good performance. I have 14 classes. I have also adjusted the class weights for the cross-entropy loss and experimented a little with the embedding size, but couldn't get any decent results.

Is there anything that can be done to make it better? I have tried preprocessing and changing the hidden and embedding sizes, but none of these seems to give good results.

Has anyone tried it with more classes than the 5 mentioned? Nevertheless, I loved how seamless the code is; it ran right away without any problems.
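For readers following along, the "weight adjustment for cross-entropy loss" mentioned above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code from the repository or from the poster: the function names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical, and inverse-frequency weighting is only one common choice among several.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by total / (num_classes * count).

    Rare classes get larger weights; classes that never occur get 0.0.
    (Hypothetical helper, assumed for illustration only.)
    """
    counts = Counter(labels)
    total = len(labels)
    weights = []
    for c in range(num_classes):
        n = counts.get(c, 0)
        weights.append(total / (num_classes * n) if n else 0.0)
    return weights


def weighted_cross_entropy(logits, targets, weights):
    """Class-weighted cross-entropy, averaged over the total target weight."""
    total_loss = 0.0
    total_weight = 0.0
    for row, t in zip(logits, targets):
        # Numerically stable log-sum-exp of the logits.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        nll = log_z - row[t]  # negative log-likelihood of the true class
        total_loss += weights[t] * nll
        total_weight += weights[t]
    return total_loss / total_weight if total_weight else 0.0


# Toy example: class 1 is three times rarer than class 0.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels, num_classes=2)
loss = weighted_cross_entropy([[0.0, 0.0]] * 4, labels, weights)
```

With 14 imbalanced classes, weights like these can become very large for the rarest labels, so it may be worth inspecting them (or clipping them) before concluding that weighting does not help.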

patrick22414 commented 4 years ago

Thank you, @NISH1001! Unfortunately, I haven't tried this on a problem with a higher number of classes, so I can't offer many clues. If you are aware of any open datasets that would let me test it with more classes, maybe I can try it and see if I can find anything.

NISH1001 commented 4 years ago

@patrick22414 Sorry, I don't know of any open datasets. But I think the architecture needs improving to handle a higher number of classes. Or it might be that the dataset I tried had ambiguous data, for example invoices where the sender and receiver had similar fields.

What I did was treat each "sender.field_type" and "receiver.field_type" as a separate label. That might be one distinction the model couldn't learn to generalize with a simple LSTM. Maybe we could use CNN-based embeddings (to make use of the positions of characters)?

Just a thought. Anyway, thanks for open-sourcing this code. It was insightful.
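One way to check the sender/receiver ambiguity hypothesis from this comment is to separate errors where the model picks the right field type for the wrong party from all other errors. The sketch below is only an illustration under assumptions: the party and field names, the extra `other` label, and the helpers `build_label_index` and `error_breakdown` are all hypothetical and do not come from the repository or from the poster's dataset.

```python
from collections import Counter

# Hypothetical label components, assumed for illustration only.
PARTIES = ("sender", "receiver")
FIELD_TYPES = ("name", "address", "phone")


def build_label_index(parties, field_types, extra=("other",)):
    """Flatten "party.field_type" combinations into one label-to-index map."""
    labels = [f"{p}.{f}" for p in parties for f in field_types] + list(extra)
    return {label: i for i, label in enumerate(labels)}


def error_breakdown(true_labels, pred_labels):
    """Count correct predictions, party confusions, and all other errors.

    A "party confusion" is a prediction with the right field type but the
    wrong party, e.g. "sender.name" predicted as "receiver.name".
    """
    counts = Counter()
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            counts["correct"] += 1
            continue
        t_party, _, t_field = t.partition(".")
        p_party, _, p_field = p.partition(".")
        if t_field and t_field == p_field and t_party != p_party:
            counts["party_confusion"] += 1
        else:
            counts["other_error"] += 1
    return counts


label_index = build_label_index(PARTIES, FIELD_TYPES)
breakdown = error_breakdown(
    ["sender.name", "receiver.address", "other"],
    ["receiver.name", "receiver.address", "sender.phone"],
)
```

If most mistakes fall under `party_confusion`, that would suggest the difficulty lies in telling the two parties apart rather than in the number of classes as such.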