logan-markewich opened 1 year ago
not really though. The OCR seems really important
Yea I just gave up with V3. Been experimenting with LiLT and that works pretty well, but I wish they would publish a large version lol
Hello @logan-markewich, LiLT is not trained for DocVQA, is it?
@StalVars just gotta fine-tune it yourself :) It works pretty well, but I wish there was a large version
@logan-markewich , Thanks for the quick reply. May I ask how good is the anls score on dev/test with LiLT?
@StalVars i've been working with a custom dataset (DocVQA + a bunch of my own annotated data)
If I had to approximate it, I'd say LiLT is comparable to LayoutLMV2-base (maybe just a tiny bit worse). But, LiLT has a less restrictive license lol
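For anyone comparing numbers: ANLS (Average Normalized Levenshtein Similarity) is the standard DocVQA metric. Per question, the prediction is scored against each reference answer by 1 minus the normalized edit distance, the best match is kept, and scores below a threshold (0.5 in the challenge) are zeroed before averaging. A minimal pure-Python sketch (not from this thread):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance with a single rolling DP row."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i  # prev holds old dp[j-1]
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,            # deletion
                        dp[j - 1] + 1,        # insertion
                        prev + (a[i - 1] != b[j - 1]))  # substitution
            prev = cur
    return dp[n]


def anls(predictions, references, tau=0.5):
    """ANLS as used for DocVQA: best normalized similarity against any
    reference answer per question; similarities below tau count as 0."""
    total = 0.0
    for pred, refs in zip(predictions, references):
        best = 0.0
        for ref in refs:
            p, r = pred.strip().lower(), ref.strip().lower()
            if not p and not r:
                sim = 1.0
            else:
                sim = 1 - levenshtein(p, r) / max(len(p), len(r))
            best = max(best, sim)
        total += best if best >= tau else 0.0
    return total / len(predictions)
```

Exact matches score 1.0, near-misses (e.g. an OCR typo) score their similarity, and anything below the threshold contributes nothing, so ANLS is forgiving of small OCR errors but not of wrong answers.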
@logan-markewich , ok, thanks again for the quick response :)
@logan-markewich Could you please share the fine-tune code on the DocVQA dataset? Thanks a lot!
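Not the script from this thread, but fine-tuning a layout model like LiLT on DocVQA is usually framed as extractive QA over the OCR words: each gold answer is mapped to a start/end word index, and those positions supervise a question-answering head. A minimal sketch of that span-matching step (`find_answer_span` is a hypothetical helper, and real pipelines also need fuzzy matching for OCR noise):

```python
def find_answer_span(words, answer):
    """Locate the answer as a contiguous run of OCR words
    (case-insensitive exact match). Returns an inclusive
    (start_idx, end_idx) pair, or None if the answer does not
    appear verbatim in the OCR output."""
    answer_tokens = answer.lower().split()
    words_lower = [w.lower() for w in words]
    n = len(answer_tokens)
    if n == 0:
        return None
    for i in range(len(words_lower) - n + 1):
        if words_lower[i:i + n] == answer_tokens:
            return i, i + n - 1
    return None


# Example: OCR words from a receipt crop, answer "$12.50"
words = ["Total", "Amount:", "$12.50"]
span = find_answer_span(words, "$12.50")  # (2, 2)
```

Questions whose answer can't be matched to a span are typically dropped (or soft-matched) during training, which is one reason reported DocVQA numbers vary between reimplementations.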
I'm struggling as well to get good accuracy out of LayoutLMV3. Compared to V2, V3 seems much worse actually.
Did you ever get any better results?