Have you tried on FUNSD data set

wendilinplay commented 3 years ago

Hi, thanks for the great work and for sharing the code. I'm wondering whether you've tried PICK on the FUNSD dataset. FUNSD is always our baseline for benchmarking, and another reason I didn't use SROIE is that LayoutLMv2 showed a big jump in performance on FUNSD but not on SROIE. Right now, with all the default settings, I only get 0.60 F1 on FUNSD, so I suspect something is wrong. I'm reaching out in case you happen to have tested on FUNSD and can share the F1, so that I know what to aim for.

florianbussmann commented 2 years ago

These are my results on the FUNSD dataset using PICK with default settings, except that MAX_BOXES_NUM was increased to 220:

+----------+----------+----------+----------+----------+
| name     |      mEP |      mER |      mEF |      mEA |
+==========+==========+==========+==========+==========+
| answer   | 0.68379  | 0.704373 | 0.693929 | 0.704373 |
+----------+----------+----------+----------+----------+
| header   | 0.467577 | 0.373297 | 0.415152 | 0.373297 |
+----------+----------+----------+----------+----------+
| question | 0.601961 | 0.646316 | 0.62335  | 0.646316 |
+----------+----------+----------+----------+----------+
| overall  | 0.639153 | 0.660393 | 0.6496   | 0.660393 |
+----------+----------+----------+----------+----------+
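As a quick sanity check on the table, here is a minimal sketch that recomputes F1 from the reported precision and recall. It assumes that mEP, mER and mEF stand for entity-level precision, recall and F1 (the thread does not define them), and the helper name `f1` is just illustrative, not part of PICK.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (mEP, mER, mEF) copied from the table above.
# Assumption: these columns are precision, recall and F1.
reported = {
    "answer": (0.68379, 0.704373, 0.693929),
    "header": (0.467577, 0.373297, 0.415152),
    "question": (0.601961, 0.646316, 0.62335),
    "overall": (0.639153, 0.660393, 0.6496),
}

for name, (p, r, f) in reported.items():
    print(f"{name:<8} computed F1 = {f1(p, r):.6f}, reported mEF = {f:.6f}")
```

If the assumption holds, the computed values should agree with the mEF column up to rounding, which makes the overall 0.6496 directly comparable to the roughly 0.60 F1 mentioned in the first comment. Note also that mEA equals mER in every row of this table.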

wendilinplay commented 2 years ago

Yes, it seems to me that PICK can perform well on small, fairly image-rich documents but not on text-rich documents.

JzjSunshine commented 6 months ago

In my opinion, PICK is not suitable for the FUNSD dataset. PICK is designed to extract key-value pairs in a one-to-one correspondence.

wenwenyu / PICK-pytorch

Have you tried on FUNSD data set #81