ibm-aur-nlp / PubTabNet

Pubtabnet and Publaynet #11

Open michaelliu2 opened 3 years ago

michaelliu2 commented 3 years ago

Hey, are the table images in PubTabNet taken from pages in PubLayNet? I'm interested in images with table-cell annotations shown in the context of their original page, rather than just cropped table images.

ctensmeyer commented 3 years ago

They overlap slightly, but no, PubTabNet is not a subset of PubLayNet.

ajjimeno commented 3 years ago

I confirm what Christopher said. Both data sets are derived from PubMed Central, but there is no specific dependency between them.
