rockyzhengwu closed this issue 4 years ago
Hi, how did you download the dataset? I received the download link from the author, but I cannot download the dataset with either "wget" or a browser. Could you describe the method you used?
@lumiaomiao sorry, I can't remember about that
@rockyzhengwu The method we used cannot guarantee that all tables are labeled. Tables are detected and labeled automatically by code, so some errors may leave a few tables unlabeled.
We randomly sample 1,000 examples from the dataset and manually check the bounding boxes of tables. We observe that only 5 of them are incorrectly labeled, which demonstrates the high quality of this dataset.
For details, see our paper.
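For readers who want to see what the 5-in-1,000 figure implies, here is a minimal sketch of the arithmetic. It is not taken from the paper: the helper name `wilson_interval` and the choice of a Wilson score interval at roughly 95% confidence are assumptions made for illustration. It only covers the sampled bounding-box check described above, not tables that were never detected.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Approximate Wilson score interval for an error proportion.

    Hypothetical helper for illustration; z=1.96 corresponds to
    roughly 95% confidence.
    """
    p = errors / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Figures quoted in the comment above: 5 bad labels in 1,000 sampled examples.
errors, n = 5, 1000
rate = errors / n
low, high = wilson_interval(errors, n)

print(f"observed error rate: {rate:.1%}")
print(f"approx. 95% interval: {low:.2%} to {high:.2%}")
```

The observed rate is 0.5%, and the interval gives a rough sense of how far the rate for the whole dataset could plausibly differ from that sample estimate.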
Thanks
I found some problems in the data: some tables are not labeled. Here are two examples from Word.json.