wangwen-whu / WTW-Dataset

This is the official implementation for the WTW Dataset from "Parsing Table Structures in the Wild", covering table detection and table structure recognition.

Many annotation errors found when generating HTML from the XML annotations #7

Closed. BlackDriver closed this issue 2 years ago

BlackDriver commented 2 years ago

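Using each cell's start_row, start_col, end_row, and end_col in the XML as the reference, I generated HTML annotations and found that the row and column spans of cells are wrong in many images: cells in tables that should have no row or column spans are annotated as spanning multiple rows or columns. I confirmed that once this problem appears, it shows up across a whole series of similar images. For example: mit_google_image_search-10918758-d6cc32fbb935608d01f71d2c3daa7ebd6634aabb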
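For readers who want to reproduce this check, here is a minimal sketch of the conversion described above. It is not the script used in this issue. It assumes each cell has already been parsed from the XML into a dict with inclusive, 0-based `start_row`, `end_row`, `start_col`, and `end_col` values; the function names `cells_to_html` and `find_span_conflicts` are hypothetical, and the exact tag names in the WTW XML files may differ from these keys.

```python
from typing import Dict, List, Tuple

Cell = Dict[str, int]  # assumed keys: start_row, end_row, start_col, end_col


def cells_to_html(cells: List[Cell]) -> str:
    """Build an empty-content HTML table from cells with inclusive spans."""
    if not cells:
        return "<table></table>"
    # Group cells by the row in which they start.
    rows: Dict[int, List[Cell]] = {}
    for cell in cells:
        rows.setdefault(cell["start_row"], []).append(cell)
    n_rows = max(c["end_row"] for c in cells) + 1
    parts = ["<table>"]
    for r in range(n_rows):
        parts.append("<tr>")
        for cell in sorted(rows.get(r, []), key=lambda c: c["start_col"]):
            rowspan = cell["end_row"] - cell["start_row"] + 1
            colspan = cell["end_col"] - cell["start_col"] + 1
            attrs = ""
            if rowspan > 1:
                attrs += f' rowspan="{rowspan}"'
            if colspan > 1:
                attrs += f' colspan="{colspan}"'
            parts.append(f"<td{attrs}></td>")
        parts.append("</tr>")
    parts.append("</table>")
    return "".join(parts)


def find_span_conflicts(cells: List[Cell]) -> List[Tuple[int, int, int, int]]:
    """Return (row, col, first_cell_index, second_cell_index) for every
    grid position claimed by more than one cell."""
    seen: Dict[Tuple[int, int], int] = {}
    conflicts = []
    for idx, cell in enumerate(cells):
        for r in range(cell["start_row"], cell["end_row"] + 1):
            for c in range(cell["start_col"], cell["end_col"] + 1):
                if (r, c) in seen:
                    conflicts.append((r, c, seen[(r, c)], idx))
                else:
                    seen[(r, c)] = idx
    return conflicts


# Illustrative data only, not taken from the dataset.
example_cells = [
    {"start_row": 0, "end_row": 0, "start_col": 0, "end_col": 1},
    {"start_row": 1, "end_row": 1, "start_col": 0, "end_col": 0},
    {"start_row": 1, "end_row": 1, "start_col": 1, "end_col": 1},
]
print(cells_to_html(example_cells))
print(find_span_conflicts(example_cells))
```

A non-empty result from `find_span_conflicts` is one possible signal that a span is mis-annotated, since two cells cannot occupy the same grid position in a valid table.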

These annotation problems seem to have a large impact on both the adjacency-relation and TEDS evaluations.
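To illustrate why a single wrong span can change the adjacency-relation score, here is a simplified sketch. It is not the official evaluation code: cells are identified by their list index rather than by content, blank cells are not skipped, and when two cells claim the same grid position the first one wins. All of these are assumptions made for illustration, and the function name `adjacency_relations` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Cell = Dict[str, int]  # assumed keys: start_row, end_row, start_col, end_col


def adjacency_relations(cells: List[Cell]) -> Set[Tuple[int, int, str]]:
    """Return (cell_index, neighbour_index, direction) for each pair of
    distinct cells that touch horizontally or vertically in the grid."""
    grid: Dict[Tuple[int, int], int] = {}
    for idx, cell in enumerate(cells):
        for r in range(cell["start_row"], cell["end_row"] + 1):
            for c in range(cell["start_col"], cell["end_col"] + 1):
                grid.setdefault((r, c), idx)  # assumption: first claim wins
    relations = set()
    for (r, c), idx in grid.items():
        for dr, dc, direction in ((0, 1, "horizontal"), (1, 0, "vertical")):
            other = grid.get((r + dr, c + dc))
            if other is not None and other != idx:
                relations.add((idx, other, direction))
    return relations


def make_cell(r0: int, r1: int, c0: int, c1: int) -> Cell:
    return {"start_row": r0, "end_row": r1, "start_col": c0, "end_col": c1}


# Illustrative 2x2 table with no spans, and the same table with
# cell 1 wrongly annotated as spanning two rows.
correct = [make_cell(0, 0, 0, 0), make_cell(0, 0, 1, 1),
           make_cell(1, 1, 0, 0), make_cell(1, 1, 1, 1)]
wrong = [make_cell(0, 0, 0, 0), make_cell(0, 1, 1, 1),
         make_cell(1, 1, 0, 0), make_cell(1, 1, 1, 1)]

gt = adjacency_relations(correct)
pred = adjacency_relations(wrong)
matched = gt & pred
print(f"recall = {len(matched)}/{len(gt)}, precision = {len(matched)}/{len(pred)}")
```

In this toy case one wrong row span removes half of the ground-truth relations, which is consistent with the concern that annotation errors distort the reported scores.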

BangdongChen commented 2 years ago

@BlackDriver @wangwen-whu I also faced the same problem.

wangwen-whu commented 2 years ago

@BlackDriver, @BangdongChen Hi, there were errors in the test set, so we updated the XML files in September. You can download test-xml-revise.zip, which contains the latest XML files. We have also updated the results in this project accordingly.

BlackDriver commented 2 years ago

Thank you for your reply. I found the revised annotations in the link.

kasyoukin commented 2 years ago

test-xml-revise.zip still contains quite a few errors, for example 0abd8c28799d176daf5839a227811b035fbf10a3.jpg