This repository shares the largest annotated Forms dataset used in the paper "Document Structure Extraction using Prior based High Resolution Hierarchical Semantic Segmentation", accepted at ECCV 2020: Paper link
A part of the dataset is now available here: Data link. The dataset comprises 400 annotated forms with rich, fine-grained annotations for different levels of structure in a hierarchy: text, widgets, fields, tick-type choice groups, etc.
Please cite our papers if you use the data from the link above:
@inproceedings{sarkar2020document,
title={Document Structure Extraction Using Prior Based High Resolution Hierarchical Semantic Segmentation},
author={Sarkar, Mausoom and Aggarwal, Milan and Jain, Arneh and Gupta, Hiresh and Krishnamurthy, Balaji},
booktitle={European Conference on Computer Vision},
pages={649--666},
year={2020},
organization={Springer}
}
@inproceedings{aggarwal2020multi,
title={Multi-Modal Association based Grouping for Form Structure Extraction},
author={Aggarwal, Milan and Sarkar, Mausoom and Gupta, Hiresh and Krishnamurthy, Balaji},
booktitle={The IEEE Winter Conference on Applications of Computer Vision},
pages={2075--2084},
year={2020}
}
This dataset is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.