Information extraction when bounding boxes are not present (prediction on unseen data)

wenwenyu / PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

https://arxiv.org/abs/2004.07464

MIT License

553 stars, 191 forks

Issue #62

Closed. AbhayPadda closed this issue 3 years ago

AbhayPadda commented 3 years ago

Hi,

I have been exploring this model and found it useful for information extraction from scanned images. How can we make predictions on unlabelled data after the model has been trained?

By unlabelled data, I mean images where we do not have bounding box details.

Thanks in advance.

AbhayPadda commented 3 years ago

@wenwenyu It would be really great if you could help me with this.

wenwenyu commented 3 years ago

Hi,

The bounding boxes and transcripts of unlabelled data can be predicted by an OCR system, which means you need to train a text detection and recognition model. This is outside the scope of this paper.

Hope this helps.
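
To illustrate the reply above, here is a minimal sketch (not code from the repository) of turning OCR output into per-box lines that a key information extraction model could consume at prediction time. Several things are assumptions: the `OcrWord` tuple and the `to_pick_lines` helper are hypothetical names, the OCR engine itself is left out because the thread does not name one, and the line layout (`index,x1,y1,x2,y2,x3,y3,x4,y4,transcript`, corners clockwise from top-left, index starting at 1) should be checked against the repository's README and data examples before use.

```python
from typing import Iterable, List, Tuple

# Hypothetical OCR result: axis-aligned box (left, top, width, height) plus text.
# Any text detection + recognition system that yields boxes and strings would do.
OcrWord = Tuple[int, int, int, int, str]


def to_pick_lines(words: Iterable[OcrWord]) -> List[str]:
    """Convert OCR words to lines of the ASSUMED form
    index,x1,y1,x2,y2,x3,y3,x4,y4,transcript
    with corners listed clockwise starting at the top-left."""
    lines: List[str] = []
    index = 1  # assumption: boxes are numbered from 1
    for left, top, width, height, text in words:
        text = text.strip()
        if not text:
            continue  # skip boxes where recognition produced no text
        right, bottom = left + width, top + height
        # top-left, top-right, bottom-right, bottom-left
        coords = [left, top, right, top, right, bottom, left, bottom]
        lines.append(",".join([str(index), *map(str, coords), text]))
        index += 1
    return lines


if __name__ == "__main__":
    # Made-up OCR output for a receipt-like image, for illustration only.
    sample = [(10, 20, 30, 10, "TOTAL"), (50, 20, 40, 10, "12.50")]
    print("\n".join(to_pick_lines(sample)))
```

Each image would get one such file, with no entity labels, since those are what the trained model predicts. Transcripts that themselves contain commas may need extra handling, depending on how the loader splits each line.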

AbhayPadda commented 3 years ago

Thanks @wenwenyu for your reply.