Closed Sharathmk99 closed 4 years ago
Hi,
I tried using the GCN variant by Kipf et al. in a transductive setting for the kind of problem you are suggesting. I was able to cluster text objects into table area, contact information, and other such general classes. The problem with extracting specific fields like total amount and company name is that it generates high class imbalance, since there will only be one node belonging to the class total amount in a document at a time.
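To make the imbalance concrete, here is a minimal sketch of inverse-frequency class weighting, one common way to make a rare class count for more in the loss. It is not something used in the experiment described above; the function name `inverse_frequency_weights` and the label counts are hypothetical and only illustrate a single total amount node among many others.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight} with weight = N / (K * count).

    N is the number of nodes and K the number of classes, so a class
    that appears rarely gets a proportionally larger weight.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Hypothetical node labels for one invoice (assumed counts, not real data):
# a single "total_amount" node among 52 text objects.
labels = (
    ["other"] * 40
    + ["table_area"] * 8
    + ["contact_information"] * 3
    + ["total_amount"]
)
weights = inverse_frequency_weights(labels)
```

Weights like these could be passed to a weighted cross-entropy loss, though with a single positive node per document the signal may still be too weak to learn from.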
A possible solution to your problem statement could be to use a different Graph Net implementation such as GraphSAGE, which works in an inductive setting and can be trained on multiple documents of the same type instead of generating embeddings using a single invoice at a time, as in the case of Kipf et al. With GraphSAGE you can give your network examples of the class total amount across multiple documents as part of your training set and then generate embeddings for an unseen document.
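The inductive idea can be sketched in plain Python: a GraphSAGE-style mean-aggregation layer keeps one set of weights and applies it to any graph, so a document that was not seen during training still gets embeddings. This is a simplified illustration, not the reference GraphSAGE implementation: it omits neighbor sampling, normalization, and training, and the weight matrices, features, and graphs below are made-up values.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sage_layer(features, neighbors, W_self, W_neigh):
    """One mean-aggregation layer: relu(W_self h_v + W_neigh mean(h_u)).

    features:  {node: feature vector}
    neighbors: {node: list of neighbor nodes}
    """
    out = {}
    for v, h in features.items():
        nbrs = neighbors.get(v, [])
        # Isolated nodes aggregate a zero vector.
        agg = mean([features[u] for u in nbrs]) if nbrs else [0.0] * len(h)
        z = [a + b for a, b in zip(matvec(W_self, h), matvec(W_neigh, agg))]
        out[v] = [max(0.0, x) for x in z]
    return out


# Shared weights (assumed values; in practice these are learned).
W_self = [[1.0, 0.0], [0.0, 1.0]]
W_neigh = [[0.5, 0.0], [0.0, 0.5]]

# Document A: three nodes, as if it were part of the training set.
feats_a = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
nbrs_a = {0: [1, 2], 1: [0], 2: [0]}

# Document B: an unseen document with a different number of nodes.
feats_b = {0: [2.0, 0.0], 1: [0.0, 2.0]}
nbrs_b = {0: [1], 1: [0]}

emb_a = sage_layer(feats_a, nbrs_a, W_self, W_neigh)
emb_b = sage_layer(feats_b, nbrs_b, W_self, W_neigh)
```

Because the layer depends only on each node's features and its neighbors, the same weights work for both documents even though their graphs differ in size.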
You can use the code here for converting your document into a graph and then try out GraphSAGE or similar inductive approaches.
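For readers without access to that code, the following is a rough, hypothetical sketch of one way to turn text boxes into a graph. It is not the code referred to above: the `Box` type, the `build_graph` function, and the rule used (link each box to the nearest box to its right and the nearest box below it) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A text object with its bounding box; y grows downward."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def overlaps(a0, a1, b0, b1):
    """True if intervals [a0, a1] and [b0, b1] share more than a point."""
    return min(a1, b1) > max(a0, b0)


def build_graph(boxes):
    """Return sorted undirected edges (i, j) with i < j.

    Each box is linked to the nearest box to its right that shares its
    vertical extent, and the nearest box below that shares its
    horizontal extent.
    """
    edges = set()
    for i, a in enumerate(boxes):
        right = [(b.x0 - a.x1, j) for j, b in enumerate(boxes)
                 if j != i and b.x0 >= a.x1
                 and overlaps(a.y0, a.y1, b.y0, b.y1)]
        below = [(b.y0 - a.y1, j) for j, b in enumerate(boxes)
                 if j != i and b.y0 >= a.y1
                 and overlaps(a.x0, a.x1, b.x0, b.x1)]
        for candidates in (right, below):
            if candidates:
                _, j = min(candidates)  # smallest gap wins
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Made-up layout: two label/value pairs stacked vertically.
boxes = [
    Box("Invoice #", 0, 0, 50, 10),
    Box("12345", 60, 0, 100, 10),
    Box("Total", 0, 20, 50, 30),
    Box("99.00", 60, 20, 100, 30),
]
edges = build_graph(boxes)
```

The resulting edge list, together with per-box features, is the kind of input a GraphSAGE-style model would consume.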
Hope this helps!
@dhavalpotdar Thank you very much for the response and details. I'll try building a graph across multiple documents. If I have any questions, can I contact you?
It's my pleasure. And yes, you can contact me by email with any questions. I'll be glad to help.
Hi,
Amazing work on understanding structured documents. Is it possible to extract the actual value for a given entity? For example, invoice #, total amount, company name, etc.
Thank you