wenwenyu / PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
https://arxiv.org/abs/2004.07464
MIT License
553 stars 191 forks

Prediction #50

Closed compadrejavo closed 3 years ago

compadrejavo commented 3 years ago

I really like your PICK project and want to thank you for sharing it. I also have a question about the code.

I am trying to export the prediction results from a trained PICK model using the test.py script, but it only shows predictions taken from the transcripts in the .tsv file for the given image; it doesn't export any other predictions. So my question is: where are the predictions stored? I looked through the code but couldn't find the variable that contains them. I would be very grateful for any insight.

Oddly, test.py seems to just print whatever I put in the .tsv: I replaced the transcripts in a .tsv with "asda sdas" and got the same "asda sdas" in the .txt in the output folder.

Regards.

wenwenyu commented 3 years ago

Thank you for your interest in our work.

The main flow of the test script is BIO tags -> spans -> entities -> text segments. https://github.com/wenwenyu/PICK-pytorch/blob/11fb67703d19b585fc760345bf6f4c03ff11fa10/test.py#L77 The bio_tags_to_spans method filters out the outside (O) tags and computes the spans between the beginning (B) and inside (I) tags. The final prediction is the text segment corresponding to each entity. Hope this helps.
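The flow described above can be illustrated with a minimal sketch. This is a simplified stand-in, not the repository's code: `bio_to_spans` and `spans_to_entities` are hypothetical names, and the per-character tagging and the label names (`company`, `date`) are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# (entity label, (start, end)) with an inclusive end index
Span = Tuple[str, Tuple[int, int]]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Simplified stand-in for a BIO decoder: drop O tags, group B/I runs."""
    spans: List[Span] = []
    label, start = None, None
    for i, tag in enumerate(tags):
        prefix, _, name = tag.partition("-")
        continues = prefix == "I" and name == label
        # Close the open span when the current tag does not continue it.
        if label is not None and not continues:
            spans.append((label, (start, i - 1)))
            label, start = None, None
        if prefix == "B":
            label, start = name, i
        # An I- tag with no matching open span is ignored in this sketch.
    if label is not None:
        spans.append((label, (start, len(tags) - 1)))
    return spans


def spans_to_entities(tokens: List[str], spans: List[Span]) -> Dict[str, List[str]]:
    """Slice the input tokens with each span to get the entity text."""
    entities: Dict[str, List[str]] = {}
    for label, (start, end) in spans:
        entities.setdefault(label, []).append("".join(tokens[start:end + 1]))
    return entities


# Assumed example: one tag per character of the transcript.
tokens = list("ACME 2020")
tags = ["B-company"] + ["I-company"] * 3 + ["O"] + ["B-date"] + ["I-date"] * 3
spans = bio_to_spans(tags)
entities = spans_to_entities(tokens, spans)
```

In this sketch the entity text is sliced directly out of the input tokens, which is consistent with the output always mirroring the supplied transcripts: the tags select which parts of the transcript to keep, and the text itself comes from the input file.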

compadrejavo commented 3 years ago

Thanks a lot for your answer. I got an output from the spans variable, but it is not what I am looking for: that output is still tied to the .tsv file.

What I am looking for is a way to get model predictions from an image alone, without using external data from a .tsv file.

Regards.

ninjakx commented 3 years ago

@compadrejavo : Have you had any success getting predictions from an image alone?

compadrejavo commented 3 years ago

No. PICK requires a file with bounding boxes and transcripts in addition to the image. However, that file can easily be generated with an OCR model (Tesseract, ABBYY, etc.).
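As a rough sketch of that step, the function below turns OCR word boxes into boxes-and-transcripts rows. Everything here is an assumption to verify against the repository's README: the row layout (index, eight corner coordinates clockwise from the top-left, transcript, comma separated), the 1-based index, and the hypothetical name `ocr_words_to_rows`. The input dictionaries use `left`, `top`, `width`, `height`, and `text` keys, the kind of fields an OCR tool such as Tesseract can report per word.

```python
from typing import Dict, List


def ocr_words_to_rows(words: List[Dict]) -> List[str]:
    """Convert axis-aligned OCR word boxes into comma-separated rows.

    Assumed row layout: index, x1, y1, x2, y2, x3, y3, x4, y4, transcript.
    """
    rows: List[str] = []
    for word in words:
        text = str(word["text"]).strip()
        if not text:
            continue  # skip empty OCR detections
        x1, y1 = word["left"], word["top"]
        x2, y2 = x1 + word["width"], y1 + word["height"]
        # Corners clockwise from the top-left of the axis-aligned box.
        corners = [x1, y1, x2, y1, x2, y2, x1, y2]
        index = len(rows) + 1  # assumed 1-based row index
        rows.append(",".join([str(index)] + [str(c) for c in corners] + [text]))
    return rows


# Made-up OCR output for illustration only.
words = [
    {"left": 10, "top": 20, "width": 30, "height": 5, "text": "TOTAL"},
    {"left": 45, "top": 20, "width": 2, "height": 5, "text": "  "},
    {"left": 50, "top": 20, "width": 20, "height": 5, "text": "9.50"},
]
rows = ocr_words_to_rows(words)
```

The resulting rows can be written one per line to the file that accompanies each image. Transcripts that themselves contain commas may need extra care, depending on how the file is parsed.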

ninjakx commented 3 years ago

@compadrejavo : Did you try key information extraction using a GCN alone?