wenwenyu / PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
https://arxiv.org/abs/2004.07464
MIT License
553 stars, 191 forks

Not getting the expected result when using custom Dataset #70

Open ninjakx opened 3 years ago

ninjakx commented 3 years ago

I created around 50 data samples from a patent dataset, annotated with 3 entities: place, author, and patent_num.

The output I am getting, using "iob_tagging_type": "box_and_within_box_level":

The author entity is extracted in most of the test samples, but the other entities are not, and for a few test samples there is no prediction at all.

All the entities occur equally often (every document contains each of these entities).

Queries: 1) How can I improve the results? 2) Is it because the dataset is too small? 3) How many data samples are needed to get better results?

NeerajAI commented 3 years ago

Hi, can you please explain how to prepare the training data with 8 coordinates? My annotation returns 4 values: x, y, w, and h.

ninjakx commented 3 years ago

@NeerajAI: You can create the 8 coordinates (4 corner points) from x, y, w, h:

(x, y), (x, y+h), (x+w, y+h), (x+w, y)
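In case it helps, here is a minimal Python sketch of that conversion. It assumes (x, y) is the top-left corner of the box in image coordinates (y grows downward) and keeps the corner order listed above. The function name `xywh_to_corners` is only illustrative and is not part of the PICK code base. If your data format expects a different corner order, reorder the returned values accordingly.

```python
from typing import List


def xywh_to_corners(x: float, y: float, w: float, h: float) -> List[float]:
    """Expand an axis-aligned box (x, y, w, h) into 8 values (4 corner points).

    Assumption: (x, y) is the top-left corner and y increases downward.
    Corner order follows the comment above:
    (x, y), (x, y+h), (x+w, y+h), (x+w, y).
    """
    return [
        x, y,          # top-left
        x, y + h,      # bottom-left
        x + w, y + h,  # bottom-right
        x + w, y,      # top-right
    ]


# Example: a box at (10, 20) with width 30 and height 40
print(xywh_to_corners(10, 20, 30, 40))
# [10, 20, 10, 60, 40, 60, 40, 20]
```

The 8 values can then be written into each annotation row in place of the original x, y, w, h.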