box_level argument and model evaluation

wenwenyu / PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

https://arxiv.org/abs/2004.07464

MIT License

559 stars, 193 forks

box_level argument and model evaluation #51

Open n0ct4li opened 3 years ago

n0ct4li commented 3 years ago

Hello,

In data/Readme.md, you say about the entities folder: "if iob_tagging_type is set to box_level, this folder will not be used, then box_entity_types in file_name.tsv file of boxes_and_transcripts folder will be used as label of entity. otherwise, it must be provided." Does this mean the entities folder is not used during evaluation? If so, how can the model be evaluated without the ground truth for the entities? Or am I misunderstanding something?

Thanks

n0ct4li commented 3 years ago

If my model prediction is:

{'adress': '87', 'adress': 'Evergreen', 'adress': 'Toto'}

And in the entities folder, I have:

{'adress': '87 Evergreen Lala'}

How is the F1-score computed? In the article you say you use the same metric as EATEN (mEF), but I don't see a comparison between two texts anywhere in the code.

vulehoangphuc commented 3 years ago

If my model prediction is:

{'adress': '87', 'adress': 'Evergreen', 'adress': 'Toto'}

And in the entities folder, I have:

{'adress': '87 Evergreen Lala'}

How is the F1-score computed? In the article you say you use the same metric as EATEN (mEF), but I don't see a comparison between two texts anywhere in the code.

I have the same issue. Have you found a solution?

n0ct4li commented 3 years ago

After reading the code, I saw that training and evaluation use a span-based F1-score metric. I decided to use the "box_level" argument, so I don't have an entities folder.

A span-based metric means that, in my example, the spans "87" and "Evergreen" are correctly detected (true positives), "Toto" is a false positive span, and "Lala" is a false negative span.
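To make the counting concrete, here is a minimal sketch of a span-level F1 on this example. It is not the code from the PICK repository: the function name span_f1 is hypothetical, and it assumes each span can be identified by an (entity type, text) pair, whereas a real implementation would more likely match spans by their positions.

```python
from collections import Counter


def span_f1(predicted, gold):
    """Span-level precision, recall and F1 over (entity_type, span_text) pairs.

    Hypothetical helper for illustration; spans are matched by exact
    (type, text) equality, which is an assumption of this sketch.
    """
    pred_counts = Counter(predicted)
    gold_counts = Counter(gold)
    # True positives: spans that appear in both prediction and ground truth.
    tp = sum((pred_counts & gold_counts).values())
    fp = sum(pred_counts.values()) - tp  # predicted but not in ground truth
    fn = sum(gold_counts.values()) - tp  # in ground truth but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# The example from this thread, one span per box.
predicted = [("adress", "87"), ("adress", "Evergreen"), ("adress", "Toto")]
gold = [("adress", "87"), ("adress", "Evergreen"), ("adress", "Lala")]

scores = span_f1(predicted, gold)
print(scores)
```

With these inputs the sketch counts 2 true positives, 1 false positive and 1 false negative, so precision, recall and F1 are all 2/3.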

If you want to use the EATEN metric, which compares text to text, you must apply postprocessing to merge all the spans that PICK predicts for an entity into a single text. Postprocessing can also eliminate some false positive spans.
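A rough sketch of that kind of postprocessing, under my own assumptions: merge_spans and exact_match are hypothetical helpers, not functions from the PICK repository, and the spans are assumed to arrive in reading order. It only shows the text-to-text comparison for one document and does not reproduce how EATEN aggregates mEF over a dataset.

```python
from collections import defaultdict


def merge_spans(spans, keep=lambda entity_type, text: True):
    """Join the predicted spans of each entity type into one string.

    `keep` is an optional filter (hypothetical hook) that can drop
    spans judged to be false positives before merging.
    """
    merged = defaultdict(list)
    for entity_type, text in spans:
        if keep(entity_type, text):
            merged[entity_type].append(text)
    return {entity_type: " ".join(parts)
            for entity_type, parts in merged.items()}


def exact_match(predicted_entities, gold_entities):
    """Compare predicted and ground-truth text for every gold entity type."""
    return {entity_type: predicted_entities.get(entity_type, "") == gold_text
            for entity_type, gold_text in gold_entities.items()}


# The example from this thread.
spans = [("adress", "87"), ("adress", "Evergreen"), ("adress", "Toto")]
gold_entities = {"adress": "87 Evergreen Lala"}

merged = merge_spans(spans)
matches = exact_match(merged, gold_entities)
print(merged)
print(matches)
```

Here the merged prediction is "87 Evergreen Toto", which does not equal "87 Evergreen Lala", so the entity counts as wrong under a text-to-text comparison even though two of its three spans were correct. Passing a keep filter that drops "Toto" removes the false positive, but the entity still does not match because "Lala" is missing.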