microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

How to interpret text_predictions.txt #244

Open yuanhuang0825 opened 4 years ago

yuanhuang0825 commented 4 years ago

DATE: S-QUESTION TO: S-QUESTION FROM: S-QUESTION 83635935 O JUN O 05 O 97 O 02:00 O PM O LOEWS O CORP O P. O 1/ O 6 O LOEWS B-HEADER CORPORATION E-HEADER 667 O Madison O Avenue, O New O York, O N. O / O 10021 O 8087 O (212) O 545 O 2920 O Fax O (212) O 935 O 6801 O BARRY B-ANSWER HIRSCH E-ANSWER Senior B-QUESTION Vice I-QUESTION President E-QUESTION Secretary B-QUESTION & I-QUESTION General I-QUESTION Counsel E-QUESTION FAX B-HEADER CONFIDENTIAL E-HEADER June B-ANSWER 4, I-ANSWER 1997 E-ANSWER Dr. B-ANSWER Spears/ I-ANSWER A. I-ANSWER J. I-ANSWER Stevens/ I-ANSWER R. I-ANSWER Milstein E-ANSWER Barry B-ANSWER Hirsch E-ANSWER TOTAL B-QUESTION NUMBER I-QUESTION OF I-QUESTION PAGES I-QUESTION INCLUDING I-QUESTION THIS I-QUESTION COVER I-QUESTION SHEET- O 6 O IF O YOU I-QUESTION DO I-QUESTION NOT I-QUESTION RECEIVE I-QUESTION ALL I-QUESTION THE I-QUESTION PAGES, I-QUESTION PLEASE I-QUESTION CALL O CAROL B-ANSWER DOKTORSKI I-ANSWER AT I-ANSWER (212) I-ANSWER 545- I-ANSWER

  1. E-ANSWER OUR B-QUESTION FAX I-QUESTION NUMBER E-QUESTION (212) B-ANSWER 935 I-ANSWER 6801 E-ANSWER THIS O TRANSMISSION O IS O INTENDED O ONLY O FOR O THE O USE O OF O THIS O INDIVIDUAL O OR O ENTITY O TO O FROM O IT O IS O ADDRESSED. O AND O MAY O CONTAIN O INFORMATION O THAT O IS O PRIVILEGED O CONFIDENTIAL O AND O You O YOU O ARE O THAT O MY O DIRAYDOTIG O DISAISONICAR O OF O THIS O COMKONICATION O TA O ATRICTLY O PROHIBITED O HAVE O RECEIVED O TINCOFMONTTIOS O PYTHOND O ROU O ORIGDOL O YILSON O ATITE O ABOVE O ADDR359 O VIA O POSTAL O SERVICE O THANK O YOU. O

The problem is: how do I know that

DATE: S-QUESTION

corresponds to

June B-ANSWER 4, I-ANSWER 1997 E-ANSWER

In general, how can I tell which Question corresponds to which Answer?

Model I am using: LayoutLM

wolfshow commented 4 years ago

@yuanhuang0825 The original FUNSD dataset provides the linking information between questions and answers. Please refer to the FUNSD dataset.
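For context, the tags in the prediction output above appear to follow a BIOES-style scheme (B = begin, I = inside, E = end, S = single-token entity, O = outside). That marks where each question, answer, or header span starts and ends, but it carries no link between a question and its answer. A minimal sketch of grouping the (token, tag) pairs into spans, assuming that scheme; `group_entities` is a hypothetical helper, not part of the LayoutLM code:

```python
def group_entities(pairs):
    """Group (token, tag) pairs with B-/I-/E-/S- prefixes into (label, text) spans."""
    entities = []
    current_tokens, current_label = [], None

    def flush():
        # Close the span that is currently open, if any.
        nonlocal current_tokens, current_label
        if current_tokens:
            entities.append((current_label, " ".join(current_tokens)))
        current_tokens, current_label = [], None

    for token, tag in pairs:
        if tag == "O":
            flush()
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":
            # Single-token entity: emit it on its own.
            flush()
            entities.append((label, token))
        elif prefix == "B" or label != current_label:
            # Start a new span; this also tolerates an I-/E- tag with no preceding B-.
            flush()
            current_tokens, current_label = [token], label
            if prefix == "E":
                flush()
        else:
            # I- or E- tag continuing the open span.
            current_tokens.append(token)
            if prefix == "E":
                flush()
    flush()
    return entities


# A few tokens taken from the output posted in this thread.
pairs = [
    ("DATE:", "S-QUESTION"),
    ("83635935", "O"),
    ("June", "B-ANSWER"),
    ("4,", "I-ANSWER"),
    ("1997", "E-ANSWER"),
]
print(group_entities(pairs))
```

The result is a flat list of labelled spans; pairing a QUESTION span with an ANSWER span still needs the linking annotations from the dataset or a separate linking step.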

yuanhuang0825 commented 4 years ago

@wolfshow I can't find it. Is it in the paper? Can you show me? Thanks!

SandyRSK commented 4 years ago

Hi @yuanhuang0825 @wolfshow, I have a question. I am working on receipt understanding. I was able to produce the text.txt and text_box.txt files as input to the model for prediction. After running run_seq_labeling.py (with --do_predict), I got text_prediction.txt, but it contains only the first 16 lines with tags. I don't know why the remaining text is missing from it.

Kindly help me out with this. Thank you.

wolfshow commented 4 years ago

@yuanhuang0825 The information is in the JSON file of the dataset.
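As a rough illustration of what that looks like: each FUNSD annotation file holds a "form" list whose entries carry an "id", a "label", the "text", and a "linking" list of [from_id, to_id] pairs. The sketch below reads question-to-answer pairs from such a structure. The sample record is made up for illustration (its ids and boxes are not copied from the dataset), and `question_answer_pairs` is a hypothetical helper:

```python
import json


def question_answer_pairs(annotation):
    """Return (question_text, answer_text) pairs from one FUNSD-style annotation dict."""
    by_id = {entity["id"]: entity for entity in annotation["form"]}
    links = set()
    for entity in annotation["form"]:
        for src, dst in entity.get("linking", []):
            a, b = by_id.get(src), by_id.get(dst)
            if a is None or b is None:
                continue
            # The same link is usually listed on both entities, so deduplicate.
            if a["label"] == "question" and b["label"] == "answer":
                links.add((src, dst))
    return [(by_id[s]["text"], by_id[d]["text"]) for s, d in sorted(links)]


# Illustrative record in the FUNSD layout; the values are invented for this example.
sample = json.loads("""
{
  "form": [
    {"id": 0, "label": "question", "text": "DATE:",
     "box": [10, 10, 60, 25], "words": [], "linking": [[0, 1]]},
    {"id": 1, "label": "answer", "text": "June 4, 1997",
     "box": [70, 10, 160, 25], "words": [], "linking": [[0, 1]]},
    {"id": 2, "label": "header", "text": "FAX CONFIDENTIAL",
     "box": [10, 40, 160, 55], "words": [], "linking": []}
  ]
}
""")
print(question_answer_pairs(sample))
```

With a real file, `sample` would come from `json.load` on one of the dataset's annotation files. Note that these links exist only in the ground-truth annotations; the sequence labeling predictions do not output them.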

sreejith3534 commented 3 years ago

@SandyRSK Did you check whether train.log contains a warning like "Maximum sequence length exceeded: No prediction for '%s'."?
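If that warning is the cause, the words beyond the model's maximum sequence length received no prediction and are left out of the output file, which would explain a file that stops early. A minimal sketch for listing the dropped words, assuming a single document, one word per line in the input, and one "word tag" pair per line in the predictions; `words_without_prediction` is a hypothetical helper:

```python
def words_without_prediction(input_lines, prediction_lines):
    """Return the input words that have no corresponding line in the predictions."""
    # The first whitespace-separated field on each non-blank line is the word.
    words = [line.split()[0] for line in input_lines if line.strip()]
    predicted = [line.split()[0] for line in prediction_lines if line.strip()]
    # Predictions are expected to cover a prefix of the input words.
    if words[:len(predicted)] != predicted:
        raise ValueError("prediction lines do not align with the input words")
    return words[len(predicted):]


# Toy example: the third word was cut off and got no tag.
input_lines = ["TOTAL\tO\n", "12.50\tO\n", "THANK\tO\n"]
prediction_lines = ["TOTAL S-QUESTION\n", "12.50 S-ANSWER\n"]
print(words_without_prediction(input_lines, prediction_lines))
```

If words are being dropped this way, splitting the document into shorter pieces before prediction, or raising the script's maximum sequence length setting up to what the model supports, are the usual things to try.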