Open wilfreddesert opened 3 years ago
I have a similar problem. Please let me know if you have found a way to make PICK work with word-level annotations.
Hi @wilfreddesert, were you able to get answers to your question? I would really love to know how you dealt with word-level entities.
Hi @wenwenyu
I cannot wait to try your model with my data. It is a fairly large dataset of documents with various layouts, from which I would like to extract a set of key/value pairs.
I have a few questions though regarding the format of data for training:
If `some_field`'s value consists of 4 words, then you specify all 4 words as the label. Is this the only format possible? I use the Google Vision API to create text annotations, which produces word-level entities, so my initial idea was to label my data at the word level. Will this not work for PICK?
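To make the word-level versus segment-level distinction concrete, here is a minimal sketch of how word-level boxes could be merged into a single segment-level entry. It is only an illustration under several assumptions: the `(text, box)` word tuples, the helper names `merge_words` and `to_row`, the `address` entity label, and the comma-separated row layout (index, eight clockwise box coordinates, transcript, entity type) are all hypothetical and not confirmed as the format PICK expects. It also assumes all words sit on a single line.

```python
from typing import List, Tuple

# Hypothetical word record: (text, (x_min, y_min, x_max, y_max)).
Word = Tuple[str, Tuple[int, int, int, int]]


def merge_words(words: List[Word]) -> Tuple[List[int], str]:
    """Merge word-level boxes into one axis-aligned segment box and transcript."""
    if not words:
        raise ValueError("need at least one word")
    # Sort left to right so the transcript reads in order (single-line assumption).
    ordered = sorted(words, key=lambda w: w[1][0])
    x_min = min(w[1][0] for w in ordered)
    y_min = min(w[1][1] for w in ordered)
    x_max = max(w[1][2] for w in ordered)
    y_max = max(w[1][3] for w in ordered)
    # Corners clockwise from top-left: TL, TR, BR, BL.
    box = [x_min, y_min, x_max, y_min, x_max, y_max, x_min, y_max]
    text = " ".join(w[0] for w in ordered)
    return box, text


def to_row(index: int, box: List[int], text: str, entity: str) -> str:
    """Format one row; the column layout is an assumption, not a confirmed spec."""
    return ",".join([str(index), *map(str, box), text, entity])


# Two word-level annotations that belong to the same field value.
words = [("Street", (60, 10, 110, 30)), ("Main", (10, 12, 55, 28))]
box, text = merge_words(words)
row = to_row(1, box, text, "address")
print(row)
```

A transcript that itself contains commas would need escaping or a different delimiter, which this sketch does not handle.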
Another question relates to one of the sample files: https://github.com/wenwenyu/PICK-pytorch/blob/master/data/data_examples_root/boxes_and_transcripts/X00016469623.tsv
As far as I understand from the description, the first column is `id`, but why do all the values in the first column equal `1` in that file?

Thanks!