wenwenyu / PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
https://arxiv.org/abs/2004.07464
MIT License

How to predict box-level tag? #25

Closed tengerye closed 4 years ago

tengerye commented 4 years ago

Hi @wenwenyu, can your code output box-level tags instead of entity_name-text pairs?

tengerye commented 4 years ago

In your test.py, there is a snippet:

logits = output['logits']  # (B, N*T, out_dim)
new_mask = output['new_mask']
image_indexs = input_data_item['image_indexs']  # (B,)
text_segments = input_data_item['text_segments']  # (B, num_boxes, T)
mask = input_data_item['mask']
# List[(List[int], torch.Tensor)]
best_paths = pick_model.decoder.crf_layer.viterbi_tags(logits, mask=new_mask, logits_batch_first=True)

I have two questions about it:

  1. Do the logits have the same length as the total number of characters in the tsv file?
  2. Are the characters in the logits in the same order as the boxes in the tsv file?

Looking forward to your reply.

wenwenyu commented 4 years ago

What do you mean by predicting a box-level tag? I don't understand the question. Our method only predicts BIO tags.

  1. No. The maximum length of the logits is documents.MAX_BOXES_NUM * documents.MAX_TRANSCRIPT_LEN, both defined in data_utils/documents.py. Anything beyond the maximum length is truncated.

  2. No. The boxes are re-sorted top-down and left-right according to their coordinates, so their order may differ from the tsv file.
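
For readers who still want one tag per box, the two points above suggest a possible post-processing step. The following is only a rough sketch, not code from this repository. It assumes that the decoded per-character tags are concatenated in the sorted box order with no padding between boxes, that each box is a hypothetical (x, y, transcript) tuple, and that sorting by exact (y, x) is a good enough stand-in for the top-down, left-right ordering. The limit values are illustrative; the real ones are in data_utils/documents.py. The helper names sort_and_truncate and box_level_tags are made up for this example.

```python
from collections import Counter
from typing import List, Tuple

# Illustrative limits only; the real values live in data_utils/documents.py.
MAX_BOXES_NUM = 3
MAX_TRANSCRIPT_LEN = 5

# Hypothetical box layout for this sketch: (top-left x, top-left y, transcript).
Box = Tuple[float, float, str]


def sort_and_truncate(boxes: List[Box]) -> List[Box]:
    """Order boxes top-down then left-right, then apply both length limits."""
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    return [(x, y, text[:MAX_TRANSCRIPT_LEN]) for x, y, text in ordered[:MAX_BOXES_NUM]]


def box_level_tags(char_tags: List[str], boxes: List[Box]) -> List[str]:
    """Split a flat per-character BIO sequence by box length and majority-vote."""
    result, start = [], 0
    for _, _, text in boxes:
        chunk = char_tags[start:start + len(text)]
        start += len(text)
        # Drop the B-/I- prefix so "B-total" and "I-total" count as one entity.
        entities = [t.split("-", 1)[1] for t in chunk if t != "O"]
        result.append(Counter(entities).most_common(1)[0][0] if entities else "O")
    return result


boxes = sort_and_truncate([
    (10.0, 50.0, "12.50"),
    (10.0, 10.0, "ACME"),
    (60.0, 10.0, "2020-01-01"),
])
# Made-up decoded tags, one per kept character, in sorted box order.
char_tags = ["B-company"] + ["I-company"] * 3 + ["B-date"] + ["I-date"] * 4 + ["O"] * 5
tags = box_level_tags(char_tags, boxes)
```

Majority voting is just one way to collapse character tags into a box tag. It hides the case where a single box contains several entities, so it is worth checking whether that happens in your data before relying on it.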