How to predict box-level tag?

tengerye commented 4 years ago

Hi @wenwenyu, can your code output box-level tags instead of entity_name-text pairs?

tengerye commented 4 years ago

In your test.py, there is a snippet:

logits = output['logits']  # (B, N*T, out_dim)
new_mask = output['new_mask']
image_indexs = input_data_item['image_indexs']  # (B,)
text_segments = input_data_item['text_segments']  # (B, num_boxes, T)
mask = input_data_item['mask']
# List[(List[int], torch.Tensor)]
best_paths = pick_model.decoder.crf_layer.viterbi_tags(logits, mask=new_mask, logits_batch_first=True)

I have two questions about it:

Do the logits have the same length as the total number of characters in the tsv file?
Are the characters in the logits in the same order as the boxes in the tsv file?

Looking forward to your kind reply.

wenwenyu commented 4 years ago

What do you mean by predicting a box-level tag? I don't understand the question. Our method only predicts BIO tags.

No. The maximum length of the logits is documents.MAX_BOXES_NUM * documents.MAX_TRANSCRIPT_LEN, defined in data_utils/documents.py. Anything beyond the maximum length is truncated.
No. The boxes are not kept in their original order; they are sorted into the default top-to-bottom, left-to-right order according to their coordinates.
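
For readers who still want one label per box, a possible workaround (not part of PICK-pytorch) is to collapse the character-level BIO tags with a majority vote per box. The sketch below is only an illustration under assumptions: `box_level_tags` is a hypothetical helper, the decoded path is assumed to have already been mapped from tag ids to tag strings, the tags are assumed to be the box-by-box concatenation of valid characters in the sorted box order, and `box_lengths` is assumed to come from your own preprocessing. The entity names are made up for the example.

```python
from collections import Counter
from typing import List, Sequence


def box_level_tags(char_tags: Sequence[str], box_lengths: Sequence[int]) -> List[str]:
    """Collapse character-level BIO tags into one entity label per box.

    Assumption: char_tags is the flat, box-by-box concatenation of the tags
    of valid characters only, and box_lengths[i] is the number of valid
    characters in box i. Neither name comes from the PICK code base.
    """
    labels = []
    start = 0
    for length in box_lengths:
        segment = char_tags[start:start + length]
        start += length
        # Drop the "B-" / "I-" prefix; a plain "O" stays "O".
        entities = [t.split("-", 1)[1] if "-" in t else t for t in segment]
        if not entities:
            # Box fell entirely beyond the (possibly truncated) tag sequence.
            labels.append("O")
            continue
        # Majority vote over the characters of this box.
        labels.append(Counter(entities).most_common(1)[0][0])
    return labels


# Illustrative input: three boxes with 3, 2 and 3 valid characters.
tags = ["B-company", "I-company", "I-company", "O", "O", "B-date", "I-date", "I-date"]
print(box_level_tags(tags, [3, 2, 3]))
```

Because sequences longer than the maximum length are truncated, boxes whose characters were cut off receive no tags at all; this sketch labels them "O", which is a design choice rather than anything the model predicts.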

wenwenyu / PICK-pytorch

How to predict box-level tag? #25