herobd / dessurt

Official implementation for Dessurt
MIT License

Fine-Tuning on QA with Bounding Boxes #11

Open furkanpala opened 1 year ago

furkanpala commented 1 year ago

Hi,

Thank you for making this valuable project publicly accessible. I am trying to fine-tune Dessurt on receipt-like documents for the natural_q~ task. I would like to feed bounding boxes for each question and answer. However, I could not understand the format for bounding boxes. Judging by crop_transform.py, each bbox has 16 values. I understand that the first 8 values represent the coordinates of the 4 corners. Can you explain what the next 8 are used for? Is it one bbox with 8 values for the question and another bbox with the next 8 values for the answer? If not, can you also explain how I am supposed to feed the bboxes for the question and the answer separately?

Thanks for your time and effort.

herobd commented 1 year ago

The next 8 values are the midpoints of each side of the bounding box (the same bbox described by the preceding corner points). They should be derived automatically from the annotations and are only there to help with the cropping.
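
For illustration, here is a minimal sketch of how a 16-value bbox could be built from 4 corner points: 8 corner coordinates followed by 8 side-midpoint coordinates. The function names `midpoint` and `corners_to_16` are hypothetical, and the ordering used here (corners as top-left, top-right, bottom-right, bottom-left; midpoints as top, right, bottom, left) is an assumption, not taken from the repository. Check crop_transform.py and the dataset code for the order Dessurt actually expects.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def midpoint(a: Point, b: Point) -> Point:
    """Return the point halfway between a and b."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def corners_to_16(corners: Sequence[Point]) -> List[float]:
    """Flatten 4 corners plus the 4 side midpoints into 16 values.

    ASSUMPTION: corners are given as top-left, top-right, bottom-right,
    bottom-left, and midpoints are emitted as top, right, bottom, left.
    The real ordering used by Dessurt may differ.
    """
    if len(corners) != 4:
        raise ValueError("expected exactly 4 corner points")
    tl, tr, br, bl = corners
    mids = [
        midpoint(tl, tr),  # top side
        midpoint(tr, br),  # right side
        midpoint(br, bl),  # bottom side
        midpoint(bl, tl),  # left side
    ]
    values: List[float] = []
    for x, y in list(corners) + mids:
        values.extend([float(x), float(y)])
    return values


# Example: an axis-aligned box 10 wide and 4 tall.
box = corners_to_16([(0, 0), (10, 0), (10, 4), (0, 4)])
print(len(box))  # 16
```

Since the midpoints are fully determined by the corners, only the 4 corner points need to come from the annotations, which matches the reply above.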