Setting these suggestions as rules could make training and predicting more accurate and focused:
number of fixations = number of bounding boxes in the image.
each bounding box must always have 1 fixation only.
Take this example:
In a single image, you already have the bounding box of each object predicted using detectron, and now all you need is to predict the order of those objects.
You only need 1 fixation point in a bounding box to tell that it's next in the order.
Those suggestions will also solve the issue of having multiple fixations on a single object.
@ouyangzhibo Thank you for your hard work,
Setting these suggestions as rules could make training and predicting more accurate and focused:
Take this example:
In a single image, you already have the bounding box of each object predicted using detectron, and now all you need is to predict the order of those objects.
You only need 1 fixation point in a bounding box to tell that it's next in the order.
Those suggestions will also solve the issue of having multiple fixations on a single object.