cvlab-stonybrook / Scanpath_Prediction

Predicting Goal-directed Human Attention Using Inverse Reinforcement Learning (CVPR2020)
MIT License

General #2

Closed: ghost closed this issue 4 years ago

ghost commented 4 years ago

Why is max_traj_length set to 6? I am asking because I might have more than 30 text lines per page, and I need to predict their reading order.

ouyangzhibo commented 4 years ago

6 is roughly the average scanpath length in COCO-Search18, and it is longer than most of the scanpaths in the target-present trials of our dataset. Note that max_traj_length is set to prevent the generator from producing an infinitely long scanpath, which would hurt training efficiency.

Increasing max_traj_length does not necessarily reduce performance, because in our case the scanpath naturally stops when the target is found. I guess in your case you will probably need to design or learn some scanpath termination rule.
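
To illustrate how a length cap and a termination rule can work together, here is a minimal sketch. It is not the repository's code: `rollout`, `make_random_policy`, and `all_lines_visited` are hypothetical names, the random policy only stands in for a learned generator, and "stop once every text line has been visited" is just one assumed rule for a reading-order task.

```python
import random
from typing import Callable, List

Policy = Callable[[List[int]], int]
StopRule = Callable[[List[int]], bool]


def rollout(policy: Policy, should_stop: StopRule, max_traj_length: int) -> List[int]:
    """Generate a scanpath until the stop rule fires or the length cap is hit."""
    scanpath: List[int] = []
    while len(scanpath) < max_traj_length:
        scanpath.append(policy(scanpath))
        if should_stop(scanpath):
            break
    return scanpath


def make_random_policy(num_lines: int, seed: int = 0) -> Policy:
    """Hypothetical stand-in for a learned generator: picks a random unvisited line."""
    rng = random.Random(seed)

    def policy(history: List[int]) -> int:
        unvisited = [i for i in range(num_lines) if i not in history]
        return rng.choice(unvisited)

    return policy


def all_lines_visited(num_lines: int) -> StopRule:
    """Assumed termination rule: stop once every text line has been visited."""
    return lambda history: len(set(history)) == num_lines


num_lines = 30
# Cap above the number of lines: the termination rule ends the scanpath.
path = rollout(make_random_policy(num_lines), all_lines_visited(num_lines), max_traj_length=40)
# Cap of 6: the scanpath is truncated before the rule can fire.
capped = rollout(make_random_policy(num_lines), all_lines_visited(num_lines), max_traj_length=6)
```

With the cap set above the number of text lines, the rule decides where the scanpath ends; with the cap at 6, the scanpath is cut off after six fixations regardless of the rule.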

ghost commented 4 years ago

@ouyangzhibo Could you post a small sample of COCO-Search18 so that I can see the dataset structure and test training? If you can't, could you describe what my ground truth should look like for training: its structure, etc.?

ouyangzhibo commented 4 years ago

Thanks for the question. I have updated the README to include a description of the data format as well as a few samples from the dataset. Hope this helps!

ghost commented 4 years ago

Thanks