HarshayuGirase / Human-Path-Prediction

State-of-the-art methods for human trajectory forecasting. Contains code for papers published at ECCV 2020 and ICCV 2021.
MIT License

Some questions #32

Closed ThierryDeruyttere closed 2 years ago

ThierryDeruyttere commented 2 years ago

Hi,

Thanks for the ynet code. We have a question about the Gaussian heatmap template used to represent the ground truth. In the paper, you mention that the ground truth is represented as a heatmap with Gaussian components with a variance of 4 pixels, centered at the observed points. In the implementation, however, you take patches from a ground-truth template, which is essentially a large heatmap containing a Gaussian kernel of size 31. We don't think this is mentioned in the paper, so we were wondering what the reason for this discrepancy is.

ArcaneEmergence commented 2 years ago

Hi,

The template (called gt_template in the code) is a large heatmap (twice as wide as the largest scene image) with a Gaussian of variance 4 placed at its center.

We initialize this template before training and load it into GPU memory. During training we then take patches out of it to fit the ground-truth trajectory (so that the trajectory point is at the mode of the heatmap). We think this is more efficient, since the template only needs to be loaded into memory once, instead of creating a new ground-truth heatmap every time.
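To illustrate the patch idea, here is a minimal sketch, not the repository's actual code. It uses NumPy on the CPU instead of a GPU tensor, reads "variance 4" as sigma = 2, and the names `make_gt_template` and `crop_heatmap` are hypothetical. Because the template is twice the image size with its peak at the center, shifting the crop window puts the peak on any pixel of the image.

```python
import numpy as np


def make_gt_template(height, width, sigma=2.0):
    """Build a (2*height, 2*width) heatmap with a Gaussian peak at (height, width).

    sigma=2.0 is an assumption: it reads "variance 4" as sigma**2 == 4.
    """
    ys = np.arange(2 * height) - height  # row offsets from the peak
    xs = np.arange(2 * width) - width    # column offsets from the peak
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))


def crop_heatmap(template, x, y, height, width):
    """Cut a (height, width) patch whose maximum lies at pixel (x, y).

    Requires 0 <= x < width and 0 <= y < height.
    """
    return template[height - y : 2 * height - y, width - x : 2 * width - x]


# Build the template once, then reuse it for every ground-truth point.
H, W = 48, 64
template = make_gt_template(H, W)
heatmap = crop_heatmap(template, x=10, y=20, height=H, width=W)
peak = np.unravel_index(np.argmax(heatmap), heatmap.shape)  # (row, col) of the mode
```

Slicing returns a view into the template, so producing a heatmap for a new point does not recompute any Gaussian values.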

In our implementation, we first create a small 31x31 px heatmap with variance 4 and then zero-pad it to the final shape. This is purely for implementation reasons: it made some debugging easier at the beginning of the project. It would be better coding style to initialize the template directly with its final shape. The tiny difference in values (the padded region holds exact zeros where a full Gaussian would have very small values) should be negligible for performance.
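The size of that difference can be checked with a small sketch. This is an illustration under assumptions, not the repository's code: it uses NumPy, an unnormalized Gaussian, sigma = 2 (reading "variance 4" as sigma squared), and hypothetical names `gaussian_2d` and `padded_template`.

```python
import numpy as np


def gaussian_2d(height, width, sigma=2.0):
    """Unnormalized Gaussian centered at (height // 2, width // 2)."""
    ys = np.arange(height) - height // 2
    xs = np.arange(width) - width // 2
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))


def padded_template(height, width, kernel_size=31, sigma=2.0):
    """Place a small kernel_size x kernel_size Gaussian in the middle of a zero array."""
    kernel = gaussian_2d(kernel_size, kernel_size, sigma)
    top = height // 2 - kernel_size // 2
    left = width // 2 - kernel_size // 2
    template = np.zeros((height, width))
    template[top : top + kernel_size, left : left + kernel_size] = kernel
    return template


H, W = 200, 300
direct = gaussian_2d(H, W)      # Gaussian computed over the full template
padded = padded_template(H, W)  # 31x31 kernel, zeros everywhere else
max_diff = np.abs(direct - padded).max()
```

Under these assumptions the two templates agree exactly inside the 31x31 window. The largest value lost to the padding sits 16 pixels from the center along one axis, where the Gaussian equals exp(-16² / (2·4)) = exp(-32), so `max_diff` is far below anything that would affect training. A different sigma or normalization would change that number.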