Looking to Relations for Future Trajectory Forecast

Fujiry0 commented 4 years ago

Paper: http://openaccess.thecvf.com/content_ICCV_2019/papers/Choi_Looking_to_Relations_for_Future_Trajectory_Forecast_ICCV_2019_paper.pdf

Summary: Predict future trajectories of all objects (given past top-view image streams and past trajectories) as heatmaps, while taking into account interactions between objects.

Comment: The paper mostly focuses on explaining a complicated architecture, so following the description takes some concentration.

Details

Input: Top-view image sequences and past trajectories of all objects
Output: Multiple heatmaps, each encoding the future position of one object at a fixed future time (one map per future time)
Proposes a feature encoding module inspired by LSTM (I could be wrong, but I'd say it works like an attention mechanism).
- Comment: The paper seems to focus on explaining the architecture
Computing the feature for one object (F^K) is O(n²), so computing the features for all objects needs O(n³).
Uses Monte Carlo Dropout to make the network probabilistic, and computes the uncertainty (variance) from 5 samples drawn during inference.
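
To make the Monte Carlo Dropout step concrete, here is a minimal NumPy sketch. It is not the paper's network: the one-layer model, `mc_dropout_heatmaps`, the sigmoid output, the dropout rate and the array sizes are all assumptions for illustration. The only thing taken from the notes above is the idea of keeping dropout active at inference and drawing 5 samples.

```python
import numpy as np

def mc_dropout_heatmaps(features, weights, num_samples=5, drop_prob=0.5, seed=0):
    """Run `num_samples` stochastic forward passes of a toy one-layer model.

    features: (D,) input vector, weights: (D, H, W) linear map to a heatmap.
    Both are hypothetical stand-ins for the real network.
    """
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(num_samples):
        # Dropout stays active at inference: a fresh random mask per pass.
        mask = rng.random(features.shape) >= drop_prob
        dropped = features * mask / (1.0 - drop_prob)  # inverted-dropout scaling
        logits = np.tensordot(dropped, weights, axes=1)  # shape (H, W)
        # Assumed output layer: a sigmoid, so every pixel lies in [0, 1].
        samples.append(1.0 / (1.0 + np.exp(-logits)))
    return np.stack(samples)  # shape (num_samples, H, W)

rng = np.random.default_rng(1)
heatmaps = mc_dropout_heatmaps(rng.normal(size=16), rng.normal(size=(16, 8, 8)))
mean_map = heatmaps.mean(axis=0)  # per-pixel mean over the 5 samples
var_map = heatmaps.var(axis=0)    # per-pixel variance, used as the uncertainty
```

Because every pass uses a different mask, the 5 heatmaps differ, and their per-pixel spread is what gets read as uncertainty.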

Main architecture:

GRE: Gated Relation Encoder - encode input
TPN: Trajectory Prediction Network - predict heatmaps of future location
SRN: Spatial Refinement Network - make the heatmaps of each object more coherent through time

Github: None

jayakornv commented 4 years ago

This work uses multiple 2D heatmaps, each associated with one future time instant. To make the heatmaps at different times coherent, they designed a network structure to capture the dependency between them. Still, this does not provide a real probabilistic dependency.

Fujiry0 commented 4 years ago

I understand how they get the uncertainty (use Monte Carlo Dropout and take the variance of 5 trajectories), but I am not sure how they visualize the pixel-wise uncertainty (2nd row).

[Image: Screenshot from 2020-06-08 12-19-15]

jayakornv commented 4 years ago

I'm also not sure how they plot the confidence... it's actually very hard to see, so I can't make a good guess...

ryohachiuma commented 4 years ago

As far as I know, when a model predicts with MC dropout, the mean and variance can be computed for each pixel. So I think they just visualize the per-pixel variance of the output, and the final trajectory can be obtained from the averaged heatmap. It's like Bayesian SegNet. https://arxiv.org/pdf/1511.02680.pdf http://proceedings.mlr.press/v48/gal16.pdf

jayakornv commented 4 years ago

@kemangjaka I also thought they would plot a per-pixel variance, but it seems pretty strange once I think about it. They first sample 5 paths, and I think each path could be considered as 0/1 on each pixel it passes through. If that is the case, then they are basically treating each pixel as a binomial distribution with 5 trials, and the mean and variance obtained are statistics of that pixel-wise binomial distribution. So the associated uncertainty is the uncertainty of each pixel being occupied, not the uncertainty over possible path locations.
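
A small sketch of this binomial reading, under the assumption that each sampled path is rasterized to a 0/1 occupancy grid. The 4x4 grid, the five paths and `rasterize_path` are made up for illustration and are not from the paper.

```python
import numpy as np

def rasterize_path(path, shape):
    """Return a 0/1 occupancy grid with ones on the pixels the path visits."""
    grid = np.zeros(shape, dtype=float)
    rows, cols = zip(*path)
    grid[list(rows), list(cols)] = 1.0
    return grid

# Five hypothetical sampled paths on a 4x4 grid, as lists of (row, col) pixels.
paths = [
    [(0, 0), (1, 1), (2, 2)],
    [(0, 0), (1, 1), (2, 3)],
    [(0, 0), (1, 1), (2, 2)],
    [(0, 0), (1, 2), (2, 3)],
    [(0, 0), (1, 1), (2, 2)],
]
grids = np.stack([rasterize_path(p, (4, 4)) for p in paths])

occupancy = grids.mean(axis=0)  # fraction of the 5 paths that hit each pixel
variance = grids.var(axis=0)    # population variance across the 5 samples

# For 0/1 samples the per-pixel variance is exactly p * (1 - p).
assert np.allclose(variance, occupancy * (1.0 - occupancy))
```

In this toy case the shared start pixel (0, 0) has zero variance, while pixels hit by only some of the paths have nonzero variance. That is a statement about occupancy of each pixel, which is the distinction being made here.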

ryohachiuma commented 4 years ago

@jayakornv I thought that each pixel of the predicted heatmap represents the person's possible future position (location) in a probabilistic manner (just a value ranging from 0 to 1), and that the trajectory can be obtained by taking the argmax of the heatmap at each timestep.

And by running MC dropout five times, five heatmaps can be predicted for each timestep (H^1~H^5). At timestep t, we can compute the per-pixel mean and variance of these heatmaps:

m_{ij} = (1/5) \sum_{c=1}^{5} H^c_{ij}
var_{ij} = Var_{c=1,...,5}(H^c_{ij})
where ij denotes the pixel index.

The future position p^t can then be computed as p^t = argmax_{ij}(m_{ij}).
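
A minimal NumPy sketch of this reading: per-pixel mean and variance over five sampled heatmaps, then an argmax on the mean map. The Gaussian-blob heatmaps, the 9x9 grid and the function names are assumptions for illustration, not the paper's outputs.

```python
import numpy as np

def gaussian_heatmap(center, shape, sigma=1.0):
    """Toy heatmap in [0, 1]: an unnormalized Gaussian blob around `center`."""
    rows, cols = np.indices(shape)
    dist2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def summarize_samples(heatmaps):
    """heatmaps: (C, H, W) array of C sampled heatmaps for one timestep."""
    mean_map = heatmaps.mean(axis=0)  # per-pixel mean over the samples
    var_map = heatmaps.var(axis=0)    # per-pixel variance over the samples
    # argmax over the flattened mean map, converted back to (row, col).
    flat_index = np.argmax(mean_map)
    position = tuple(int(v) for v in np.unravel_index(flat_index, mean_map.shape))
    return mean_map, var_map, position

# Five hypothetical samples whose peaks scatter around pixel (4, 4).
centers = [(4, 4), (4, 5), (4, 4), (5, 4), (4, 4)]
heatmaps = np.stack([gaussian_heatmap(c, (9, 9)) for c in centers])

mean_map, var_map, position = summarize_samples(heatmaps)
print(position)
```

The variance map is nonzero wherever the five blobs disagree, so plotting `var_map` would give a per-pixel uncertainty image of the kind discussed above.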

So, I thought the uncertainty is the uncertainty of the person's position.

Sorry, maybe I didn't fully understand.

jayakornv commented 4 years ago

I think I understand it the same way as you... I think they actually plot the uncertainty the way you described. The problem is that with only 5 paths, it's unlikely that the heatmap will have a single maximum for the predicted position (say, for the position at T_max in the future). I believe they find the mean path by simply averaging the positions:

p^t = mean(p_1^t, p_2^t, ..., p_5^t) = (1/5) \sum_{c=1}^{5} p_c^t.
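
To see how the two candidate readouts can disagree, here is a small sketch comparing the mean of the per-sample positions with the argmax of the averaged heatmap. The one-hot heatmaps and peak locations are made up, and neither readout is confirmed to be what the paper does.

```python
import numpy as np

def argmax_position(heatmap):
    """Return the (row, col) of the largest value in a 2D heatmap."""
    return np.array(np.unravel_index(np.argmax(heatmap), heatmap.shape))

# Five hypothetical samples for one timestep: three peak on the left, two on the right.
peaks = [(2, 1), (2, 1), (2, 1), (2, 7), (2, 7)]
heatmaps = np.zeros((5, 5, 9))
for c, (i, j) in enumerate(peaks):
    heatmaps[c, i, j] = 1.0

# Readout A: take each sample's argmax position, then average the positions.
per_sample = np.stack([argmax_position(h) for h in heatmaps])  # shape (5, 2)
mean_of_positions = per_sample.mean(axis=0)

# Readout B: average the heatmaps first, then take a single argmax.
argmax_of_mean = argmax_position(heatmaps.mean(axis=0))

print(mean_of_positions, argmax_of_mean)
```

With these bimodal samples, readout A lands between the two modes while readout B picks the majority mode, so the two procedures can produce visibly different trajectories.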

Of course, the paper summarizes this whole procedure in a single, vague sentence: "We compute the variance of L = 5 samples to measure the uncertainty (second row) and their mean to output future trajectory (third row)." So it is pretty hard to guess what they actually did.
