abduallahmohamed / Social-STGCNN

Code for "Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction" CVPR 2020
MIT License

max_nodes = 88 #35

Closed · arsalhuda24 closed this issue 3 years ago

arsalhuda24 commented 3 years ago

Very interesting work! I have a question about the seq_to_nodes function: why is max_nodes set to 88? How do you compute this number? Thanks.

```python
import numpy as np

def seq_to_nodes(seq_, max_nodes=88):
    seq_ = seq_.squeeze()
    seq_len = seq_.shape[2]

    # Zero-padded node tensor: one (x, y) slot per node per time step.
    V = np.zeros((seq_len, max_nodes, 2))
    for s in range(seq_len):
        step_ = seq_[:, :, s]          # positions of all pedestrians at step s
        for h in range(len(step_)):    # copy each pedestrian into its node slot
            V[s, h, :] = step_[h]

    return V.squeeze()
```
abduallahmohamed commented 3 years ago

Hi, this number was used in the early stages of developing the model. The maximum number of pedestrians across the datasets in a single scene is 88. In the early trials, as far as I recall, I wanted to have a placeholder (zeros) and a fixed graph dimension. Also, this number doesn't affect the evaluation code. I pushed a fix to remove this and to have more optimized eval metrics.
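For context, here is a minimal sketch of how such a maximum could be computed from the data; the helper name and the assumption that each scene is stored as a (num_peds, 2, seq_len) array (matching the seq_to_nodes signature above) are illustrative, not part of the repository:

```python
import numpy as np

def max_pedestrians_per_scene(scenes):
    # scenes: iterable of per-scene arrays shaped (num_peds, 2, seq_len),
    # as assumed from the seq_to_nodes signature above (hypothetical helper).
    return max(scene.shape[0] for scene in scenes)

# With a fixed placeholder size, every scene tensor is zero-padded to
# (seq_len, max_nodes, 2); node slots beyond the actual pedestrian count
# simply stay at zero.
```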

arsalhuda24 commented 3 years ago

Thanks for the clarification. I used your code to train the model on the Lyft object detection dataset, but the predictions look very strange. I am using pixel coordinates, though. Do you have any idea what the possible reasons could be? I also tried different obs and pred lengths, but no luck. Social-STGCNN_lyft (blue: observed, green: ground truth, red: prediction, one out of 20 samples)

abduallahmohamed commented 3 years ago

Hi, the dataset I used is in meters, not pixels. Also, I'm not aware of the Lyft dataset, but you need to consider that the adjacency matrix kernel function is designed only for pedestrians; it might not work for other classes.
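For reference, the pedestrian kernel in the paper weights each pair of nodes by the inverse of their Euclidean distance. Below is a minimal sketch of that per-frame adjacency; the function name and the eps guard are illustrative, not the repository's exact code:

```python
import numpy as np

def frame_adjacency(positions, eps=1e-12):
    """Inverse-distance kernel adjacency for a single frame.

    positions: (num_peds, 2) array of (x, y) coordinates in meters.
    Returns an (num_peds, num_peds) matrix with a_ij = 1 / ||p_i - p_j||
    for i != j, and 0 on the diagonal or when two nodes coincide.
    """
    n = len(positions)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(positions[i] - positions[j])
            w = 1.0 / d if d > eps else 0.0
            A[i, j] = A[j, i] = w  # symmetric graph
    return A
```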


arsalhuda24 commented 3 years ago

Thanks, Abduallah. I actually prepared the Lyft data in exactly the same way as the UCY/ETH datasets; the only difference is that it is in pixel coordinates. So I was wondering: shouldn't the adjacency matrix kernel construct the graph in the same way it does for pedestrians? Or is it that pedestrians behave much more stochastically than vehicles, and that's why it may not capture accurate V2V interactions? I don't know, maybe I'm missing something.

I also tried changing the kernel function as described by equations 8 and 9 in the paper, but no luck.

Thanks

abduallahmohamed commented 3 years ago

I'm not sure about pixel coordinates vs. meters. But I don't think the current adjacency matrix design is suitable for cars, because you reason about the cars around you differently than about pedestrians. So a suitable kernel function could help a lot.

Thanks
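As an illustration only, swapping the kernel is a local change in the adjacency computation shown earlier. For example, one might try a Gaussian (RBF) kernel whose bandwidth is tuned to vehicle interaction ranges; this is a hypothetical alternative, not a kernel evaluated in the paper:

```python
import numpy as np

def gaussian_kernel(p_i, p_j, sigma=5.0):
    # Hypothetical alternative kernel: the weight decays smoothly with
    # distance; sigma (the arbitrary 5.0 here) sets the interaction range.
    d = np.linalg.norm(np.asarray(p_i) - np.asarray(p_j))
    return float(np.exp(-(d ** 2) / (2.0 * sigma ** 2)))
```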

arsalhuda24 commented 3 years ago

Thanks. I think it makes sense to change the adjacency matrix, because when I use eq. 8 and 9 the results are very different. I need to think about how to find a good kernel function.

Just curious: in the pedestrian case, does your work construct a graph between every pedestrian in the scene, or is there a fixed neighborhood? My understanding is that the graph is dynamic as the frames progress?

Do you think having a fixed frame for graph construction might help in the case of vehicles?

abduallahmohamed commented 3 years ago

It constructs a graph between every pedestrian in the scene, and the graph is indeed dynamic per observed scene. That is, within an observed scene the number of pedestrians stays the same, but not all scenes necessarily have the same number.
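In code terms, the graph size simply follows the scene. A sketch (illustrative shapes and names, not the repository's exact code) of stacking one adjacency matrix per observed frame, with num_peds fixed within a scene but free to vary across scenes:

```python
import numpy as np

def scene_graph(seq_, kernel):
    """Build one adjacency matrix per observed frame of a single scene.

    seq_:   (num_peds, 2, seq_len) positions; num_peds is fixed within the scene.
    kernel: callable mapping a (num_peds, 2) frame to an (num_peds, num_peds) matrix.
    Returns an array of shape (seq_len, num_peds, num_peds).
    """
    num_peds, _, seq_len = seq_.shape
    return np.stack([kernel(seq_[:, :, t]) for t in range(seq_len)])
```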
