Petrichor625 / HLTP

[IEEE TIV] Official PyTorch Implementation of "A Cognitive-Based Trajectory Prediction Approach for Autonomous Driving."

Is the code the wrong version for the paper? #7

Open · Night-to-sleep opened this issue 1 month ago

Night-to-sleep commented 1 month ago

Hello author, the model structure and training parameters in the code are inconsistent with those in the paper. The student model deteriorates as it is trained and cannot achieve the results described in the paper. Can you publish the correct parameters?

Gavin-Tao commented 1 week ago

I think the author's silence, and the fact that this work and their other works contain only README files, have already answered our questions.

Petrichor625 commented 1 week ago

Thank you for your patience and for following up with us. I understand your concerns, and I’m glad to let you know that we’ve now uploaded a fully updated version of the code, including all model parameters and weight files needed to replicate our results.

We genuinely appreciate your interest in our work and your understanding as we worked to make these resources available. If you have any further questions or need assistance with the new version, please don’t hesitate to reach out. We’re here to support you and ensure you have everything needed to explore and build upon our work.

Thank you again for your engagement and patience!

Night-to-sleep commented 1 week ago

Thank you very much for your reply, but I found that there are still many problems in the code. First, there are errors in your graph convolutional network code; second, the provided model does not use the shift window module. The paper says that 1.5 seconds of data are used to predict future trajectories, but the 1.5 seconds here are sampled from 3 seconds.

Night-to-sleep commented 1 week ago

The code that calls this function has been commented out; the structure of the model changes after uncommenting it, so the provided trained model weights should not be usable.

FranklinNotOld commented 1 week ago

> Thank you very much for your reply, but I found that there are still many problems in the code. First, there are errors in your graph convolutional network code; second, the provided model does not use the shift window module. The paper says that 1.5 seconds of data are used to predict future trajectories, but the 1.5 seconds here are sampled from 3 seconds.

Hi, the shift window module is implemented in `cutout(tensor, x, y, i, j, mode)` and `cutout_attention(tensor1, tensor2, x, y, corresponding=True, stride_x=8, stride_y=6)`.

Night-to-sleep commented 1 week ago

I see the code for this module, but this part of the code is not used.

FranklinNotOld commented 1 week ago

> The code that calls this function has been commented out; the structure of the model changes after uncommenting it, so the provided trained model weights should not be usable.

Regarding the model weights, we will check them, and if something is wrong, we'll update them later.

Petrichor625 commented 1 week ago

@Night-to-sleep The issue you’re encountering with the graph convolutional network code may be due to some environment-related configuration challenges. We’ve uploaded an environment setup overview file in the main folder to help guide you through the necessary configuration.

Petrichor625 commented 1 week ago

> Thank you very much for your reply, but I found that there are still many problems in the code. First, there are errors in your graph convolutional network code; second, the provided model does not use the shift window module. The paper says that 1.5 seconds of data are used to predict future trajectories, but the 1.5 seconds here are sampled from 3 seconds.

The data you’re referring to is from an experiment mentioned in our paper for the data-missing scenes. We created two types of reduced datasets to simulate missing observations. One method involves shortening the observation period, while the other reduces the sampling rate. What you're seeing is one of these approaches in action.
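For illustration, here is a minimal sketch of the two reduction strategies described above, assuming the observation history is a (timesteps, features) tensor recorded at 10 Hz; the function names, sampling rate, and shapes are illustrative assumptions, not values taken from the repository.

    import torch

    def shorten_history(hist, keep_seconds=1.5, hz=10):
        """Simulate missing observations by keeping only the most recent window."""
        keep_steps = int(keep_seconds * hz)
        return hist[-keep_steps:]

    def downsample_history(hist, factor=2):
        """Simulate missing observations by reducing the sampling rate."""
        return hist[::factor]

    # Example: a 3-second history at an assumed 10 Hz, with 5 features per frame.
    hist = torch.randn(30, 5)
    print(shorten_history(hist).shape)     # torch.Size([15, 5]) -> only the last 1.5 s
    print(downsample_history(hist).shape)  # torch.Size([15, 5]) -> 15 frames spanning all 3 s

The second function corresponds to the "1.5 seconds sampled from 3 seconds" behaviour discussed above: the window still covers 3 seconds, but only half the frames are kept.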

FranklinNotOld commented 1 week ago

> I see the code for this module, but this part of the code is not used.

The code related to shift window attention is already present (commented out) in the source code:

    # shift window attention
    # a = cutout_attention(query, keys, 32, 24)
    # a = self.weights(a)
    # a /= math.sqrt(self.encoder_size)
    # a = torch.squeeze(a, dim=-1)
    # a = torch.softmax(a, -1)
    # values = torch.tanh(values)
    # values = torch.matmul(a, values)

You can uncomment it and comment out the code for the original attention mechanisms.

Night-to-sleep commented 1 week ago

I'm sorry I didn't make myself clear. What I mean is that, in the code, the author's idea is to merge the graph structure of each sample in a batch into one large graph and then feed it into the network, but the corresponding edge index is not updated during this merge, which makes the implementation in the code wrong. The code only tiles the edge-index variable; when the graphs are merged, the position of each node changes, so a merged node may sit at position 1000 while your edge index only goes up to 39, which is a fatal error.
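For concreteness, a minimal sketch of the offsetting step being described here, assuming each sample's sub-graph has 40 nodes (indices 0..39); the helper name `merge_graphs` and all shapes are illustrative, not from the repository.

    import torch

    def merge_graphs(node_feats_list, edge_index_list):
        """Merge per-sample graphs into one batch graph, offsetting each edge index."""
        merged_feats, merged_edges, offset = [], [], 0
        for feats, edges in zip(node_feats_list, edge_index_list):
            merged_feats.append(feats)
            # Without adding `offset`, every sample's edges would still point at
            # indices 0..39, so only the first sample's nodes would ever take
            # part in message passing.
            merged_edges.append(edges + offset)
            offset += feats.size(0)
        return torch.cat(merged_feats, dim=0), torch.cat(merged_edges, dim=1)

    # Example with two samples of 40 nodes each.
    feats = [torch.randn(40, 5), torch.randn(40, 5)]
    edges = [torch.randint(0, 40, (2, 20)), torch.randint(0, 40, (2, 20))]
    x, edge_index = merge_graphs(feats, edges)
    print(x.shape, edge_index.max().item())  # torch.Size([80, 5]) and a max index up to 79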

FranklinNotOld commented 1 week ago

> I'm sorry I didn't make myself clear. What I mean is that, in the code, the author's idea is to merge the graph structure of each sample in a batch into one large graph and then feed it into the network, but the corresponding edge index is not updated during this merge, which makes the implementation in the code wrong. The code only tiles the edge-index variable; when the graphs are merged, the position of each node changes, so a merged node may sit at position 1000 while your edge index only goes up to 39, which is a fatal error.

This is a relatively complex and nuanced process, and it’s hard to explain clearly in just a few sentences. In short, the structure may appear to be disrupted, but it is actually arranged in a specific order, which allows the original arrangement to be restored in practice by reversing the operation. You may also refer to the paper and code from 'Convolutional Social Pooling for Vehicle Trajectory Prediction'.

Petrichor625 commented 1 week ago

> This is a relatively complex and nuanced process, and it’s hard to explain clearly in just a few sentences. In short, the structure may appear to be disrupted, but it is actually arranged in a specific order, which allows the original arrangement to be restored in practice by reversing the operation. You may also refer to the paper and code from 'Convolutional Social Pooling for Vehicle Trajectory Prediction'.

As mentioned, this approach is indeed nuanced and based on a specific ordering that allows the structure to be restored through reversing. In our initial work, we discussed this aspect in depth, but we welcome any alternative ideas you might have. If you’re able to explore a different method and achieve improved results experimentally, that would be fantastic!

Please feel free to experiment, and we’d love to hear about any findings or insights you gain. Thank you again for your interest and engagement with our work!

Night-to-sleep commented 1 week ago

I don't mean that the structure is broken and cannot be restored. The number of graph nodes input to the code is very large, far more than 10,000, but the maximum value in the edge index is only 39, which means that the node features beyond index 39 are not used in the graph convolution at all.

FranklinNotOld commented 1 week ago

> I don't mean that the structure is broken and cannot be restored. The number of graph nodes input to the code is very large, far more than 10,000, but the maximum value in the edge index is only 39, which means that the node features beyond index 39 are not used in the graph convolution at all.

It's worth noting that the 10,000-plus nodes are divided up by 39, not deleted after the 39th. You can try printing out the dimensions of the vectors.
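A minimal sketch of that reading, i.e. that the merged node tensor is a stack of fixed-size per-sample blocks that can be reshaped back rather than being truncated at index 39. The block size (40 nodes, so indices within a block run 0..39), batch size, and feature dimension are assumptions for illustration only.

    import torch

    batch_size, nodes_per_sample, feat_dim = 300, 40, 5
    merged = torch.randn(batch_size * nodes_per_sample, feat_dim)

    # "Reversing" the merge: recover the per-sample view instead of dropping nodes past 39.
    per_sample = merged.view(batch_size, nodes_per_sample, feat_dim)
    print(merged.shape)      # torch.Size([12000, 5])
    print(per_sample.shape)  # torch.Size([300, 40, 5])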

Night-to-sleep commented 1 week ago

Let's say there are 10,000 nodes and each node has five features; then the node matrix is (10000, 5). If 200 of those nodes are connected to each other, the edge matrix is (2, 200), and it holds the indices of the connected nodes. Now the maximum index is 39, which means that the nodes after the 39th are not connected to anything and are not used in the graph convolution.
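To make the concern concrete, here is a short check of the scenario described above; the node and edge counts are the assumed values from this example, not measured repository outputs.

    import torch

    num_nodes, feat_dim = 10_000, 5
    x = torch.randn(num_nodes, feat_dim)           # node matrix (10000, 5)
    edge_index = torch.randint(0, 40, (2, 200))    # edge matrix (2, 200); indices never exceed 39

    touched = torch.unique(edge_index).numel()     # distinct nodes that appear in any edge
    print(edge_index.max().item())                 # 39
    print(num_nodes - touched)                     # >= 9960 nodes never take part in message passing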

Night-to-sleep commented 1 week ago

So, at present, is the number of nodes in the paper really no more than 10,000, and are the nodes after the 39th simply not connected to each other?