IndefiniteBen opened this issue 6 months ago
Based on the highD dataset init (from `utils_highD.py`):

```python
def __init__(self, mat_file, t_h=30, t_f=50, d_s=2, enc_size=64, grid_size=(13, 3),
             n_lat=args['num_lat_classes'], n_lon=args['num_lon_classes'],
             input_dim=args['input_dim'], polar=args['pooling'] == 'polar'):
```
I put the `data` folder containing the tracks CSVs in the `dataset/HighD` directory. I will note the inputs in a table:
| _preprocess.m | utils_highD | Value | Notes |
|---|---|---|---|
| path | - | `data` | |
| historical_length | t_h | 30 | Number of samples before a tested point to include |
| future_length | t_f | 50 | Number of samples after a tested point |
| number_of_agents | - | | Number of adjacent vehicles to select |
| max_vertical_distance | d_s ??? | 2 | Distance within which to select adjacent vehicles |
| extra_feature_index | N/A | 26 | Column of tracks.csv to use as extra feature |
I am not sure what the other values should be. I will update this comment if I figure out more info.
I have tried the following, which at least does not result in a crash:

```matlab
HighD_preprocess('data', 30, 50, 10, 2, 26)
```
Edit: I found that `max_vertical_distance` sets the distance in each frame (read: timestep) within which adjacent vehicles are included. Therefore, does this return a subset of the adjacent vehicles selected by `number_of_agents`?
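If that reading is right, the selection could be sketched like this (a hypothetical illustration — the function and variable names are mine, not from the repo):

```python
# Hypothetical sketch: first filter vehicles in the frame by distance
# (max_vertical_distance), then keep at most number_of_agents of them.
def select_neighbors(ego_y, others, max_vertical_distance=2, number_of_agents=10):
    """others: list of (vehicle_id, y_position) tuples in the same frame."""
    nearby = [(vid, y) for vid, y in others
              if abs(y - ego_y) <= max_vertical_distance]
    nearby.sort(key=lambda p: abs(p[1] - ego_y))        # closest first
    return [vid for vid, _ in nearby[:number_of_agents]]

ids = select_neighbors(0.0, [(1, 0.5), (2, 3.0), (3, -1.5), (4, 1.9)])
# → [1, 3, 4]: vehicle 2 is filtered out by distance; the rest are kept,
#   since fewer than number_of_agents remain
```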
Thank you very much for your interest and recognition of our work. The research you referred to was conducted on a different server, and unfortunately, we are currently unable to access the most recent version of the code. Therefore, the code available here is an older version.
Fortunately, we have open-sourced the code for another one of our papers, which also includes the dataset. This work demonstrates improved results, and we believe it could be of great interest to you. (https://github.com/Petrichor625/HLTP)
Thank you once again for reaching out.
I understand that you don't have access to the code, but can you not remember what these variables mean?
Are my notes correct and do these values look reasonable considering your memory?
Thank you for the link to the other repo, but the only mention of HighD in that repo is in the trajectory-visualization and student-evaluation code, which seems to expect a `.mat` file. The repo refers to stdan for dataset processing, but the stdan repo only has a processing script for NGSIM.
So I have the same question: how do I process the HighD data into your `.mat` file format?
Is it also possible to train HLTP on the HighD dataset? Thank you
Sorry for the delay; I've been pretty busy lately and couldn't respond promptly.
Since the code and dataset don't match, I suggest you contact HLTP for replication. The HighD dataset requires an online application, and according to the official agreement, we can't provide it directly. You'll have to apply for it yourself, and the processing will be aligned with stdan, as detailed in HLTP's Issue and Readme sections. I've provided instructions there. We'll provide the correct code version in the future. For quick replication, please refer directly to HLTP's code; we've detailed it and provided weight files.
I have applied for and received the HighD data. That was never the issue. I am in no way asking for you to share highD data.
I have seen that the HLTP readme mentions:

> The NGSIM and HighD datasets in our work are segmented in the same way as the work stdan
So I looked at the stdan repo, but I don't see any documentation or script for performing any kind of processing or segmentation. Can you link to the specific file or folder where it is documented?
Thank you!
The readme file says to put the data in the same folder as `HighD_preprocess.m` and then run this script. However, this script is a function with a number of required, undocumented inputs. This is the definition of the function in this file:

As can be seen, the function name does not match the file name (which causes warnings), but the bigger problem is the number of required inputs that are not documented.
When running with insufficient inputs I get this error:
What do these inputs change? What are their units? What values should be used in the process of replicating the paper results?