IndefiniteBen opened this issue 6 months ago
Based on the highD dataset init (from `utils_highD.py`):

```python
def __init__(self, mat_file, t_h=30, t_f=50, d_s=2, enc_size=64, grid_size=(13, 3),
             n_lat=args['num_lat_classes'], n_lon=args['num_lon_classes'],
             input_dim=args['input_dim'], polar=args['pooling'] == 'polar'):
```
I put the `data` folder containing the tracks CSVs in the `dataset/HighD` directory. I will note the inputs in a table:
| _preprocess.m | utils_highD | Value | Notes |
|---|---|---|---|
| path | - | `data` | |
| historical_length | t_h | 30 | Number of samples before a tested point to include |
| future_length | t_f | 50 | Number of samples after a tested point |
| number_of_agents | - | | Number of adjacent vehicles to select |
| max_vertical_distance | d_s ??? | 2 | Distance within which to select adjacent vehicles |
| extra_feature_index | N/A | 26 | Column of tracks.csv to use as extra feature |
I am not sure what the other values should be. I will update this comment if I figure out more info.
I have tried the following, which at least does not result in a crash:

```matlab
HighD_preprocess('data', 30, 50, 10, 2, 26)
```
Edit: I found that `max_vertical_distance` sets the distance in each frame (read: timestep) within which adjacent vehicles are included. Therefore, does this return a subset of the adjacent vehicles selected by `number_of_agents`?
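If that reading is right, the selection could be sketched like this (a hypothetical illustration — the function and variable names are mine, not from the repo):

```python
# Hypothetical sketch: first filter vehicles in the frame by distance
# (max_vertical_distance), then keep at most number_of_agents of them.
def select_neighbors(ego_y, others, max_vertical_distance=2, number_of_agents=10):
    """others: list of (vehicle_id, y_position) tuples in the same frame."""
    nearby = [(vid, y) for vid, y in others
              if abs(y - ego_y) <= max_vertical_distance]
    nearby.sort(key=lambda p: abs(p[1] - ego_y))        # closest first
    return [vid for vid, _ in nearby[:number_of_agents]]

ids = select_neighbors(0.0, [(1, 0.5), (2, 3.0), (3, -1.5), (4, 1.9)])
# → [1, 3, 4]: vehicle 2 is filtered out by distance; the rest are kept,
#   since fewer than number_of_agents remain
```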
Thank you very much for your interest and recognition of our work. The research you referred to was conducted on a different server, and unfortunately, we are currently unable to access the most recent version of the code. Therefore, the code available here is an older version.
Fortunately, we have open-sourced the code for another one of our papers, which also includes the dataset. This work demonstrates improved results, and we believe it could be of great interest to you. (https://github.com/Petrichor625/HLTP)
Thank you once again for reaching out.
I understand that you don't have access to the code, but can you not remember what these variables mean?
Are my notes correct and do these values look reasonable considering your memory?
Thank you for the link to the other repo, but the only mention of HighD in that repo is in the trajectory-visualization and student-evaluation code, which seems to expect a `.mat` file. The repo refers to stdan for dataset processing, but the stdan repo only has a processing script for NGSIM.
So I have the same question: how do I process the HighD data into your `.mat` file format?
Is it also possible to train HLTP on the HighD dataset? Thank you
Sorry for the delay; I've been pretty busy lately and couldn't respond promptly.
Since the code and dataset don't match, I suggest you contact HLTP for replication. The HighD dataset requires an online application, and according to the official agreement, we can't provide it directly. You'll have to apply for it yourself, and the processing will be aligned with stdan, as detailed in HLTP's Issue and Readme sections. I've provided instructions there. We'll provide the correct code version in the future. For quick replication, please refer directly to HLTP's code; we've detailed it and provided weight files.
I have applied for and received the HighD data. That was never the issue. I am in no way asking for you to share highD data.
I have seen that the HLTP readme mentions:

> The NGSIM and HighD datasets in our work are segmented in the same way as the work stdan
So I looked at the stdan repo, but I don't see any documentation or script for performing any kind of processing or segmentation. Can you link to the specific file or folder where it is documented?
Thank you!
The readme file says to put the data in the same folder as `HighD_preprocess.m` and then run this script. However, this script is a function with a number of required, undocumented inputs. This is the definition of the function in this file:

As can be seen, the function name does not match the file name (which causes warnings), but the bigger problem is the number of required inputs that are not documented.
When running with insufficient inputs I get this error:
What do these inputs change? What are their units? What values should be used in the process of replicating the paper results?