JeffTheHacker / ContingenciesFromObservations


Questions regarding Dataset Creation, Model Selection and Evaluation Protocol #6

Open · diehlornodeal opened this issue 1 year ago

diehlornodeal commented 1 year ago

Hi,

First of all, thank you for open-sourcing your code and for the excellent work. I want to train and evaluate my own algorithms on your interactive benchmark. Since I need slightly different input modalities, I have to collect a dataset of my own. For a fair comparison, I would like to construct a dataset as close to yours as possible and perform the same evaluation. However, I have some questions:

  1. In the paper, you write: “... we identified 4 suitable locations, and used 1 of them as training locations and 3 as test locations.” Could you specify which location is used for training and which ones are used for testing?

  2. Each location has five different behavior types, which would result in only 5 distinct episodes if there is no variation/randomness besides the behavior type, correct? However, the uploaded dataset (https://drive.google.com/file/d/14-o8XZtqJnRRCPqX3gz-LJuOgBORcbXT/view) contains more episodes; for example, the folder Dataset/Left Turn/train contains 200 episodes. Could you please explain how these episodes differ? In particular, from my point of view episodes {1,…,5} are the same as {6,…,10}, aren’t they? Or do the sets of episodes ({1,…,5}, {6,…,10}, and so on) differ in the perturbation of the historical trajectories? Did you modify the seed parameter during data generation for training (https://github.com/JeffTheHacker/ContingenciesFromObservations/blob/82c41a9f467b05cf27e87410e488887812207de4/Experiment/Utils/prepare_data.py#L119)? I have put a toy sketch of what I mean below the list.

  3. How many hours of driving data did you collect for training in each scenario?

  4. Did you use the commented-out lines 97-109 of https://github.com/JeffTheHacker/ContingenciesFromObservations/blob/82c41a9f467b05cf27e87410e488887812207de4/Experiment/Utils/prepare_data.py#L98? If so, what was your intuition here?

  5. How did you perform model selection? Did you use an open-loop validation metric? A closed-loop control validation scenario? Or did you save different checkpoints of your model, evaluate them on the test scenarios, and report the best result in your paper? Otherwise, could you specify your model selection process? (The second sketch below the list shows the protocols I have in mind.)
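
To make item 2 concrete, here is a toy sketch of the two data-generation variants I have in mind. None of the names below come from `prepare_data.py`; `make_episode` and the perturbation model are placeholders I made up, and the only thing I take from the repo is that a seed exists at the linked line 119.

```python
import numpy as np

BEHAVIOR_TYPES = range(5)  # 5 behavior types per location, as I understand it

def make_episode(behavior_type, seed):
    """Stand-in for one generated episode: a nominal trajectory plus a
    random perturbation of the historical part (purely illustrative)."""
    rng = np.random.default_rng(seed)
    base = np.linspace(0.0, 10.0, 20) + behavior_type
    return base + rng.normal(scale=0.1, size=base.shape)

# Variant A: seed fixed across repeats -> episodes {1..5} are exact
# duplicates of episodes {6..10}.
fixed = [make_episode(b, seed=0) for _ in range(2) for b in BEHAVIOR_TYPES]
print(np.allclose(fixed[0], fixed[5]))    # True

# Variant B: seed advanced per repeat -> the perturbations differ.
varied = [make_episode(b, seed=r) for r in range(2) for b in BEHAVIOR_TYPES]
print(np.allclose(varied[0], varied[5]))  # False
```

My question is simply which of the two variants produced the 200 training episodes.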
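
Similarly, to make item 5 concrete, here is a rough sketch of the two selection protocols I am asking about. The metric, function names, and numbers are made up for illustration and are not taken from your code.

```python
import numpy as np

def open_loop_ade(pred, gt):
    """Average displacement error between predicted and ground-truth
    trajectories of shape (T, 2) -- one example of an open-loop metric."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

# Protocol (a): pick the checkpoint with the best open-loop validation score.
def select_by_open_loop(val_scores):
    # val_scores: {checkpoint_name: validation ADE}, lower is better
    return min(val_scores, key=val_scores.get)

# Protocol (b): evaluate every checkpoint in closed loop on the *test*
# scenarios and report the best one. This is the case I would like to
# rule in or out, since it affects how comparable my own numbers would be.
def select_by_test_success(test_success_rates):
    # test_success_rates: {checkpoint_name: closed-loop success rate}
    return max(test_success_rates, key=test_success_rates.get)

# Made-up example usage:
print(open_loop_ade(np.zeros((10, 2)), np.ones((10, 2))))            # ~1.414
print(select_by_open_loop({"ckpt_10k": 0.42, "ckpt_20k": 0.37}))     # ckpt_20k
print(select_by_test_success({"ckpt_10k": 0.80, "ckpt_20k": 0.95}))  # ckpt_20k
```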

Thank you in advance for bringing clarity :)

Best, Chris