wayveai / mile

PyTorch code for the paper "Model-Based Imitation Learning for Urban Driving".
MIT License

Question about Training sequence structure #37


Shar-01 commented 6 months ago

Thanks for the excellent work!

I wanted to confirm my understanding of the sequence structure.

  1. In the paper, it is mentioned that a sequence of length T=12 is used for training. If I understand correctly, this 12-timestep sequence (let's say T_1 to T_12) is one training example? Within this 12-timestep sequence, the first 11 are observed and the last one is imagined? And this 12-timestep chain is trained like an RNN, as depicted in Figure 1 of the paper - is this correct? Accordingly, the next training sample would be T_2 to T_13, which would again be trained in the same recurrent manner, right?
  2. If the above holds, does the supervision take place at every step within one training sequence (training example), like a many-to-many RNN? That is, within the single training sequence T_1 to T_12, at T_1 the predicted output y1_hat is matched with the ground truth y1; similarly, at T_2, the predicted y2_hat is matched with its corresponding ground truth y2. Is that correct? (See the sketch below this list.)
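
A minimal sketch of the per-timestep supervision asked about in point 2, assuming dense prediction tensors of shape (batch, T, ...) and a plain MSE loss purely for illustration (the actual losses in the repo differ):

```python
import torch.nn.functional as F

def per_timestep_loss(predictions, targets):
    """Match each predicted y_t_hat with its ground truth y_t.

    predictions, targets: tensors of shape (batch, T, ...),
    one prediction and one label per timestep of the sequence.
    """
    total = 0.0
    T = predictions.shape[1]
    for t in range(T):  # supervision at every step, many-to-many
        total = total + F.mse_loss(predictions[:, t], targets[:, t])
    return total / T
```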

Thanks a lot!

anthonyhu commented 5 months ago

Thank you for your kind words!

To answer your questions:

  1. Yes, that's correct: [T_1, ..., T_12] would be the first valid sequence of a driving run. The second sequence would be [T_2, ..., T_13], and so on until we reach the end of the driving run. During training, however, we randomly sample training sequences from the entire dataset, so any training sequence from any driving run can be selected (a batch will not necessarily contain two consecutive sequences - in fact, it is highly unlikely).
  2. That's correct.
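
For illustration, here is a minimal sketch of how such overlapping length-12 windows could be indexed per driving run and then sampled randomly; the class and variable names are hypothetical and this is not the repo's actual dataset API:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class SlidingWindowSequences(Dataset):
    """Hypothetical dataset: one item = one length-T window of a driving run."""

    def __init__(self, runs, sequence_length=12):
        # `runs` is a list of driving runs, each a list of per-frame tensors.
        self.runs = runs
        self.sequence_length = sequence_length
        self.index = []
        for run_id, frames in enumerate(runs):
            # [T_1..T_12], [T_2..T_13], ... up to the end of the run;
            # windows never cross run boundaries.
            for start in range(len(frames) - sequence_length + 1):
                self.index.append((run_id, start))

    def __len__(self):
        return len(self.index)

    def __getitem__(self, i):
        run_id, start = self.index[i]
        frames = self.runs[run_id][start:start + self.sequence_length]
        return torch.stack(frames)  # shape (T, ...)

# shuffle=True draws windows from anywhere in the dataset, so a batch is
# unlikely to contain two consecutive windows of the same run:
# loader = DataLoader(SlidingWindowSequences(runs), batch_size=8, shuffle=True)
```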