wayveai / mile

PyTorch code for the paper "Model-Based Imitation Learning for Urban Driving".
MIT License

Confused about the Observation Encoder #33

Open wzn0828 opened 1 year ago

wzn0828 commented 1 year ago

Hi, many thanks for your excellent work on MILE.

However, I'm very confused about the Observation Encoder:

In your paper, you describe the observation embedding x_t as the concatenation of the image feature (after pooling to BEV), the route map feature, and the speed feature: x_t = [x_t', r_t, m_t].

That is to say, the order is: pooling to BEV → mapping to a 1D vector → concatenating the route map feature and speed feature.

But in the code, the order seems to be reversed: concatenating the route map feature and speed feature → pooling to BEV → mapping to a 1D vector. See the code in Mile.mile.models.mile.py.
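To make the two orderings concrete, here is a minimal PyTorch sketch. It is not the repository's code: the module `BevToVector`, all tensor sizes, and the choice to tile the route and speed vectors over the BEV grid before concatenation are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumptions, not MILE's configuration).
B, C_IMG, C_ROUTE, C_SPEED, H, W, D = 2, 16, 4, 4, 8, 8, 32


class BevToVector(nn.Module):
    """Hypothetical stand-in for a BEV backbone that outputs a 1D vector."""

    def __init__(self, in_channels: int, out_dim: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 32, kernel_size=3, padding=1)
        self.fc = nn.Linear(32, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.conv(x))
        # Global average pool over the BEV grid, then project to a vector.
        return self.fc(x.mean(dim=(2, 3)))


bev_image = torch.randn(B, C_IMG, H, W)  # image features already pooled to BEV
route = torch.randn(B, C_ROUTE)          # route map feature r_t
speed = torch.randn(B, C_SPEED)          # speed feature m_t

# Ordering described in the paper: BEV -> 1D vector -> concat.
paper_net = BevToVector(C_IMG, D)
x_paper = torch.cat([paper_net(bev_image), route, speed], dim=1)

# Ordering described for the code: concat -> BEV backbone -> 1D vector.
# Assumption: vector features are tiled over the grid to allow channel concat.
route_map = route[:, :, None, None].expand(-1, -1, H, W)
speed_map = speed[:, :, None, None].expand(-1, -1, H, W)
code_net = BevToVector(C_IMG + C_ROUTE + C_SPEED, D)
x_code = code_net(torch.cat([bev_image, route_map, speed_map], dim=1))

print(x_paper.shape, x_code.shape)
```

In both cases the result is a single embedding per sample. The difference is that in the second ordering the backbone sees the route and speed information alongside the spatial features, whereas in the first they are appended unchanged to the output vector.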

Am I misunderstanding something? Please let me know.

anthonyhu commented 1 year ago

Good catch. I think in practice this is a technicality that does not matter much (concatenating before or after the BEV backbone).