Hi, many thanks for your excellent work on MILE.
However, I'm confused about the Observation Encoder.
In the paper, the observation embedding x_t is described as the concatenation of the image feature (after pooling to BEV), the route map feature, and the speed feature: x_t = [x_t', r_t, m_t].
That is, the order is: pooling to BEV → mapping to a 1D vector → concatenating the route map feature and speed feature.
But in the code, the order seems to be different: concatenating the route map feature and speed feature → pooling to BEV → mapping to a 1D vector. See the code in Mile.mile.models.mile.py.
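To make the two orderings concrete, here is a toy sketch of what I mean. It is not the MILE code: every function name, the averaging "pool", the fixed "mapping", and the tiny shapes are my own hypothetical stand-ins, and I'm assuming r_t and m_t are 1D vectors that get tiled over the spatial grid when they are concatenated early.

```python
# Toy sketch only: these are NOT the MILE functions. All names, shapes and
# operations are hypothetical stand-ins that only show where the concat happens.

def pool_to_bev(feature_map):
    """Stand-in for BEV pooling: average each channel over all its cells."""
    pooled = []
    for channel in feature_map:
        cells = [v for row in channel for v in row]
        pooled.append(sum(cells) / len(cells))
    return pooled  # one value per channel


def to_vector(pooled, out_dim=2):
    """Stand-in for the learned mapping to a 1D vector: a fixed mix of all inputs."""
    total = sum(pooled)
    return [total * (i + 1) for i in range(out_dim)]


def broadcast(vector, height, width):
    """Tile each scalar of a 1D feature into its own height x width channel."""
    return [[[v] * width for _ in range(height)] for v in vector]


def paper_order(image, route, speed):
    # pooling to BEV -> mapping to a 1D vector -> concat route map and speed features
    return to_vector(pool_to_bev(image)) + route + speed


def code_order(image, route, speed):
    # concat route map and speed features -> pooling to BEV -> mapping to a 1D vector
    h, w = len(image[0]), len(image[0][0])
    stacked = image + broadcast(route, h, w) + broadcast(speed, h, w)
    return to_vector(pool_to_bev(stacked))


image = [[[1.0, 3.0], [5.0, 7.0]]]  # image feature: 1 channel, 2 x 2
route = [10.0]                       # route map feature r_t (assumed 1D)
speed = [2.0]                        # speed feature m_t (assumed 1D)

x_paper = paper_order(image, route, speed)
x_code = code_order(image, route, speed)
print(x_paper)  # [4.0, 8.0, 10.0, 2.0]
print(x_code)   # [16.0, 32.0]
```

In the first order, r_t and m_t appear unchanged at the end of x_t, which matches x_t = [x_t', r_t, m_t]. In the second, they are mixed into the image feature before the final vector is produced, so x_t is no longer a plain concatenation.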
I'm quite confused. Or am I misreading the code? Please let me know.