Closed Kin-Zhang closed 2 years ago
For the training routes, we take the list of waypoints generated by `world.get_map().generate_waypoints(2)`, randomly choose a start point and an end point from it, and project them onto the road using `world_map.get_waypoint(start_wp.transform.location, project_to_road=True, lane_type=carla.LaneType.Driving)`. We then interpolate the trajectory between the start point and the end point to compute its total length, and keep only the trajectories whose length is between 50 m and 300 m.
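The length filter described above can be sketched as follows. This is only an illustration under stated assumptions, not the script used for the paper: it works on plain `(x, y)` tuples instead of CARLA waypoints so that it runs without a simulator, and `interpolate` is a hypothetical callback standing in for whatever route interpolation is used. The names `route_length`, `keep_route`, and `sample_routes` are made up for this example.

```python
import math
import random

MIN_LEN_M = 50.0   # lower bound on route length, from the thread
MAX_LEN_M = 300.0  # upper bound on route length, from the thread


def route_length(points):
    """Sum of straight-line segment lengths along a dense polyline of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def keep_route(points, min_len=MIN_LEN_M, max_len=MAX_LEN_M):
    """True if the interpolated route length lies within [min_len, max_len]."""
    return min_len <= route_length(points) <= max_len


def sample_routes(waypoints, interpolate, n_routes, rng=random, max_tries=10000):
    """Randomly pair waypoints and keep the pairs whose interpolated route passes the filter.

    `interpolate(start, end)` is a hypothetical callback that returns a dense
    list of (x, y) points between the two endpoints.
    """
    routes = []
    for _ in range(max_tries):
        if len(routes) >= n_routes:
            break
        start, end = rng.sample(waypoints, 2)
        if keep_route(interpolate(start, end)):
            routes.append((start, end))
    return routes
```

With a real simulator, `waypoints` would hold the projected waypoint locations and `interpolate` would return the dense route between two of them.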
Thank you for your advice; we will add the network details in a future version. The measurement encoder is implemented as a single MLP applied to the concatenation of the inputs, and it encodes them into a vector of dimension 128. All the $\mathbf{j}$ vectors have dimension 256. The input is only one image. The future control predictions are made from the abstractions of the future world states estimated by the temporal module.
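As a rough illustration of the measurement encoder described above, here is a framework-free sketch of an MLP over concatenated measurements. Only the 128-dimensional output and the "concatenate, then MLP" structure come from this thread. The two-layer depth, the hidden width, the 6-way one-hot command, and the weight initialization are all assumptions made for the example, and `MeasurementEncoder` is a hypothetical name.

```python
import random


def linear(x, weights, bias):
    """Compute W x + b for a single input vector given as a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_linear(n_in, n_out, rng):
    """Random weights and zero bias (initialization scheme is an assumption)."""
    scale = 1.0 / n_in ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MeasurementEncoder:
    OUT_DIM = 128  # output size stated in the thread

    def __init__(self, in_dim, hidden_dim=128, seed=0):
        # Depth and hidden_dim are assumptions; the thread only says "one MLP".
        rng = random.Random(seed)
        self.layer1 = init_linear(in_dim, hidden_dim, rng)
        self.layer2 = init_linear(hidden_dim, self.OUT_DIM, rng)

    def __call__(self, speed, command_one_hot, target_xy):
        # Concatenate all measurements into one flat input vector.
        x = [speed] + list(command_one_hot) + list(target_xy)
        hidden = relu(linear(x, *self.layer1))
        return linear(hidden, *self.layer2)


# Example: speed (1) + assumed 6-way command one-hot + target point (2) = 9 inputs.
encoder = MeasurementEncoder(in_dim=1 + 6 + 2)
feature = encoder(5.0, [0, 0, 1, 0, 0, 0], (3.0, 20.0))
```

In a real model this would be a trainable module in a deep learning framework; the pure-Python version only makes the input and output sizes concrete.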
Thanks for replying. Sorry to bother you again about the details. The code may be open-sourced in the future, but until then these gaps make the paper confusing to read.
Here are some other questions about the temporal module. In Figure 3, the inputs are $\mathbf{h}_{t-1}^{traj}$ and $\mathbf{h}_{t-1}^{ctl}$.
What are the details of the blue and orange blocks of these two networks? Are they also MLPs? Also, one $\mathbf{wp}$ is {x, y} and $\mathbf{a}_0$ is {steer, control, throttle}, is that right?
According to the paper, does that mean $K$ is also the number of GRU modules in the trajectory branch?
About the feature loss on $\mathbf{j}$: Roach [55] has the ground-truth BEV as input, which is why a feature loss makes sense there. In your paper, however, TCP receives only the image, so how can the expert's features be compared with those of the trajectory branch if both take image input?
- For this question, does that mean the expert in your paper also takes the ground-truth BEV as input, rather than the raw image?
Yes, we use the same expert as in Roach.
Thanks for the explanations.
routes for training
The paper says that data was collected in 8 towns, but the CARLA leaderboard only provides route XML files for six towns. The paper also says the routes are "generated randomly with length ranging from 50 meters to 300 meters."
Could you please share the route files used for data collection, or the script that generates the random routes? I think the training data is sometimes more important than the model itself, so I'm curious about the exact details of the data routes. I also ask because the paper has no table showing other methods trained on your dataset.
network
The measurement encoder is not described in detail in the paper. Is it just one MLP, like concat[v, turn_left, (x,y)] -> the desired size? Adding the input and output sizes would help readers understand the network better. Also, what are the exact output sizes of $\mathbf{F}$ and $\mathbf{j}_m$, which together form $\mathbf{j}^{traj}$?
Is the input to the whole network just one image, or $K=4$ images?