duckietown / gym-duckietown

Self-driving car simulator for the Duckietown universe
http://duckietown.org

Multiple curves and viz #80

Closed bhairavmehta95 closed 6 years ago

bhairavmehta95 commented 6 years ago

My other PR was from a stale branch, so I reopened it here.

This adds rewards based on the "closest" curve, as well as a visualization tool to see which curve the agent is actually picking (you can enable it with the --draw-curve command-line parameter).

[Image: screenshot from 2018-08-19 20-52-59]

Took me a while to get right, since the points defining the curves themselves have to be specified in the "road-legal" way.

All the curves switch relatively well, with the exception of the extremely tight right-hand turns at 3way and 4way intersections. Printing out the "scores" (the dot product between the agent's heading vector and a vector from the first to the last point of the curve) shows that it's registering correctly, but the "straight" curve at these intersections almost always has a bigger dot product. Regardless, I'm pretty positive it won't be a problem.
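For illustration, here is a minimal sketch of that scoring. It assumes the curves are stored as a NumPy array of shape (n_curves, n_points, 3); the function name pick_curve and the example tile are hypothetical and not taken from the PR.

```python
import numpy as np


def pick_curve(curves, agent_heading):
    """Return (best_index, scores) for the curve whose chord best matches the heading.

    curves: array-like of shape (n_curves, n_points, dim) (assumed layout)
    agent_heading: array-like of shape (dim,)
    """
    curves = np.asarray(curves, dtype=float)
    heading = np.asarray(agent_heading, dtype=float)
    heading = heading / np.linalg.norm(heading)

    # Chord of each curve: vector from its first to its last point.
    chords = curves[:, -1, :] - curves[:, 0, :]
    # Normalize every chord on its own (axis=1) so each score is a cosine.
    chords = chords / np.linalg.norm(chords, axis=1, keepdims=True)

    scores = chords @ heading
    return int(np.argmax(scores)), scores


# Hypothetical intersection tile in the x-z plane (y is up).
curves = [
    [[0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 0, 3]],       # straight
    [[0, 0, 0], [0, 0, 0.5], [0.5, 0, 1], [1, 0, 1]],   # tight right turn
]


def heading_at(degrees):
    """Heading rotated `degrees` from straight ahead toward the right turn."""
    a = np.radians(degrees)
    return [np.sin(a), 0.0, np.cos(a)]


best_early, _ = pick_curve(curves, heading_at(20))  # early in the turn
best_late, _ = pick_curve(curves, heading_at(30))   # further into the turn
```

In this toy tile the turn's chord points 45 degrees away from straight, so the straight curve keeps the higher score until the agent has rotated past 22.5 degrees. That is one plausible reading of why the straight curve tends to win on tight turns.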

From here, we can ideally get NPC Duckiebots on the road (without logic integrated) relatively soon -- we can use @maximecb's PID controller to follow the curve, and pick valid curves randomly at intersections / tile changes.

maximecb commented 6 years ago

Good work on precalculating the rotated and translated curve points.

I'm not quite sure about this though:

        curve_headings = curves[:, -1, :] - curves[:, 0, :]
        curve_headings = curve_headings / np.linalg.norm(curve_headings).reshape(1, -1)

Is this just subtracting control points as an approximation of the curve's direction? I think we probably should be computing the closest curve point for each curve, and computing the tangent at that point. More complex, but more accurate. What do you think?
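A rough sketch of the closest-point-and-tangent idea, assuming each curve is a cubic Bezier defined by four control points. The helper names and the dense-sampling search are illustrative assumptions, not the simulator's actual implementation.

```python
import numpy as np


def bezier_point(cps, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    p0, p1, p2, p3 = cps
    s = 1.0 - t
    return s**3 * p0 + 3 * s**2 * t * p1 + 3 * s * t**2 * p2 + t**3 * p3


def bezier_tangent(cps, t):
    """Unit tangent (normalized first derivative) at parameter t.

    Assumes the derivative is non-zero, i.e. no coincident control points at t.
    """
    p0, p1, p2, p3 = cps
    s = 1.0 - t
    d = 3 * s**2 * (p1 - p0) + 6 * s * t * (p2 - p1) + 3 * t**2 * (p3 - p2)
    return d / np.linalg.norm(d)


def closest_t(cps, pos, n_samples=101):
    """Approximate the parameter of the curve point closest to pos by sampling."""
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = np.array([bezier_point(cps, t) for t in ts])
    dists = np.linalg.norm(pts - pos, axis=1)
    return ts[np.argmin(dists)]


# Hypothetical tight right turn in the x-z plane.
cps = np.array([[0, 0, 0], [0, 0, 0.5], [0.5, 0, 1], [1, 0, 1]], dtype=float)

t_entry = closest_t(cps, np.array([0.0, 0.0, 0.0]))
tangent_entry = bezier_tangent(cps, t_entry)  # points straight ahead (+z)

t_exit = closest_t(cps, np.array([1.0, 0.0, 1.0]))
tangent_exit = bezier_tangent(cps, t_exit)    # points along the exit (+x)
```

Unlike the first-to-last chord, the tangent follows the agent through the turn: it starts aligned with the entry direction and ends aligned with the exit. Separately, np.linalg.norm called without an axis argument returns one norm for the whole array, so normalizing a batch of headings individually would need axis=1 with keepdims=True.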

maximecb commented 6 years ago

Regarding NPC bots, let's keep this for another PR. I agree that we could use a PID controller for them. Avoiding obstacles will indeed be tricky. We could potentially sample points ahead of the agents and check if the points are inside an obstacle, and shift the trajectory to follow appropriately. What would be ideal is to reuse the same logic to generate expert demonstrations as we use for driving NPCs.
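The look-ahead check described above could be sketched roughly as follows. This assumes circular obstacles in the ground plane and samples along a straight line in front of the agent; the function name, the obstacle representation, and the parameters are all hypothetical.

```python
import numpy as np


def first_blocked_sample(pos, heading, obstacles, lookahead=1.0, n_samples=10):
    """Return the index of the first look-ahead sample inside an obstacle, or None.

    pos, heading: 2D position and direction in the ground plane.
    obstacles: iterable of (center, radius) pairs, treated as circles (assumption).
    """
    pos = np.asarray(pos, dtype=float)
    heading = np.asarray(heading, dtype=float)
    heading = heading / np.linalg.norm(heading)

    # Evenly spaced distances ahead of the agent, excluding its own position.
    dists = np.linspace(lookahead / n_samples, lookahead, n_samples)
    samples = pos + dists[:, None] * heading

    for i, p in enumerate(samples):
        for center, radius in obstacles:
            if np.linalg.norm(p - np.asarray(center, dtype=float)) <= radius:
                return i
    return None


# Hypothetical scene: agent at the origin driving along +y.
blocked = first_blocked_sample([0, 0], [0, 1], [([0.0, 0.5], 0.15)])
clear = first_blocked_sample([0, 0], [0, 1], [([1.0, 0.5], 0.15)])
```

A caller could use the returned index to decide how early to shift the trajectory. Sampling along the curve being followed, rather than along a straight line, would be a natural refinement.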

bhairavmehta95 commented 6 years ago

Yes, it is: it's a vector that points from the first to the last control point. It seems to work well enough in manual testing, but I think your method may be more airtight (although, I believe that the way we should start tr...), but yeah, we can do what you're saying. I have thought about how we might approach the obstacle-avoidance part, but I think explaining it in person may be a better way to start.

maximecb commented 6 years ago

Not sure if we're having a simulator meeting tomorrow, but we could tentatively discuss this at 2PM either way.