simensov / ml4ca

Code base for "Dynamic Positioning using Deep Reinforcement Learning". Paper: https://www.sciencedirect.com/science/article/pii/S0029801821008398 - Thesis: https://ntnuopen.ntnu.no/ntnu-xmlui/handle/11250/2731248

reset #20

Open waynezw0618 opened 2 years ago

waynezw0618 commented 2 years ago

Hello again @simensov, I see that in reset you randomly select the state values, for both pose and velocity, within the bounds: https://github.com/simensov/ml4ca/blob/e2e75f3785455a29faa92881870605590e07425f/src/rl/windows_workspace/specific/customEnv.py#L141-L150

Random initial velocities are easy to understand, but I don't understand the meaning of the random pose: is it in the error frame or in the global frame? I mean, how do I set the initial point for each episode during training? Do I need to take the random pose into consideration? I suppose that for each episode you randomly pick a point some distance and angle away from the set point. Do you then need to add the random pose to that point?

Best Regards Wei

simensov commented 2 years ago

Hi,

The selection of randomized starting pose (position and heading) was done in the NED-frame (or the "global frame"). But I simply used the same point in the NED-frame as set point for DP in each episode, meaning that if I chose random pose in the NED-frame, it also meant choosing random pose in the error frame.
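A minimal sketch of why the two choices coincide, assuming the error frame is the NED offset from the set point rotated by the set-point heading (this definition, and the names `wrap_angle` and `ned_to_error_frame`, are illustrative assumptions and are not taken from the repository):

```python
import numpy as np


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + np.pi) % (2.0 * np.pi) - np.pi


def ned_to_error_frame(pose_ned, setpoint_ned):
    """Return [x_err, y_err, heading_err] for poses given as [north, east, heading].

    Assumption: the error frame is aligned with the set-point heading.
    """
    d_north = pose_ned[0] - setpoint_ned[0]
    d_east = pose_ned[1] - setpoint_ned[1]
    psi_ref = setpoint_ned[2]
    c, s = np.cos(psi_ref), np.sin(psi_ref)
    # Rotate the NED offset into the frame aligned with the reference heading.
    x_err = c * d_north + s * d_east
    y_err = -s * d_north + c * d_east
    heading_err = wrap_angle(pose_ned[2] - psi_ref)
    return np.array([x_err, y_err, heading_err])


# Hypothetical numbers: fixed set point with zero heading, random start pose.
setpoint = np.array([10.0, 20.0, 0.0])
start_pose = np.array([13.0, 18.0, 0.1])
error = ned_to_error_frame(start_pose, setpoint)
```

With a set-point heading of zero the rotation is the identity, so the error-frame pose is simply the NED deviation from the set point. A random NED start pose therefore gives a random error-frame start pose.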

waynezw0618 commented 2 years ago

Hi, I may not have understood this well yet. If you select a random starting pose, what is the reference for the starting point? If it is the same point, then the state in the error frame is all zeros, isn't it?

simensov commented 2 years ago

The setpoint in error-frame is always [0, 0, 0], i.e. the goal is to eliminate all errors. In the NED-frame, which is native to the simulator that I used from DNV, the setpoint was a fixed point. I believe the coordinates were along the lines of 63°26' N, 10°25' E, and the heading was towards true north (0 deg) - that is in the Dorabassenget bay area in Trondheim, Norway. So the real, random starting pose was selected as a deviation from this NED-frame pose, limited by the size of the elements in self.real_ss_bounds.
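As a rough sketch of that reset logic, assuming uniform sampling and a local NED origin at the set point: the bound values, `SETPOINT_NED`, `REAL_SS_BOUNDS` and `sample_initial_pose` below are hypothetical placeholders, not the values or names used in customEnv.py.

```python
import numpy as np

# Hypothetical: set point as a local NED origin, [north (m), east (m), heading (rad)].
SETPOINT_NED = np.zeros(3)
# Hypothetical stand-in for self.real_ss_bounds: max deviation per pose element.
REAL_SS_BOUNDS = np.array([5.0, 5.0, np.pi / 12.0])


def sample_initial_pose(rng, setpoint=SETPOINT_NED, bounds=REAL_SS_BOUNDS):
    """Sample a start pose as a bounded random deviation from the fixed set point."""
    # Assumption: each element is drawn uniformly from [-bound, +bound).
    deviation = rng.uniform(-bounds, bounds)
    return setpoint + deviation, deviation


rng = np.random.default_rng(0)
pose_ned, deviation = sample_initial_pose(rng)
# With a north-facing set point, the initial error-frame pose equals the deviation.
initial_error = pose_ned - SETPOINT_NED
```

The set point itself never moves between episodes. Only the vessel's starting pose changes, so the initial error is nonzero and the agent has to drive it to [0, 0, 0].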
