@erkil1452 Hello, I have recently been reading your code and have some questions about the parts underlined in red:
I don't understand why the predicted values need nn.Tanh, and why one is multiplied by math.pi while the other is multiplied by math.pi/2.
I also don't know what [:,3,:] means.
Tanh(x) maps the network output to the range (-1, 1), and multiplying by pi maps it to (-pi, pi). That is for yaw, which covers a full 360 deg range. For pitch, pi/2 is enough (+/- 90 deg).
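A minimal sketch of that scaling, using plain `math.tanh` instead of `nn.Tanh` so it runs without PyTorch. The function name `to_angles` and the raw input values are hypothetical and only illustrate the ranges; this is not the repository's actual code.

```python
import math

def to_angles(raw_yaw, raw_pitch):
    """Squash unbounded raw outputs into angle ranges (radians)."""
    # tanh bounds the value to [-1, 1]; scaling by pi covers the full 360 deg yaw range.
    yaw = math.pi * math.tanh(raw_yaw)
    # Pitch only needs +/- 90 deg, so the scale factor is pi/2.
    pitch = (math.pi / 2) * math.tanh(raw_pitch)
    return yaw, pitch

# A raw output of 0 maps to an angle of 0.
zero_yaw, zero_pitch = to_angles(0.0, 0.0)

# Very large raw outputs saturate at the edge of each range instead of exceeding it.
big_yaw, big_pitch = to_angles(100.0, -100.0)
print(abs(big_yaw) <= math.pi, abs(big_pitch) <= math.pi / 2)
```

However large the raw network output gets, the predicted angle stays inside a physically meaningful range.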
x[:,3,:] selects the 4th element (index = 3) along the 2nd axis; the 1st and 3rd axes are kept whole. In this example the network produces output for all 7 frames and we select the middle one. (Note that our network runs over 7-frame video sequences.)
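The indexing can be checked on a small array. This sketch uses NumPy so it runs without PyTorch; the same `x[:, 3, :]` syntax works on a PyTorch tensor. The batch size and output size below are made-up values for illustration, and only the 7 frames come from the explanation above.

```python
import numpy as np

# Hypothetical sizes: 2 sequences, 7 frames each, 3 output values per frame.
batch, frames, dims = 2, 7, 3
x = np.arange(batch * frames * dims).reshape(batch, frames, dims)

# Keep every sequence (1st axis) and every output value (3rd axis),
# but take only frame index 3, the middle of frames 0..6.
middle = x[:, 3, :]

print(x.shape)       # (2, 7, 3)
print(middle.shape)  # (2, 3)
```

Indexing with a single integer drops that axis, so the result has one row per sequence rather than one row per frame.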