hysts / pytorch_mpiigaze

An unofficial PyTorch implementation of MPIIGaze and MPIIFaceGaze
MIT License

Transformation from angles to vectors #48

Closed ShiweiJin closed 3 years ago

ShiweiJin commented 3 years ago

Hi,

Thank you so much for your work. It has been very helpful for learning about gaze estimation. I have a question about the transformation from pitch and yaw to a 3D gaze vector. Why do we need to assign a minus sign to each element of the gaze vector? https://github.com/hysts/pytorch_mpiigaze/blob/3077444ef597ef2224816d4b6fd8ca54c0ea10c8/gaze_estimation/utils.py#L68 I initially thought the angles and the vector were in different coordinate systems, but they are both in the camera coordinate system, so I am confused about this step. Could you please help me with this question? Thank you so much for your help.

Sincerely, Shiwei

hysts commented 3 years ago

Hi, @ShiweiJin

Hmm, I'm also confused. I don't quite remember why I implemented it that way, but I guess I followed the description in the Q&A section of the project page of the paper.

  1. How do you convert 3d directional vector to 2d angle?

We refer to the paper [3] for the data normalization.

Briefly to say, the 3D gaze direction (x, y, z) can be converted to 2D representation (theta, phi) like:

theta = asin(-y)
phi = atan2(-x, -z)
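
A minimal sketch of the quoted convention and its inverse, in plain Python. The function names `vector_to_angles` and `angles_to_vector` are hypothetical and not taken from the repository, and treating theta as pitch and phi as yaw is an assumption. It shows where the minus signs on all three vector components come from: they simply invert the two quoted formulas.

```python
import math


def vector_to_angles(x, y, z):
    """Map a unit gaze vector to (theta, phi) using the quoted formulas."""
    theta = math.asin(-y)       # assumed to be the pitch
    phi = math.atan2(-x, -z)    # assumed to be the yaw
    return theta, phi


def angles_to_vector(theta, phi):
    """Inverse mapping: each component carries a minus sign."""
    x = -math.cos(theta) * math.sin(phi)
    y = -math.sin(theta)
    z = -math.cos(theta) * math.cos(phi)
    return x, y, z
```

With these definitions, `angles_to_vector(0.0, 0.0)` points along the negative z axis, and converting any (theta, phi) with |theta| < pi/2 to a vector and back returns the same angles.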

But after all, the function is only used here, and in this case, the sign of the vector doesn't matter, so I think we don't need to worry too much about it.
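
To illustrate why the sign would not matter, assume the function only feeds a computation of the angle between a predicted and a ground-truth vector (the linked call site is not reproduced here, so this is an assumption). The helper names below are hypothetical. Negating both vectors leaves the angle unchanged, whereas negating only one of them does not.

```python
import math


def angle_between(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))


def negate(v):
    """Flip the sign of every component."""
    return tuple(-a for a in v)


# Made-up example vectors, for illustration only.
pred = (0.10, -0.20, -0.97)
truth = (0.05, -0.25, -0.96)

same_sign = angle_between(pred, truth)
both_flipped = angle_between(negate(pred), negate(truth))   # equals same_sign
one_flipped = angle_between(negate(pred), truth)            # 180 - same_sign
```

The invariance holds because the dot product and both norms are unchanged when the two vectors are negated together.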

ShiweiJin commented 3 years ago

Hi, @hysts

Thank you very much for sharing the link and also the Q&A part. It is very helpful.

The website mentions:

The negative representation has been used so that camera-looking direction becomes (0,0).

I think the negative representation comes from the relative positions of the eye, the target, and the camera. If we use the eye coordinate system, the eye's location should be (0, 0), and the gaze direction = gazing target - eye. However, if we transform it to the camera coordinate system, at least one axis must point in the opposite direction, since the eye-looking direction and the camera-looking direction are opposite. I think that is the reason why the negative sign was assigned. Do you think this makes sense?
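
The argument can be checked numerically with a small sketch. The setup is an assumption for illustration: the camera sits at the origin and looks along +z, and the eye is placed 600 mm in front of it, looking straight at the camera. The function names are hypothetical.

```python
import math


def gaze_direction(eye, target):
    """Unit vector from the eye to the gaze target (target - eye, normalized)."""
    d = [t - e for e, t in zip(eye, target)]
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)


def to_angles(v, negate=True):
    """(theta, phi) with or without the minus signs of the quoted formulas."""
    x, y, z = v
    s = -1.0 if negate else 1.0
    return math.asin(s * y), math.atan2(s * x, s * z)


# Assumed setup: camera at the origin looking along +z, eye in front of it.
eye = (0.0, 0.0, 600.0)
camera = (0.0, 0.0, 0.0)

g = gaze_direction(eye, camera)            # points along -z in camera coordinates
with_minus = to_angles(g, negate=True)     # both angles are zero
without_minus = to_angles(g, negate=False) # yaw comes out as pi instead
```

Under this setup, a gaze aimed at the camera points along -z, so only the negated form maps it to (0, 0), which matches the statement quoted from the website.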

hysts commented 3 years ago

@ShiweiJin

Ah, I see. Yes, that totally makes sense. Thank you. :)

ShiweiJin commented 3 years ago

No problem. Thank you for sharing the code. It is very helpful. And also thank you for giving me the tips and link.