AIRLegend / aitrack

6DoF Head tracking software
MIT License
1.03k stars, 102 forks

need wider range of model type to reduce jitter #145

Closed searching46dof closed 1 year ago

searching46dof commented 2 years ago

Describe your idea

I'm using head tracking for DCS. In opentrack, looking at the edges of the TV (+/-30deg of yaw) is mapped to +/-120deg (looking over the shoulder), and +/-30deg of pitch is mapped to +/-90deg (looking up and down). This magnifies any jitter 4x and 3x respectively. At a distance of 1 meter I see +/-2deg of jitter in the raw values, but at 4 meters from the TV the raw jitter is approximately +/-10deg, which is then magnified. When I just look down at the front instrument panel, my head is in a natural position even without a headrest, so the jitter is minimal. When I look down at the side instrument panels, the jitter is extreme, which makes it really difficult to click the context-sensitive switches on the panels. I'm already using opentrack's accela filter set to the maximum.

Describe the solution you'd like

From testing different configurations, I can see that the jitter in the raw values is causing the issue. A wider range of model types, to increase MAFilter's steps (number of samples), would help reduce the jitter. Under the issue "Very high CPU usage", I've proposed some optimizations that could reduce CPU usage and also allow larger MAFilter steps.

Describe alternatives you've considered

The cause of the issue is the large jitter, which is then magnified. I can sit closer to the TV/camera to reduce jitter, but it becomes uncomfortable being that close to the TV. I can move the camera closer to my seat, but it needs to be much higher than the coffee table, and then there is the issue of wires, so I can't just leave it there when not flying. The same is true for the joystick/throttle/pedals, but the camera also requires a tripod or stand. I'm currently using a wireless keyboard but trying to find a way to use the cockpit panels for a more realistic simulation.

AIRLegend commented 2 years ago

Yeah, those are nice suggestions :)

Regarding the jitter part. Have you tried the filters in opentrack? They can work like a charm

searching46dof commented 2 years ago

Previously I could only get accela to work. I tried other opentrack filters, but they somehow prevented aitrack's face tracking. I was using gain/exposure at approximately 60% with house lighting in the evening.

I reduced the gain/exposure to approximately 40% for the brighter natural afternoon lighting, and I can now use the other filters.

The hamilton filter seems to be specifically tailored for flight simulators, but its maximum limits are suited more to a 1 meter camera distance. I tried it with the maximum limits and it does filter the jitter enough. The screen is stable, so I can click on the side cockpit panels with a cheap Android box airmouse remote (instead of a mouse or trackpad). This improves the in-game experience tremendously, especially for takeoffs and landings.

The deadzone in the hamilton filter is too small, so I needed to increase the deadzone to 5deg via the mapping for yaw and pitch for takeoffs and landings. I need to scan the instrument panel more often then, and looking down sometimes instinctively triggers a slight head movement.

searching46dof commented 2 years ago

I just noticed that the logit function wasn't handling boundary conditions correctly when trying to prevent divide by zero and log(0). The input value should be clamped to a plateau, e.g. 0.99999999999 should be treated as 0.99999.

This may account for some of the jitter

#define FIX_logit_boundary_conditions 1

float logit(float p) {
#ifdef FIX_logit_boundary_conditions
    // Clamp to a plateau so values close to the bounds are handled as well
    if (p >= (float)0.99999)
        p = (float)0.99999; // prevent divide by zero
    else if (p <= 0.0000001)
        p = (float)0.0000001; // prevent log(0)
#else
    // Original behavior: only values at or beyond the bounds are clamped
    if (p >= 1.0)
        p = (float)0.99999;
    else if (p <= 0.0)
        p = (float)0.0000001;
#endif

    p = p / (1 - p);
    return log(p) / 16;
}

searching46dof commented 2 years ago

The modifications to logit and MAFilter seem to have resolved the problem with jitter in the raw values. At a distance of 4 meters I now observe jitter of +/-1deg, which is pretty comparable to a KinectV2. It also appears to be less jumpy at the up/down/left/right extremes.

searching46dof commented 2 years ago

I also want to report a very unexpected funny bug while testing this and I don't expect any resolution ;)

It seems that aitrack cannot detect the face of someone with heavy bangs on the forehead. It works when the bangs are lifted and not covering eyebrows. The same may be true of someone with thinning, very light colored eyebrows, or someone who shaved off their eyebrows.

AIRLegend commented 2 years ago

Yeah... That could be solved by retraining the models. Seems like there were no subjects with those traits in the dataset :/

searching46dof commented 2 years ago

In the PositionSolver constructor, width and height appear to be incorrectly swapped in the initialization of head3dScale and camera_matrix. The implementation uses the first row for the y-axis (pitch) and the second row for the x-axis (yaw). The calculation of the horizontal (width) and vertical (height) fields of view also isn't correct. It can be simplified using the geometry of a right triangle (x^2 + y^2 = diagonal^2), since the ratios of the fields of view will be the same as the ratios in pixels. Note that this assumes square pixels with a 1:1 aspect ratio.
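The triangle relation can be sketched as follows. This is a minimal illustration under assumptions, not aitrack's code: the helper `split_diagonal_fov` and the struct `FieldOfView` are hypothetical names, the camera is assumed to be specified by its diagonal field of view, and pixels are assumed square. The sketch works with tangents of the half-angles, which reduces to the simple pixel-ratio proportion for small fields of view.

```cpp
#include <cmath>

// Hypothetical types and names for illustration; not part of aitrack.
struct FieldOfView {
    double horizontal_deg;
    double vertical_deg;
};

// Split a diagonal field of view into horizontal and vertical parts,
// assuming square pixels (1:1 pixel aspect ratio).
FieldOfView split_diagonal_fov(double diagonal_fov_deg, int width_px, int height_px) {
    const double kPi = 3.14159265358979323846;
    const double to_rad = kPi / 180.0;
    const double to_deg = 180.0 / kPi;

    // x^2 + y^2 = diagonal^2, measured in pixels.
    const double diagonal_px = std::hypot(static_cast<double>(width_px),
                                          static_cast<double>(height_px));

    // Focal length in pixels implied by the diagonal field of view.
    const double focal_px =
        (diagonal_px / 2.0) / std::tan(diagonal_fov_deg * to_rad / 2.0);

    return {
        2.0 * std::atan((width_px / 2.0) / focal_px) * to_deg,
        2.0 * std::atan((height_px / 2.0) / focal_px) * to_deg
    };
}
```

For a 640x480 image with a 90deg diagonal field of view, this gives roughly 77.3deg horizontally and 61.9deg vertically.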

This seems to solve the yaw and rotation jitter with the camera above the monitor and looking just above the monitor.

There is still extreme yaw and rotation jitter with the camera above the monitor and looking almost straight up. This seems to be a problem within the implementation of solvePnP, since the calculated pitch is in the -120deg range. Its value should never exceed -90deg, which is looking straight up. Yaw and roll do not seem to have this issue, and neither does looking down.

searching46dof commented 1 year ago

The function correct_rotation adjusts the solvePnP values, which are computed for a face at the center of the viewport. However, it only needs the atan (arctangent) of the ratios, which are already tangents, to obtain the angle in radians. Also, since the correction yaw and correction pitch are added to the result of solvePnP, the yaw and pitch need to be limited to their maximum values of +/-90deg (orthogonal to the camera). Larger values would mean facing away from the camera.

These changes completely eliminate the extreme yaw and pitch jitter at the yaw and pitch boundaries of +/-90deg.

void PositionSolver::correct_rotation(FaceData& face_data)
{
    float distance = (float) -(face_data.translation[2]);
    float lateral_offset = (float)face_data.translation[1];
    float verical_offset = (float)face_data.translation[0];

#ifdef OPTIMIZE_PositionSolver
    // (offset / distance) is already a tangent, so only atan is needed to obtain radians
    float correction_yaw = (float)(std::atan(lateral_offset / distance) * TO_DEG);
    float correction_pitch = (float)(std::atan(verical_offset / distance) * TO_DEG);
#else
    float correction_yaw = (float)(std::atan(std::tan(lateral_offset / distance)) * TO_DEG);
    float correction_pitch = (float)(std::atan(std::tan(verical_offset / distance)) * TO_DEG);
#endif

    face_data.rotation[1] += correction_yaw;
    face_data.rotation[0] += correction_pitch;

#ifdef OPTIMIZE_PositionSolver
    // Limit yaw between -90.0 and +90.0 degrees after correction
    if (face_data.rotation[1] >= 90.0)
        face_data.rotation[1] = 90.0;
    else if (face_data.rotation[1] <= -90.0)
        face_data.rotation[1] = -90.0;
    // Limit pitch between -90.0 and +90.0 degrees after correction
    if (face_data.rotation[0] >= 90.0)
        face_data.rotation[0] = 90.0;
    else if (face_data.rotation[0] <= -90.0)
        face_data.rotation[0] = -90.0;
#endif
}
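The two ideas in the change above, taking the arctangent of the offset ratio and clamping the corrected angle, can be isolated in a small standalone sketch. The helper names `offset_correction_deg` and `clamp_angle_deg` are hypothetical and are not aitrack functions; the sketch assumes `distance` is positive.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical standalone helpers for illustration; not aitrack's API.

// Angle (in degrees) subtended by an offset at a given distance.
// offset / distance is a tangent, so atan alone recovers the angle.
// Assumes distance > 0.
float offset_correction_deg(float offset, float distance) {
    const float kToDeg = 180.0f / 3.14159265f;
    return std::atan(offset / distance) * kToDeg;
}

// Limit an angle to [-90, +90] degrees; beyond that range the face
// would be turned away from the camera.
float clamp_angle_deg(float angle_deg) {
    return std::max(-90.0f, std::min(90.0f, angle_deg));
}
```

For an offset ratio of 0.5, atan gives about 26.6deg, whereas atan(tan(0.5)) simply returns 0.5 rad, about 28.6deg, so the original expression treats the ratio as if it were already an angle and overestimates the correction.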

I have a question: why are the translation[i] values scaled (multiplied by 10, and then divided by 100 for the z-axis)? This would exaggerate the lateral (x-axis) and vertical (y-axis) motion but minimize the z-axis motion. The scaling is done before correct_rotation, so correct_rotation is performed on scaled values and provides an incorrect correction.

This may be the reason the yaw and pitch corrections seem a little exaggerated. It may also be the reason the yaw and pitch rotations are exceeding +/-90deg.

for (int i = 0; i < 3; i++)
{
    face_data->rotation[i] = rvec.at<double>(i, 0);
    face_data->translation[i] = tvec.at<double>(i, 0) * 10;
}

// We dont want the Z axis oversaturated.
face_data->translation[2] /= 100;
searching46dof commented 1 year ago

Never mind about the last question. The scaling translation[i] * 10 converts to opentrack's units of cm. The subsequent face_data->translation[2] /= 100 reduces the scale of the z-axis, since there is a limit to opentrack's axis ranges.

searching46dof commented 1 year ago

This is fixed by correcting various initialization issues in the tracker, handling boundary conditions, and fixes in solve_rotation.