Create a system to bind (map) input gestures to output actions.
Inputs and outputs can be discrete or continuous.
Inputs (Gestures):
Head tracking X/Y (continuous) - based on optical flow point tracking of points on face
Head horizontal/vertical tilt (continuous) - angle of head detected directly from FaceMesh; can be used to control a joystick
Tilt head left/right/up/down (discrete) - thresholds of the above; useful for controlling a D-pad, WASD/arrow keys, or a scroll wheel
Eye gaze X/Y (continuous)
Squint left/right/both eyes (continuous)
Blink left/right/both eyes (discrete) - threshold of the above
Vocalizations such as pop, click, sss - mouth sounds, requires mic input
Raise left/right/both eyebrows (discrete or continuous)
Smile / Raise left/right/both corners of mouth (discrete or continuous)
Tongue left/right/up/down (discrete) or X/Y (continuous)
Open mouth (discrete or continuous)
Project Gameface has one called "Roll lower mouth", which I wasn't sure how to interpret, but it seems to detect folding one lip over the other (either lip) and perhaps flattening the mouth
Project Gameface also has "Mouth left/right", which I thought looked like shifting your mouth left/right, but it actually responds more to raising your cheeks (half-smiling), which I already noted. Shifting your mouth left/right could be a separate gesture, although it seems harder to do.
Possibly specific mouth shapes, like puckering (kissy face), square (exaggerated "th" sound shape? with lips pushed out)
Dwell at location on screen - for dwell clicking
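Since most of the discrete gestures above are just thresholds over a continuous signal, here's a minimal sketch (class and parameter names are hypothetical) of deriving a discrete gesture from a continuous value, with hysteresis so the event doesn't flicker when the signal hovers near the threshold:

```python
# Hypothetical sketch: derive a discrete gesture (e.g. "blink") from a
# continuous signal (e.g. eye closedness) using a threshold with hysteresis.

class ThresholdDetector:
    def __init__(self, on_threshold: float, off_threshold: float):
        # on_threshold must exceed off_threshold; the gap is the hysteresis band
        assert on_threshold > off_threshold
        self.on_threshold = on_threshold
        self.off_threshold = off_threshold
        self.active = False

    def update(self, value: float) -> bool:
        """Feed a new sample; returns True while the gesture is held."""
        if not self.active and value >= self.on_threshold:
            self.active = True
        elif self.active and value <= self.off_threshold:
            self.active = False
        return self.active
```

For example, with `ThresholdDetector(0.7, 0.5)`, a sample sequence 0.6, 0.8, 0.6, 0.4 yields off, on, still on (hysteresis), off.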
Outputs (Actions):
Keys, with optional modifiers (discrete)
Mouse buttons, with optional modifiers (discrete)
Double click (discrete)
Scroll up/down/left/right (discrete or continuous)
Move mouse (continuous)
Reset mouse position to center (discrete)
Gamepad buttons (discrete), triggers (discrete or continuous), and joystick axes (continuous)
Run Command (discrete) - would require a text input
Slow cursor movement (discrete or continuous) - for enhanced precision pointing, temporarily reduce the speed of the cursor. May be good to use with squinting. (Even if blinking is used for clicking, slowing down shouldn't interfere with clicking.)
Zoom in on screen (discrete, or possibly continuous) - for increased accuracy, similar to slowing cursor movement: magnify the screen around the cursor. I was picturing this like the Magnifier tool in MS Paint, but it could also potentially be a continuous zoom, although I suspect that wouldn't work terribly well. Zooming in is very useful for precision, especially when using eye gaze.
Switch profile/mode (discrete) - switch to another configuration profile; this can be used to switch between using head tilt for arrow keys in a game, to mouse movement, on the fly. This allows advanced users to create a state machine.
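One way (purely a sketch, not a committed design) to represent these heterogeneous actions is a small tagged hierarchy, so each action type carries only the configuration it needs, e.g. a key code, a command string for "run command", or a target profile name for "switch profile":

```python
# Hypothetical action types; names and fields are illustrative, not final.
from dataclasses import dataclass

@dataclass
class Action:
    """Base class for output actions."""

@dataclass
class KeyPress(Action):
    key: str
    modifiers: tuple = ()  # e.g. ("ctrl", "shift")

@dataclass
class RunCommand(Action):
    command: str  # uniquely requires a text input in the UI

@dataclass
class SwitchProfile(Action):
    target_profile: str

@dataclass
class MoveMouse(Action):
    # continuous action: consumes an (x, y) signal each frame
    sensitivity: float = 1.0
```

Because each subclass declares its own fields, the binding UI can be generated per action type (a key picker for KeyPress, a text box for RunCommand, a profile dropdown for SwitchProfile).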
I've inconsistently described some things that can be discrete or continuous as separate (squint/blink) or the same input.
Probably it makes sense to just have continuous inputs that you can adjust the threshold of, or use as a continuous output.
When using as a continuous output, it might make sense to provide a range mapping.
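The range mapping for the continuous case could be as simple as a clamped linear remap from the gesture's observed range to the output's range; a sketch:

```python
def map_range(value: float, in_min: float, in_max: float,
              out_min: float, out_max: float) -> float:
    """Linearly remap value from [in_min, in_max] to [out_min, out_max], clamped."""
    if in_max == in_min:
        return out_min  # degenerate input range; avoid division by zero
    t = (value - in_min) / (in_max - in_min)
    t = max(0.0, min(1.0, t))  # clamp so out-of-range input saturates the output
    return out_min + t * (out_max - out_min)
```

For example, mapping a squint value of 0.75 from the usable range [0.5, 1.0] to a scroll speed in [0, 100] gives 50.0; values below 0.5 produce 0.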
But basically, I want to design the binding system to be fairly open-ended, supporting different UI for different inputs/outputs, e.g. "dwell" may require multiple thresholds, and "run command" uniquely requires a text input.
How should inputs that can be discrete or continuous be mapped to an output? If mapped to a continuous output, they don't need the threshold (or may benefit from a range mapping instead), so it would be weird if the threshold was made part of the input. I guess it needs to be part of the binding itself.
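Putting the threshold (or range mapping) on the binding itself, rather than on the input, might look like this sketch (field names are hypothetical): a binding pairs a continuous input with an output, and carries either a threshold (discrete output) or a range mapping (continuous output), but not both:

```python
# Hypothetical binding structure: the threshold/range mapping lives on the
# binding, not the input, since it only applies for discrete/continuous
# outputs respectively.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Binding:
    input_gesture: str                       # e.g. "squint_left"
    output_action: str                       # e.g. "mouse_click_left" or "scroll_up"
    threshold: Optional[float] = None        # only for discrete outputs
    range_mapping: Optional[tuple] = None    # (in_min, in_max, out_min, out_max), only for continuous outputs

    def __post_init__(self):
        if self.threshold is not None and self.range_mapping is not None:
            raise ValueError(
                "a binding is either discrete (threshold) or continuous (range mapping), not both")
```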
Side note: it would be nice if profiles could inherit from each other. Otherwise, it would be very finicky maintaining threshold values duplicated across different profiles used as operational modes.
Also, it should be an error to configure a profile to switch to itself.
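Profile inheritance and the self-switch check could be sketched like this (hypothetical structure): a profile falls back to its parent chain for unset thresholds, and validation rejects a switch-profile target equal to the profile's own name:

```python
# Hypothetical profile structure with parent-chain threshold lookup and
# a validation rule forbidding a profile from switching to itself.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Profile:
    name: str
    parent: Optional["Profile"] = None
    thresholds: dict = field(default_factory=dict)      # gesture name -> threshold
    switch_targets: list = field(default_factory=list)  # profile names this profile switches to

    def threshold_for(self, gesture: str) -> Optional[float]:
        """Look up a threshold, falling back to the parent chain."""
        if gesture in self.thresholds:
            return self.thresholds[gesture]
        return self.parent.threshold_for(gesture) if self.parent else None

    def validate(self) -> None:
        if self.name in self.switch_targets:
            raise ValueError(f"profile {self.name!r} must not switch to itself")
```

With inheritance, a "game" profile overrides only what differs from "base", so a blink threshold tuned once in the base profile applies to every mode derived from it.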
First I need to implement:
Related: