microsoft / MixedRealityToolkit-Unity


HoloToolkit Input Module: design discussion #277

Closed maxouellet closed 7 years ago

maxouellet commented 7 years ago

This is the initial suggested design for the new input module. Feel free to ask questions / request changes.

I already have a version of this implemented, to which I will make improvements as needed. Goal is to add it to HoloToolkit so that new features can be added by the community as needed. It has been designed to be as extensible as possible.

Why design a new Input Module?

A lot of applications need an efficient way to handle input. Ideally, game objects can handle the input events they are interested in without duplicating the same code across multiple components. The current solution in HoloToolkit relies on Unity's messaging system, which is not very performant and doesn't make it obvious which behaviours handle input events. It also doesn't extend well to support multiple different input mechanisms.

Unity provides an event system that it currently leverages to send input to game objects. It includes a HoloLensInputModule that interprets HoloLens input and sends out the standard Unity events. The solution I propose does not use this input module.

Instead, the proposed design leverages Unity's event system but implements its own version of an input module (called InputManager in this case). This gives us more flexibility to explore various interaction mechanisms, which is very helpful when working with novel input devices such as those HoloLens provides.

Class diagram

Overall view of the main classes and interfaces that make up the initial version of this input module. Minor changes may still be made before submission (for example, HandsInput will likely be merged with GesturesInput, as their functionality is very similar).

[Class diagram image: inputmodule]

Input Module Design

The input module is designed to be extensible: it could support various input mechanisms and various types of gazers.

Each input source (hands, gestures, others) implements the IInputSource interface. The interface defines the events that input sources can trigger. Input sources register themselves with the InputManager, whose role is to forward input to the appropriate game objects. Input sources can be dynamically enabled or disabled as necessary, and new input sources can be created to support different input devices.

Game objects that want to consume input events can implement one or more of the input handler interfaces, as in the sketch below.
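As a minimal sketch, assuming handler interface and event data names that follow the HoloToolkit pattern discussed in this thread (IInputClickHandler / InputClickedEventData; adjust to your version):

```csharp
using UnityEngine;

// Minimal sketch of an input handler component. Interface and event data
// names are assumptions modeled on the HoloToolkit pattern described
// above; adjust to whatever your version actually defines.
public class ClickResponder : MonoBehaviour, IInputClickHandler
{
    public void OnInputClicked(InputClickedEventData eventData)
    {
        // Runs when the input manager routes a click to this object.
        Debug.Log(gameObject.name + " was clicked.");
    }
}
```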

The input manager listens to the various events coming from the input sources, and also takes into account the gaze. Currently, that gaze is always coming from the GazeManager class, but this could be extended to support multiple gaze sources if the need arises.

By default, input events are sent to the currently focused game object, if that object implements the appropriate interface. Modal input handlers can also be added to the input manager: these modal handlers take priority over the currently focused object. Fallback handlers can also be defined, so that the application can react to global inputs that aren't targeting a specific element. Any event sent by the input manager bubbles up from the object to its ancestors.

To recap, the input manager forwards events from the various input sources to the appropriate game object, in the following order:

  1. The registered modal input handlers, in LIFO order of registration
  2. The currently focused object
  3. The fallback input handlers, in LIFO order of registration
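As a hedged sketch of how a modal handler might be registered (the Push/Pop method names are assumptions modeled on the LIFO registration described above):

```csharp
using UnityEngine;

// Sketch only: registering a modal input handler so it takes priority
// over the focused object while this dialog is active. Method names
// are assumed to follow the push/pop pattern described above.
public class ModalDialog : MonoBehaviour
{
    private void OnEnable()
    {
        // While enabled, this object receives input before the focused
        // object (LIFO order among modal handlers).
        InputManager.Instance.PushModalInputHandler(gameObject);
    }

    private void OnDisable()
    {
        InputManager.Instance.PopModalInputHandler();
    }
}
```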

The input manager also holds a reference to the currently active Cursor, allowing the cursor to be accessed through it. The cursor currently also depends on the gaze coming from the GazeManager class.

maxouellet commented 7 years ago

@aalmada This isn't a C# 1.0 vs C# 2.0 fight :) I meant that a delegate such as:

delegate void NavigationStartedEventDelegate(IInputSource inputSource, uint sourceId, Vector3 cumulativeDelta);

better indicates what each of the parameters represent than

Action<IInputSource, uint, Vector3>

I agree that the right solution to this is to have a single parameter to your Action / EventHandler that can be a class with multiple members.
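For illustration, that single-parameter pattern might look like this (type names here are hypothetical):

```csharp
using System;
using UnityEngine;

// Hypothetical single-parameter event pattern: the three loose
// parameters from the delegate above become named, self-documenting
// members of one immutable args class.
public class NavigationStartedEventArgs : EventArgs
{
    public IInputSource InputSource { get; private set; }
    public uint SourceId { get; private set; }
    public Vector3 CumulativeDelta { get; private set; }

    public NavigationStartedEventArgs(IInputSource inputSource, uint sourceId, Vector3 cumulativeDelta)
    {
        InputSource = inputSource;
        SourceId = sourceId;
        CumulativeDelta = cumulativeDelta;
    }
}

// Usage: event EventHandler<NavigationStartedEventArgs> NavigationStarted;
```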

That being said, this isn't something I care about that much, so I can switch it to Action if that's the general consensus. One interesting thing is that Unity itself uses delegates (in GestureRecognizer, for example), although I'm not sure if that's because their code is auto-generated.

paseb commented 7 years ago

[Image: cursor_update]

I just pushed my changes for the cursor to match the extensible nature of the input system. Look at AnimatedCursor and SpriteCursor as examples of creating new cursors. The animator state information is set in the current DefaultCursor prefab. There is also some handy exposed functionality for rotating the cursor to match the surface (see LookRotationBlend in Cursor).

robertlevy commented 7 years ago

Re: event args allocation... you can reuse the same instance of the event args object each time you raise the event. Not sure if that's weird from a Unity perspective, but it's a common pattern in Microsoft's implementations of .NET input APIs.

On Nov 2, 2016, at 2:13 PM, Max Ouellet <notifications@github.com> wrote:

@aalmada In this specific case, I'm not a big fan of using Action, because it hides the meaning of the uint. I can certainly change it if people feel it's more obvious when using Action.

I agree that EventHandler and event args for the various events would be better overall, but I was trying to avoid allocating an *EventArgs object instance on every interaction event, given that this will be running in Unity on HoloLens. I'm open to suggestions on this.


maxouellet commented 7 years ago

@robertlevy From what I'm seeing in the .NET framework, that doesn't seem right. For example, MouseButtonEventArgs has all of its properties as read-only, and a new instance of it is constructed every time a mouse button is clicked. My understanding is that EventArgs classes are meant to be immutable (unless you're using EventArgs as-is with no parameters, in which case it doesn't matter).

That being said, the performance impact of instantiating short-lived event args when input occurs is most likely negligible in the grand scheme of things. Given that it improves the design by allowing us to more easily include input information in the events, I'll go ahead and make that change.

robertlevy commented 7 years ago

Wow, it's been a while :) You're right, I got things mixed up: a different event args instance each time. But the event args objects are fairly lightweight, delegating just about everything to a device object that is shared across args instances (example: https://referencesource.microsoft.com/#PresentationCore/Core/CSharp/System/Windows/Input/TouchEventArgs.cs)
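A rough sketch of that WPF-style pattern (none of these types are HoloToolkit or WPF APIs; they are illustrative only):

```csharp
using System;
using UnityEngine;

// Illustrative sketch: a new, lightweight args instance per event raise,
// with the heavy state living on a shared device object that every args
// instance references instead of copies.
public class TouchDevice
{
    public int Id { get; set; }
    public Vector3 Position { get; set; }
}

public class SharedDeviceEventArgs : EventArgs
{
    private readonly TouchDevice device; // shared, not copied per event

    public SharedDeviceEventArgs(TouchDevice device)
    {
        this.device = device;
    }

    // Properties delegate to the shared device, keeping each args
    // allocation small and short-lived.
    public int DeviceId { get { return device.Id; } }
    public Vector3 Position { get { return device.Position; } }
}
```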



aalmada commented 7 years ago

@maxouellet I'd like to comment on the current IInputSource implementation. It assumes two modes: Manipulation and Navigation. Will there always be only these two modes? Any change to this assumption will force changes either to the interface or to the InputManager. A generic mouse handler (or multi-touch handler) doesn't have Manipulation or Navigation events; those would be states implemented on top of the more generic up, down, and move events.

maxouellet commented 7 years ago

@aalmada I think (and hope) that IInputSource will eventually evolve to support more interactions as they become necessary. One example that is currently missing is some kind of ControllerInputSource, which would expose button presses + joystick information (this might end up just being an extension of SourceUp / SourceDown).

The Manipulation and Navigation events are there specifically to expose the default Windows gesture events. I added the SourceUp, SourceDown and SourceClick events to support other types of interactions that might not currently be built into the gesture APIs. The HandDraggable script is a good example of that scenario. Or, if you wanted to implement some kind of two-handed manipulation, you could leverage the up and down events accordingly. Note that because of how this is designed, there is no need for a Move event: the input source itself is passed to input handlers, so they can query the input source's position (if it has one) whenever they need it.
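A sketch of that polling pattern, assuming HoloToolkit-style source state handlers and a TryGetPosition-style query on IInputSource, as HandDraggable uses (names may differ in your version):

```csharp
using UnityEngine;

// Sketch of the "no Move event" pattern: cache the input source when it
// is detected, then poll its position on demand each frame. Handler and
// event data names are assumptions modeled on the HoloToolkit pattern.
public class SourceFollower : MonoBehaviour, ISourceStateHandler
{
    private IInputSource currentSource;
    private uint currentSourceId;

    public void OnSourceDetected(SourceStateEventData eventData)
    {
        currentSource = eventData.InputSource;
        currentSourceId = eventData.SourceId;
    }

    public void OnSourceLost(SourceStateEventData eventData)
    {
        if (eventData.SourceId == currentSourceId)
        {
            currentSource = null;
        }
    }

    private void Update()
    {
        Vector3 position;
        // Query the source's position only when needed, instead of
        // reacting to a per-frame Move event.
        if (currentSource != null && currentSource.TryGetPosition(currentSourceId, out position))
        {
            transform.position = position;
        }
    }
}
```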

aalmada commented 7 years ago

I have a 3D model designed in CAD that is structured as a hierarchy with many leaf objects, each with a mesh and a collider. I want to be able to hand drag entire branches of this hierarchy instead of individual objects. InputManager sends the messages to the focused object. I can make this work if I add a HandDraggable to each leaf object and set HostTransform to the parent branch node, but adding HandDraggable to each leaf object is very tedious. Is there an alternative?

maxouellet commented 7 years ago

The events bubble up through your game object hierarchy, so you don't have to have the HandDraggable script (or any other script that implements the input interfaces) on the game object that has the collider. In your case, assuming you have something like the following:

Root
  Branch1
    Leaf1
    Leaf2
  Branch2
    Leaf3
    Leaf4

You can put the HandDraggable script on Branch1 and Branch2, and that will work with the individual colliders that are in their children.

The one exception is if the leaves are already handling the same input event. Each input callback is passed an *EventData object that has a Use() function. If something calls that Use() function, the event is considered handled and stops bubbling up the game object hierarchy.
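For instance, a sketch of a leaf consuming an event (reusing the assumed handler names from the earlier click handler sketch):

```csharp
using UnityEngine;

// Sketch: a leaf that consumes an event so it stops bubbling up to its
// ancestors. Interface and event data names are assumed as before.
public class LeafClickBlocker : MonoBehaviour, IInputClickHandler
{
    public void OnInputClicked(InputClickedEventData eventData)
    {
        Debug.Log("Leaf handled the click itself.");
        // Ancestors implementing the same interface won't receive this event.
        eventData.Use();
    }
}
```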

Let me know if that doesn't handle your scenario!

aalmada commented 7 years ago

@maxouellet Works just like you explained. Thanks!

aalmada commented 7 years ago

@maxouellet KeywordManager doesn't have the same bubbling mechanism. It calls a specific method in a specific object. I have a workaround but I think it would be interesting to make all these managers consistent with how InputManager invokes events.

maxouellet commented 7 years ago

@aalmada I agree that would be great; that work just hasn't been done yet, so I left the original classes in there. There are a few things that could be added when the need arises, notably:

- Keyboard input routing
- Controller input routing
- Voice commands input routing (that one is particularly tricky to get right...)

I do not intend to address those soon due to lack of time, but I will be happy to review changes from anyone who tries to address them. I think keyboard and controller should be relatively straightforward to do.

aalmada commented 7 years ago

@maxouellet fair enough

aalmada commented 7 years ago

I do not intend to address those soon due to lack of time, but I will be happy to review changes from anyone who tries to address them.

@maxouellet Challenge accepted! ;) While researching how to extend the InputManager, I noticed there are two different patterns: GazeManager is explicitly registered with the InputManager, while all the others, like GesturesInput, derive from BaseInputSource. Why doesn't GazeManager follow the more generic second pattern?

StephenHodgson commented 7 years ago

GazeManager is the one that handles most of the logic about which object we're currently focused on, and many of these other classes use that information. If you look at GazeManager.RaycastUnityUI, you'll see we actually use the built-in EventSystem too.

It seems there are two ways input events have to be handled in order to get things to work correctly.

aalmada commented 7 years ago

@HodgsonSDAS Yes, I do understand that. My question is more: is GazeManager really a special case? If not, the architecture would be somewhat simpler. Just food for thought. ;)

StephenHodgson commented 7 years ago

Oh, I agree simpler is better. I'll have to take a closer look at it later. nom nom nom.

maxouellet commented 7 years ago

@aalmada Yeah, I didn't have enough time to figure out a great way to implement the GazeManager without breaking everything that depends on it being a singleton...

The long-term solution I was considering is to have one or more FocusSources that register themselves with the InputManager. GazeManager would be one implementation of a FocusSource (a very important one that we might want to keep as a singleton), but you could have others (a 6DOF Vive controller, for example). Then every input event would include not only the input source data, but also the FocusSource that triggered the event on the object.
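Purely as a sketch of that hypothetical direction (no such interface existed in HoloToolkit at the time of this thread):

```csharp
using UnityEngine;

// Hypothetical shape for the FocusSource idea described above. This is
// an assumption about a possible design, not an existing API.
public interface IFocusSource
{
    // The object this focus source currently targets, or null if none.
    GameObject FocusedObject { get; }
}
```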

paseb commented 7 years ago

Well, there goes my reply :) @maxouellet beat me to it. I was going to say what Max wrote. Because we should support multiple input focus sources, having them independent and registering with the input manager is a good choice. We internally separated a user's GazeFocus from the other FocusSources, since where the head is looking tends to be complementary to other focus sources. Examples would be gaze-contextual voice commands, gaze-based rendering and culling, etc.

aalmada commented 7 years ago

@maxouellet Is IInputSource.SupportedEvents used somewhere?

maxouellet commented 7 years ago

@aalmada Not right now; it's there as a convenience in case someone ever needs it. Eventually, as the need arises, we might also want to add an InputType enum that tells you whether the input is Hands, Controller, or whatever else the Unity APIs support.

aalmada commented 7 years ago

There are a few things that could be added when the need arises, notably:

- Keyboard input routing
- Controller input routing
- Voice commands input routing (that one is particularly tricky to get right...)

@maxouellet I was revisiting your post and thinking that the KeywordManager and the SpeechInputSource handle voice commands, but also keyboard input. The issue is that they assume there is always a voice command associated with a key. To make this fully agnostic and reusable, maybe there should be the notion of a command that can be mapped to triggers from input sources, as in the sketch below.
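A rough sketch of that command notion (all names here are hypothetical, not HoloToolkit APIs):

```csharp
using System;
using System.Collections.Generic;
using UnityEngine;

// Hypothetical sketch: a named command that any number of triggers
// (voice keyword, key press, controller button, ...) can map to, so no
// input source assumes a fixed pairing of keyword and key.
public class CommandRouter : MonoBehaviour
{
    private readonly Dictionary<string, Action> commands = new Dictionary<string, Action>();
    private readonly Dictionary<KeyCode, string> keyBindings = new Dictionary<KeyCode, string>();

    public void RegisterCommand(string name, Action handler) { commands[name] = handler; }
    public void BindKey(KeyCode key, string command) { keyBindings[key] = command; }

    // A SpeechInputSource-style component could call this with the
    // recognized keyword; keyboard triggers route through Update below.
    public void Invoke(string command)
    {
        Action handler;
        if (commands.TryGetValue(command, out handler)) { handler(); }
    }

    private void Update()
    {
        foreach (var binding in keyBindings)
        {
            if (Input.GetKeyDown(binding.Key)) { Invoke(binding.Value); }
        }
    }
}
```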

yacuzo commented 7 years ago

The diagrams in this discussion are great for learning the new system. Could they be put somewhere intuitive, and could the Input/readme.md link to them?

Edit: Good work on this new system! Also, upgrading from the old system was a lot less work than I thought it would be. I ended up with less code than before, yay.

StephenHodgson commented 7 years ago

@yacuzo https://github.com/Microsoft/HoloToolkit-Unity/pull/376

StephenHodgson commented 7 years ago

@maxouellet Does OnInputClicked happen before or after OnInputUp?

And is there a way to change the click speed? I noticed that if I click and hold for a certain amount of time, OnInputClicked doesn't fire.

maxouellet commented 7 years ago

@HodgsonSDAS OnInputClicked is not guaranteed to happen before or after OnInputUp. This is by design, because OnInputClicked is an OS-level gesture. The clicking speed is defined by the OS, so you can't modify it. It is also by design that OnInputClicked doesn't fire if you click and hold for a certain amount of time: when you hold long enough, the OS triggers a Hold or a Manipulation/Navigation event instead.

If you need to customize a click somehow (which I would not recommend, as this would go against the OS definition of a click and might break some user-specific accessibility settings down the line), you could write your own input source that has your own implementation of a click.
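Purely as a sketch of that custom-click idea (not recommended, per the above; handler and event data names are assumed to follow the HoloToolkit pattern used earlier in this thread):

```csharp
using UnityEngine;

// Sketch only: a custom click built from down/up timing, as an
// alternative to the OS-defined click. Interface and event data names
// are assumptions; adjust to your version.
public class CustomClickDetector : MonoBehaviour, IInputHandler
{
    [SerializeField] private float maxClickDuration = 0.4f; // custom threshold

    private float downTime;

    public void OnInputDown(InputEventData eventData)
    {
        downTime = Time.time;
    }

    public void OnInputUp(InputEventData eventData)
    {
        // Treat a quick down/up pair as a click under our own rules.
        if (Time.time - downTime <= maxClickDuration)
        {
            Debug.Log("Custom click detected.");
        }
    }
}
```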

StephenHodgson commented 7 years ago

No worries, I just wanted to make sure I understand the timing. Thanks for the info.

maxouellet commented 7 years ago

Closing this issue, as this has been integrated into HoloToolkit for a while now.

basuarya commented 7 years ago

@maxouellet Hello, I would like to know how to rotate a game object using ManipulationHandler. Can I use it to rotate an object with a pinch drag?

vbielkin commented 7 years ago

@maxouellet, InputManager supports some of the Unity UI events (like ExecuteEvents.pointerClickHandler, ExecuteEvents.pointerEnterHandler, etc.), but there is nothing for ExecuteEvents.beginDragHandler and ExecuteEvents.dragHandler. As a result, using Unity ScrollBars and ScrollRect with the HoloToolkit InputManager becomes impossible. Are these Unity event handlers going to be supported in the future?

maxouellet commented 7 years ago

@vbielkin Support could be added by anyone who needs that functionality. The clean way to implement it would be to have InputManager interpret the gesture navigation events and send the appropriate drag events (like the ExecuteEvents.beginDragHandler you pointed out).

Alternatively, you could simply attach a script to your Unity scroll bars that holds a reference to the Unity UI Scrollbar and implements the INavigationHandler interface. From that script, you could then adjust the scrollbar's Value to scroll it appropriately based on the navigation input you receive, as in the sketch below.
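A sketch of that alternative, assuming an INavigationHandler interface whose event data carries a normalized cumulative offset (property names may differ across HoloToolkit versions):

```csharp
using UnityEngine;
using UnityEngine.UI;

// Sketch: drive a Unity UI Scrollbar from navigation input. The
// NormalizedOffset property is an assumption about the event data;
// adjust to whatever your version exposes.
public class ScrollbarNavigationHandler : MonoBehaviour, INavigationHandler
{
    [SerializeField] private Scrollbar scrollbar;
    [SerializeField] private float scrollSpeed = 0.5f;

    private float startValue;

    public void OnNavigationStarted(NavigationEventData eventData)
    {
        startValue = scrollbar.value;
    }

    public void OnNavigationUpdated(NavigationEventData eventData)
    {
        // NormalizedOffset.x is assumed to range over [-1, 1] for the gesture.
        scrollbar.value = Mathf.Clamp01(startValue + eventData.NormalizedOffset.x * scrollSpeed);
    }

    public void OnNavigationCompleted(NavigationEventData eventData) { }
    public void OnNavigationCanceled(NavigationEventData eventData) { }
}
```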

I don't think I'll get to add native support for drag events to InputManager any time soon, but if someone wants to try it, they are welcome to do so! Otherwise, the alternate solution I proposed should be relatively simple to implement in a project.