philc / vimium

The hacker's browser.
https://chrome.google.com/webstore/detail/vimium/dbepggeogbaibhgnhhndojpepiihcmeb
MIT License
23.49k stars 2.49k forks source link

Questions about the handler stack #2491

Open eejdoowad opened 7 years ago

eejdoowad commented 7 years ago

I'm just wondering about the wisdom behind it. I've stared at the Vimium codebase probably more than anyone who isn't a core contributor, and I can't wrap my head around it.

What are its advantages? I'm implementing my own extension like Vimium, and I use a simple state machine, and it seems to work fine, which scares me. I feel like I'm overlooking some detail that will come back to haunt me. I've implemented Command, Text, and Hints modes. Is there something about the other modes, or some particular feature that necessitates the control flow provided by the handler stack?

mrmr1993 commented 7 years ago

Is there something about the other modes, or some particular feature that necessitates the control flow provided by the handler stack?

Certainly not. Notably, see #1413 vs #1425 for my desperate attempts to stop this from happening in the codebase. (See also this comment for an example of a bug that still results from this decision.)

I use a simple state machine, and it seems to work fine, which scares me. I feel like I'm overlooking some detail that will come back to haunt me.

My mental model of Vimium is essentially an FSM. When the handler stack disagrees with it, it's nearly always a bug, or at least an edge case that needs discussion. Vimium doesn't currently do anything beyond the scope of a state machine, and so I expect you will be fine using one.

I'll add that the handler stack makes control flow opaque, which has made my writing, reviewing, and debugging of code much harder than it needed to be. It was my biggest reason for abandoning the project for the several months that I did.

I would strongly recommend against adding a global handler stack to your codebase. For a project like this, you will only suffer for it.

smblott-github commented 7 years ago

@eejdoowad... @mrmr1993 and I disagree about this. The use of the handler stack makes it easy to write reusable code (for example, the code to exit a mode on Escape is implemented just once) and to have functionality implemented in a self-contained way. For example, find mode works in normal mode (of course), but also in visual mode, and in caret mode. Just push its handlers onto the handler stack, and it works.

The approach has also allowed is to add new commands and features without having to weave them into the existing programming logic. Examples are grab back focus and pass next key. We just push the mode with its handlers on to the stack and it works.

What's really happening is that we're putting the logic of @mrmr1993's FSM in data (not in program logic). I guess it's a philosophical difference, but data is usually easier to maintain than logic.

eejdoowad commented 7 years ago

@mrmr1993. thanks for allaying my concerns and the explanation.... breath of relief. Given that it's not absolutely needed, add my vote to the handler stack doesn't make sense group. I forked Vimium in the past to add voice commands, and trying to figure out how the handler stack fit into the picture was the biggest challenge. I tried to figure out how it worked, but gave up, made changes, and hoped nothing broke.

It would probably be a huge overhaul to rework Vimium's control flow, but there are major advantages to adapting the approach I've taken: a state machine with exactly one active state (corresponding to the active mode) that transitions on DOM events and messages.

This is an example of what debugging mode changes using a state machine is like. The code for handling mode changes is all in a single ~200 line file.

image

Seriously, take a look at how much simpler it becomes: Command Mode, Text Mode, mode changes.

EDIT:

@smblott-github, I don't think the reusability is unique to the handler stack. A mode can easily pass an event to some other mode's handler. The difference is the handler stack makes it implicit (the mode becomes unaware of the event), whereas a state machine makes it explicit. And unfortunately, implicit control flow leaves everyone but the one who wrote it scratching their heads. It makes it hard to reason about how the program works and the cause of errors. Find mode working in normal mode (of course), but also in visual mode, and in caret mode by simply pushing find mode's handler(s) on the stack is bad because:

  1. Implicit control flow means developers will often be confused about why the event isn't reaching the conceptually active mode because the event is blocked by find mode.
  2. The active mode loses the flexibility to customize how find mode behaves.
  3. Someone still has to remember to (and write code for) pushing and popping handlers from the handler stack somewhere. I think the current solution is to ensure modes inherit from other modes which takes care of pushing and popping handlers... which is not good.
  4. The actual implementation is insanely convoluted. Modes, Handlers, Prehandlers, posthandlers, temporary handlers, key counting, arbitrary inheritance, 7 different ways to a propagate an event down the stack, multiple handlers of the same type created at runtime on the stack... please give Vimium's maintainers my condolences.

Things like Escape mode exiting, grab back focus and pass keys, which might complicate modes if always explicitly handled, can be addressed with 'middleware' that intercept events before they are sent to the active mode's handler. This introduces complexity, but at least the default is simplicity, not complexity. After thinking about it some more, I've reaffirmed this strategy. Grab back focus and pass next key are conceptually NOT modes. Rather they modify how events should be treated (e.g., ignore focus events and send certain keys directly to the page respectively). The underlying mode should conceptually be the same before and after after these anomalous events. But fear not for the stack. It's a natural data structure for middleware.

I don't really understand how the data vs program analogy applies. Seems like program either way to me. At a high-level, a state machine is much more data-like since you're defining a state transition table of size S x E at compile time, where S is the number of states and E is the number of events. The handler stack, by contrast, has an unknown number of possible permutations (infinite), and you're left to figure things out at run time.

--

Concrete details about middleware.

If a mode's job is to define the function nextStateFunction :: event -> state, then middleware can be thought of as the transformation function shadow :: event -> nextStateFunction -> state.

Middleware can be implemented as a stack of handlers... (sounds familiar). When an event first arrives, it is first checked against the handlers in the stack, one at a time. A middleware handler can either 1) Return the next mode, or 2) pass the handler down the stack. If an event passes through all middleware in the stack without a new mode being determined, then it is passed to the active mode, which must return the next mode.

I did a basic (60 line) middleware implementation in my project that just has a single, always-installed middleware handler - no stack. Initially, it took care of pass keys, escape keys, and grab back focus. But then I realized that escape key handling really should be managed by the active mode, and pass keys aren't what I thought they were and are irrelevant to my extension (which will support complete keybinding configurability per domain). 8 lines were lost. The implementation is rudimentary and doesn't account for multiple frames or javascript page changes.

--

Conclusion: When Modes have more lines for managing the handler stack than for their own functionality, and when only two or three people actually understand how the handler stack work, you've got a problem. The state machine architecture is a simple alternative that sacrifices no functionality.