alacritty / vte

Parser for virtual terminal emulators
https://docs.rs/vte/
Apache License 2.0
242 stars 56 forks source link

Discard CSI sequences with too many parameters #26

Closed VojtechStep closed 4 years ago

VojtechStep commented 5 years ago

CSI sequences that don't fit into the internal buffer don't get reported as valid sequences to the Performer. Instead, such cases get reported as a warning, which is handled based on the calling binary's usage of different available frontends for the log crate.

Note that we cannot print the entire invalid sequence, because the rest of the parameters aren't in the buffer, and if they were there would be no reason for this PR, so only the first 16 parameters are reported.

Fixes #24

chrisduerr commented 5 years ago

I've wanted to play around with an alternative implementation for a bit, but once I was done I had already made some pretty significant changes. So I thought the best way of sharing would be to just push to this PR. Please note that I'm not trying to make a definite statement on what this PR should be here, this is simply my proposal for improving on your work. With the powers of git we could still just revert the changes if we would wish to do so.

Now to explain the changes I have made. The most uncontroversial stuff I've changed is probably that I've fixed all clippy issues and I refactored tests a bit and added a test specifically to check that MAX_PARAMS itself was still supported, not only that MAX_PARAMS + 1 is not.

I've also kinda switched up your logic a bit. Instead of checking for MAX_PARAMS and then setting a flag that we've reached the limit, I now just check num_params to see if we should throw away the escape. We can do this since we will always push a new escape sequence. If the escape ended with ;, we will push a new one with 0 before dispatching, otherwise we will dispatch the queued escape. This allows us to just fill up the params array all the way and check if we've reached our limit before dispatching, getting rid of the unnecessary variable.

Now for the thing I'm most conflicted about. I've removed the log dependency. This has various reasons.

First of all, it's a depedency to something that has worked completely without depedencies for now (utf8parse is part of the repo). For me personally this is not a good enough reason alone to block it, however it raises the bar a bit. Going from no external dependencies to one will mean we have to continually update vte to make sure that we do not cause duplicate dependencies downstream in Alacritty.

Another reason is that currently no child logs are printed in Alacritty at all. This is because there's just so much stuff logged, that it would trash Alacritty's log output. So if we just make this change, it would require explicit downstream patching to actually get any output of these errors.

The most significant argument though is that I question how useful this message would be. We don't actually want it as a warning, since that would pop open the message bar and we do not do that for unsupported escapes. What we would want to do is just log it. However these messages are only logged when logging is explicitly enabled, which even for debugging is rarely the case.

So the only scenario where this would be useful is if there is an application using more than 16 parameters (already unlikely), which causes a bug that directly impacts the user enough to notice and report it, leading to us guessing that this might be the issue (at which point we could probably just check manually), asking them to provide a debug log and then finding this message to identify and confirm the issue.

I wouldn't bet on that never happening, however I think the chances of this ever helping us are extremely, extremely slim. So with that considered and taking all the other factors into account, I think our best bet is to just ignore the escape and keep going, assuming that whenever this happens, it's either malicious or an accident.

However I can see the other side of it as well, so I'd be curious what @nixpulvis thinks too.

VojtechStep commented 5 years ago

I wasn't sure about the logging either, but because all the other emulators I worked with logged it, I thought we might as well.

However I agree that it adds maintenance cost and it would be nice to stay without dependencies.

Add it behind a feature flag maybe?

chrisduerr commented 5 years ago

Add it behind a feature flag maybe?

Since Alacritty is pretty much the only consumer of this crate for whom this is relevant, I don't think it makes sense to add features unless they're used by Alacritty. At least not unless they're explicitly requested.

VojtechStep commented 5 years ago

Among the crates dependent on vte is a repository called ransid, which AFAIK is a terminal emulator for RedoxOS and has over 10,000 downloads on crates.io. It also uses the log crate and logs invalid control sequences in debug, so they might be interested in such a feature.

chrisduerr commented 5 years ago

Orbterm is a decent point, however I still think if we make it a feature it should be requested if desired, rather than just providing it with the expectation that someone might potentially need it.

jwilm commented 5 years ago

One alternative to logging here directly would be to add another Perform method which is passed the garbage data instead of the csi dispatch method. If the implementor doesn't want to log or anything (ie blank impl), then it just gets compiled out. We could log if desired in Alacritty.

chrisduerr commented 5 years ago

Actually, now that you're mentioning it, couldn't we just set ignoring to true? It seems like the purpose of ignoring is kinda similar this.

It might also make it possible to leave it up to the application if it wants to discard the escape sequence or if it wants to parse whatever it got. So that could actually be a big advantage.

nixpulvis commented 5 years ago

The log crate is somewhat standard, but this would allow users of this library to always choose how they want to handle this case, be it a log, println!, or anything else. Given @chrisduerr's point about going from effectively 0 dependencies to 1 is a significant change, even if the log crate is widely used.

As for ignoring, it seems that's currently only used for intermediates. So this might work, but it would be changing the meaning to include ignoring params as well. What issues might this cause in client code? Unless I'm missing something, Alacritty doesn't use this value for anything but calls to unhandled! currently.

chrisduerr commented 5 years ago

As for ignoring, it seems that's currently only used for intermediates. So this might work, but it would be changing the meaning to include ignoring params as well. What issues might this cause in client code? Unless I'm missing something, Alacritty doesn't use this value for anything but calls to unhandled! currently.

It would certainly be a breaking change. But Alacritty basically just ignores the escape sequence when ignoring is set.

This certainly would be a big change in what ignoring does, but I think usually you would want to handle these two very similarly. The only big issue would be that you currently can tell exactly why ignoring was set and if we would use it for multiple things, it could be any of those things causing the flag to be set.

I'm not quite certain where the ignoring flag came from. I'm not sure if this is a standard thing with a commonly known name or if it was just made for this purpose in the API of the VTE crate.

If this is just a VTE thing, @jwilm's approach might actually improve the API a bit and allow us to remove all ignoring flags from all the methods in the crate and instead have just one 'catch-all' error handler which can be notified with a specific cause about why an escape couldn't be parsed.

VojtechStep commented 5 years ago

Looking at the description of the state machine this crate is based on (link), the ignore state is meant to be used for exactly this purpose: to consume the rest of a sequence that should be considered invalid.

Obviously because the standard does not set any limit to the number of parameters, there is no mention of what to do with long sequences, but I feel like we've reached the consensus that it's an invalid sequence.

chrisduerr commented 5 years ago

Looking at the description of the state machine this crate is based on (link), the ignore state is meant to be used for exactly this purpose: to consume the rest of a sequence that should be considered invalid.

This is exactly what I was looking for. So as I've suspected the ignore flag is a common thing that probably shouldn't be abused too hard.

However it does sound a lot better than I had feared. Following this doc, it seems to hit the nail exactly on its head, we'd just have to fix our documentation.

The biggest problem then would be to clearly communicate this as a breaking change. Since people will likely be unaware even if we bump the version number accordingly.

VojtechStep commented 4 years ago

@chrisduerr I rebased on master, implemented the functionality using the ignore flag, added relevant test cases and extended the documentation for csi_dispatch. Please let me know if there is anything else for me to do.

VojtechStep commented 4 years ago

I addressed most of your points, except for the dot in tests. I only copied what was in the existing test case above and IMO there should be another PR for unifying how the tests look, as I see it as out of scope for this one. I will probably look into it during the holiday.