PostHog / posthog

🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
https://posthog.com
Other
21.17k stars 1.25k forks source link

Nicer errors when user entered an invalid regex #4680

Open macobo opened 3 years ago

macobo commented 3 years ago

Is your feature request related to a problem?

Writing a correct regex is hard. Currently when user manages to write an invalid regex, they get an 5xx error back from our API.

Describe the solution you'd like

Let's instead show them a warning on the frontend?

Describe alternatives you've considered

Additional context

Thank you for your feature request – we love each and every one!

sentry-io[bot] commented 3 years ago

Sentry issue: POSTHOG-2X8

sentry-io[bot] commented 3 years ago

Sentry issue: POSTHOG-2X7

sentry-io[bot] commented 3 years ago

Sentry issue: POSTHOG-2V7

Twixes commented 3 years ago

ClickHouse and the plugin server use Google's RE2 for regex matching. It does have a Node bindings package, but it doesn't work in the browser, as in the end it's a C++ binary. We could use JS standard RegExp client-side instead which would largely be OK, though it's actually somewhat more lenient than RE2, so some 500s would still slip in. Alternatively this can be checked server-side when setting a regex field in Python, using google-re2, in which case validation would require a request, but would be fully reliable. Postgres is a slightly different story – its regex syntax (POSIX) differs in a few places, though not greatly. In its case only a Postgres query would really reliably verify a regex. However I'd argue we should stick to RE2 validation in this case too.

macobo commented 3 years ago

I think (at least) starting with some js-RegExp validation is a valid start, always can add complexity later if a lot of errors still.

clarkus commented 3 years ago

Related https://github.com/PostHog/posthog/issues/4326

paolodamico commented 3 years ago

As this seems to have caused confusion a bit of times and affecting funnel usage, adding to the priorities list for this sprint. We should consider the extra context from #4326 (i.e. let's add detailed instructions about the kind of operators that are supported).

clarkus commented 3 years ago

Let's focus this issue on addressing clearer instructions around regex. Secondary to that, we could use a simple error pattern for input fields to highlight errors and communicate how to recover from them. I filed https://github.com/PostHog/posthog/issues/5380 to track this work.

clarkus commented 3 years ago

I am working on this today and tomorrow. Do you have any guidance on what makes for a great regex authoring experience? So far I am tracking simple validation based on syntax. Would we also need some way to see how the expression evaluates to ensure that it's accurate AND well-formed? Secondary to this, I was planning on adding a simple syntax key or cheatsheat to show what properties we support and possibly even provide shortcuts for applying them into the input field that contains the regex. Any thoughts on what else we can address here? Thanks!

paolodamico commented 3 years ago

I sometimes use https://regex101.com/ because it visually shows the match groups and makes it easy to build and parse a regex string.

Here's the syntax we use in CH & plugin server. I would keep the cheatsheet to the most basic expressions and then link their docs.

Whenever we evaluate an expression I'm not sure how detailed errors we'd be able to provide on malformed expressions, but I would just show an error string.

Tagging @Twixes for any additional input

clarkus commented 3 years ago

Here are some ideas that are mostly intended to facilitate discussion, not so much prescribe a solution. Here's my thinking so far (please correct me if I'm wrong):

My biggest question is how likely are users to author complex regex patterns in the app? It seems like there are a lot of popular single serving sites that are really good at this now. We could try to approach that level of experience with our regex editor, or we could just make it easy to copy and paste text into those fields, and alert users of malformed values. Secondary to that, are there any common regex patterns we think users will want to use? Are there any kind of utility patterns that we could make available in the UI? The idea here is to make writing a regex easier by avoiding writing the regex entirely.

Some ideas for inline options. Validation would occur on blur / de-focus, or when the user clicks "reevaluate". It was about this time I realized a textarea might be a better experience.

Screen Shot 2021-08-04 at 3 49 27 PM

Our ability to summarize a complex regex is also a bit of a challenge. A summary view and a modular editing view might be worth considering as well. In this case, once a user selects the matches regex operator, when the value field is focused, we could trigger a modal containing a large-form editor. This would allow for both simple and complex patterns.

Screen Shot 2021-08-04 at 3 46 16 PM
Twixes commented 3 years ago

Syntax highlighting would be great, and we even already use a library in the product that can do this (react-syntax-highlighter which uses Prism.js under the hood). BUT unfortunately it doesn't work with inputs or textareas. They don't allow such styling so we'd have to engineer our own component that would work exactly like input – only by displaying text using raw HTML (a problem I also recently faced in https://github.com/desktop/desktop/pull/12465 🙈 ). I'd skip that for now as that would be a lot of effort for one component.

As for error messages, I definitely think they could be more useful by being specific. In fact I just made them such in #5456, which I think looks pretty fine.

I think validation should be automatic (and non-obtrusive, deblur/debounce should be good), and your mocks look really nice @clarkus. Much more coherent than the current error state message thrown in quickly. I would actually argue the option to edit a filter value in a textarea modal could be generalized to all operators in a useful way (though a bit concerned about this UI for operators where a filter can have multiple values).

Personally for the really complex expressions I use https://regexr.com/ and they have a nice cheatsheet. We definitely can't match the power of that app… but we could put a similar cheatsheet in our one.

Regexxxx

clarkus commented 3 years ago

Awesome feedback. Thank you, @Twixes.

Could we do some input fakery with a content-editable attribute on a div or some other element? That would probably get us closer to allowing structure inside that container element and hopefully then some syntax highlighting. As long as we could pull the string out of the element, it should work?

Twixes commented 3 years ago

We could do some fakery but then the input experience would be impacted (i.e. cursor moving, selection, etc.). That's why I'd rather go for the less tricky improvements first.

clarkus commented 3 years ago

I did some quick searching and found what might be a viable option that uses multiple elements to emulate an editor with syntax highlighting. https://css-tricks.com/creating-an-editable-textarea-that-supports-syntax-highlighted-code/

I'm not going to pursue that right now, but wanted to share it before I forgot. Maybe we can aim for something like this in the future. I'm going to look into incorporating modular editors for other operators as well. Thanks again for the input!

clarkus commented 3 years ago

I thought of another question How critical is it to have a sample string or representation of the property value being affected by the regular expression? Seems like having some sample might aid in writing a regex. I'm just now sure how representative the sample could be in this context.

clarkus commented 3 years ago

Couple more iterations that focus on the modular editor. I've moved validation to a standard position and made the reevaluate action consistent. The syntax reference is also placed opposite the label so that it doesn't conflict with a more verbose validation message. I've also used some subtle code formatting within the validation string to highlight the portion that represents the invalid characters.

Available in figma at https://www.figma.com/file/9yWtngNb1AIuf6KmXaEPJA/App-doodles?node-id=923%3A134375

Screen Shot 2021-08-05 at 1 17 06 PM
paolodamico commented 3 years ago

That last design looks amazing! @Twixes correct me if I'm wrong but I think we can evaluate on the go? Also by syntax highlighting you mean highlighting each match group so you can better understand how the Regexp is constructed?

Twixes commented 3 years ago

We can definitely evaluate on the go, though without some debouncing it would be annoying. Yup, syntax highlighting would help see match groups, but also even more importantly I think, special characters.

clarkus commented 3 years ago

I don't think we should validate on every key press. That could be really distracting when you're just trying to type in the expression you know you want.

@macobo's idea from https://github.com/PostHog/posthog/issues/5380#issuecomment-892408580 could be worth considering?

User has not typed for the past N ms

Twixes commented 3 years ago

Yup, that's debouncing, 500 ms is my go-to value in such cases (but can always be tweaked).

clarkus commented 3 years ago

Thanks, @Twixes. You taught me something new.

clarkus commented 3 years ago

Let's move ahead with the modular regex editor. We should trigger the modal when the user selects one of the regex operators from the list, or if the modal is closed and a regex operator is already selected, when the value field is selected. Once the regex is validated, on submit we will insert that regex value into the value field in the filter row. Said another way, the modal is replacing any kind of dropdown or other interstitial selection component.

Screen Shot 2021-08-17 at 12 27 27 PM Screen Shot 2021-08-17 at 12 34 54 PM Screen Shot 2021-08-17 at 12 35 31 PM
paolodamico commented 3 years ago
  1. Made a similar comment on #3375. Think it's worth having the option of users typing their regex expression directly in the main input (particularly thinking when you're using it in funnels) for when you want something simple. But then the user can have the option to prop open this modal to have a clearer workspace to create something complex.
  2. Should we get rid of the reevaluate button then?
  3. For the supported properties I think we could have an in-product popover with the list of expressions and then maybe an alternate option to go to the full docs. Wdyt?
clarkus commented 3 years ago

I think conditional editing modes is an interesting idea. I can work on some iterations for that. The reevaluate button was really meant to let a user test their regex without having to go through the whole debounce / blur to trigger validation. Some manual means of triggering that check could help users trust that they're entering a meaningful value here. An in-product popover could be helpful. Do we have that content or can we produce it to test this idea? It's really a question of how complex that cheat sheet could be.

The core change here is validation messaging. If we can ship that, then we could watch how users react and adjust our next iterations based on that. I'm not sure the other pieces are necessary to really improve the experience... I mean they would, but if users are authoring regex on external tools and copy/pasting into the control, then the cheat sheet and manual validation controls aren't as necessary.

clarkus commented 3 years ago

Here's an inline version of the regex editor. This removes the validation button. I am still working through how we might trigger a modular version of the editor.

Regex - inline

https://www.figma.com/file/9yWtngNb1AIuf6KmXaEPJA/App-doodles?node-id=1076%3A135967

Twixes commented 1 year ago

+1 based on ZEN-3909 (in particular we only return a server error when you try to use lookahead or lookbehind, which isn't supported by re2)