firasdib / Regex101

This repository is currently only used for issue tracking for www.regex101.com

@firasdib Ah I see, I understand that it's how the regexes are working under the hood, but it still isn't intuitive to hover over the second surrogate of the character when the character itself is rendered as one, it is literally a 1-pixel wide hitbox to get the hover for the 2nd surrogate. #2088

Closed java08800 closed 1 year ago

java08800 commented 1 year ago
@firasdib Ah I see. I understand that this is how the regexes work under the hood, but it still isn't intuitive to have to hover over the second surrogate of a character that is rendered as a single glyph; the hitbox for the second surrogate's hover is literally 1 pixel wide.

What if, instead, the hover displayed both surrogates when hovering over a multi-surrogate character? For example, hovering over the example character 𐌐 in a non-Unicode regex flavor could show a unified hover like this:

Text - matches literally (case-sensitive) the character pair: οΏ½ with index 55296₁₀ (D800₁₆ or 15400β‚ˆ) followed by οΏ½ with index 57104₁₀ (DF10₁₆ or 157420β‚ˆ)

This way, the regex decomposition of the displayed Unicode character is easy to view, and users unfamiliar with Unicode semantics are not misled into thinking their character is equal to only the first surrogate shown in the current hover.
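To make the decomposition concrete, here is a minimal Python sketch of how such a unified hover text could be derived from a single character. It only illustrates the standard UTF-16 surrogate arithmetic; the function names (`surrogate_pair`, `describe_unit`, `hover_text`) are hypothetical and are not taken from regex101's code, which may work quite differently.

```python
from typing import Tuple


def surrogate_pair(ch: str) -> Tuple[int, int]:
    """Split one character above U+FFFF into its UTF-16 high and low surrogates."""
    cp = ord(ch)
    if cp < 0x10000:
        raise ValueError("character is in the BMP and has no surrogate pair")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


def describe_unit(unit: int) -> str:
    """Format one code unit in decimal, hexadecimal and octal (hypothetical layout)."""
    return f"index {unit}\u2081\u2080 ({unit:X}\u2081\u2086 or {unit:o}\u2088)"


def hover_text(ch: str) -> str:
    """Build a single hover string covering both surrogates (hypothetical wording)."""
    high, low = surrogate_pair(ch)
    # U+FFFD stands in for the lone surrogates, which cannot be rendered on their own.
    return (
        "matches literally (case-sensitive) the character pair: "
        f"\ufffd with {describe_unit(high)} followed by "
        f"\ufffd with {describe_unit(low)}"
    )


if __name__ == "__main__":
    print(hover_text("\U00010310"))  # the example character from this issue
```

For the example character 𐌐 (U+10310), `surrogate_pair` returns the code units D800₁₆ and DF10₁₆, matching the pair in the proposed hover.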

Originally posted by @jhmaster2000 in https://github.com/firasdib/Regex101/issues/2061#issuecomment-1565036374