ChorusOne / solido

Lido for Solana is a Lido-DAO governed liquid staking protocol for the Solana blockchain.
https://chorusone.github.io/solido/
GNU General Public License v3.0

Improve validator name sanitizer #580

Closed by ruuda 2 years ago

ruuda commented 2 years ago

One of the validators in the last onboarding wave has U+26A1 HIGH VOLTAGE SIGN and U+FE0F VARIATION SELECTOR-16 in its name. In my terminal and also in Grafana in my browser, these render as a lightning bolt using an emoji font.

Also, because of how we render labels in the metrics output, the variation selector turns into the literal text "\u{fe0f}" (a backslash followed by hex digits in curly brackets, not U+FE0F itself).
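For illustration, a minimal sketch of how such an escape can appear, assuming the label value is written with Rust's `{:?}` (Debug) formatting. The thread does not confirm that this is what the metrics code does, and `render_label` is a hypothetical helper, not a function from the repository.

```rust
/// Hypothetical helper: renders one `key="value"` label pair.
/// Assumption: the value is formatted with `{:?}`, which adds the
/// surrounding quotes and escapes certain code points.
fn render_label(key: &str, value: &str) -> String {
    // Debug formatting of a string escapes grapheme-extending code points
    // such as U+FE0F as `\u{fe0f}`, while printable symbols such as
    // U+26A1 pass through unchanged.
    format!("{}={:?}", key, value)
}
```

Under that assumption, `render_label("validator", "Bolt \u{26a1}\u{fe0f}")` keeps the lightning bolt but replaces the variation selector with the literal characters `\u{fe0f}`.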

Strip code points from both blocks to fix this.
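A rough sketch of that filter, assuming "both blocks" means the Unicode blocks containing the two code points: Miscellaneous Symbols (U+2600 to U+26FF) and Variation Selectors (U+FE00 to U+FE0F). The function names are hypothetical, and trimming leftover whitespace is an extra choice made here, not something the issue asks for.

```rust
/// Returns true for code points in the two blocks to strip.
/// Assumption: "both blocks" refers to Miscellaneous Symbols
/// (U+2600..=U+26FF) and Variation Selectors (U+FE00..=U+FE0F).
fn is_stripped(c: char) -> bool {
    matches!(c, '\u{2600}'..='\u{26ff}' | '\u{fe00}'..='\u{fe0f}')
}

/// Hypothetical sanitizer: drops code points from those blocks and
/// trims whitespace left behind at either end of the name.
fn sanitize_validator_name(name: &str) -> String {
    name.chars()
        .filter(|c| !is_stripped(*c))
        .collect::<String>()
        .trim()
        .to_string()
}
```

With this filter a name such as "Bolt ⚡️" becomes "Bolt", while names with accented letters such as "Saint-Étienne" or "Münster" are left untouched.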

enriquefynn commented 2 years ago

Isn't it better to have the names in ASCII? There's probably some built-in function to do that

ruuda commented 2 years ago

> Isn't it better to have the names in ASCII?

There can be non-ASCII text that I would consider to be fine, e.g. “Saint-Étienne”, “Münster”, “O’Connor”. I haven’t found those among current Solana validators though.

> There's probably some built-in function to do that

This is a very hard problem and depends on what you want to do and assumptions about the source language and script too ... (E.g. do you convert ü to ue, ß to ss? Or ü to just u? Or do you drop it entirely? What about code points that have no obvious ASCII alternative?) But there is https://lib.rs/crates/deunicode which does an impressive job nonetheless.

enriquefynn commented 2 years ago

> This is a very hard problem and depends on what you want to do and assumptions about the source language and script too ... (E.g. do you convert ü to ue, ß to ss? Or ü to just u? Or do you drop it entirely? What about code points that have no obvious ASCII alternative?) But there is https://lib.rs/crates/deunicode which does an impressive job nonetheless.

That's fair, it's probably better to do what's in this PR than to depend on this crate 👍