ITDP / the-online-brt-planning-guide

Online collaborative version of the BRT Planning Guide
https://brtguide.itdp.org

Handle non-breaking spaces #137

Closed by jonasmalacofilho 7 years ago

jonasmalacofilho commented 7 years ago

First, I'm not sure we're going to allow them. While they might be useful, they can really screw up correct line breaking, and we don't have any other manual line breaking primitives.

Even if we do decide to support them, we should instead use a visible sequence (such as \nbspace). It's really annoying to depend on invisible characters (although we could still support UTF-8 encoded NB spaces for consistency).

Either way, let's refuse them in the Lexer for the time being.
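
To make "refuse them in the Lexer" concrete, here is a minimal sketch in Python of a check that fails at the first no-break space and reports where it is. The project's actual Lexer is not shown in this thread, so `reject_nbsp` and `LexerError` are hypothetical names, and the sketch assumes the source has already been decoded from UTF-8 into a string.

```python
class LexerError(Exception):
    """Raised when the source contains a character the lexer refuses."""


NBSP = "\u00a0"  # NO-BREAK SPACE; its UTF-8 encoding is the byte pair C2 A0


def reject_nbsp(source: str) -> None:
    """Raise LexerError at the first no-break space found in `source`.

    Hypothetical helper: a real lexer would do this check while tokenizing,
    but the reporting idea (1-based line and column) is the same.
    """
    for line_no, line in enumerate(source.split("\n"), start=1):
        col = line.find(NBSP)  # 0-based index, or -1 if absent
        if col != -1:
            raise LexerError(
                f"no-break space (U+00A0) at line {line_no}, column {col + 1}"
            )
```

Reporting the position matters here precisely because the character is invisible: without a line and column, the error would be as hard to act on as the original problem.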


I've been removing all current cases by hand. Still, it's really annoying to check and handle these spaces every time because of their invisible nature.

For reference, I've been removing them with

sed -i -e 's/\xc2\xa0/ /g' guide/**/*.{src,manu,txt}

and then manually diffing and checking the results.

However, if I remember correctly, there are other UTF-8 sequences for these spaces (or for similar invisible yet special characters). We really should handle these cases in the Lexer.
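
As a rough illustration of what those "other sequences" can look like, the sketch below scans text for a small, hand-picked set of invisible or non-breaking code points and reports each hit with its UTF-8 bytes. The list is an assumption chosen for illustration, not an exhaustive or project-approved set, and `SUSPECT` and `find_suspects` are hypothetical names.

```python
# Hand-picked, non-exhaustive set of invisible or non-breaking characters.
# Which of these the Lexer should refuse is an open question in this thread.
SUSPECT = {
    "\u00a0": "NO-BREAK SPACE",
    "\u2007": "FIGURE SPACE",
    "\u202f": "NARROW NO-BREAK SPACE",
    "\u200b": "ZERO WIDTH SPACE",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_suspects(text: str) -> list:
    """Return (line, column, code point, UTF-8 hex) for each suspect character.

    Lines and columns are 1-based; columns count characters, not bytes.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                hits.append(
                    (line_no, col, f"U+{ord(ch):04X}", ch.encode("utf-8").hex())
                )
    return hits


if __name__ == "__main__":
    sample = "a\u00a0b\nc\u202fd"
    for line_no, col, code_point, utf8_hex in find_suspects(sample):
        print(line_no, col, code_point, utf8_hex)
```

Only U+00A0 encodes as `c2a0`; the others listed here are three-byte sequences, so a byte-level replacement that targets `\xc2\xa0` alone would not catch them.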