If I get it right, this line in `SmartHTML` is supposed to escape HTML special characters: https://github.com/ericvaandering/DocDB/blob/4e5b406f9385862fb8f236fed1ecfa04c6ae663c/DocDB/cgi/HTMLUtilities.pm#L33-L34

But it messes up non-ASCII symbols in UTF-8, like "é". Consider using some other UTF-8-friendly way for that.
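
One possible UTF-8-friendly approach, as a minimal sketch: escape only the HTML metacharacters and leave every other character, including non-ASCII ones, untouched. This assumes the goal of that line is just to neutralize HTML markup. The names `escape_html_special` and `%HTML_ESCAPES` are hypothetical and do not exist in DocDB.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of DocDB: map only the HTML
# metacharacters to entities and leave all other characters as they are.
my %HTML_ESCAPES = (
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  '"' => '&quot;',
  "'" => '&#39;',
);

sub escape_html_special {
  my ($text) = @_;
  # A single pass over the string, so an "&" produced by one
  # replacement is never escaped a second time.
  $text =~ s/([&<>"'])/$HTML_ESCAPES{$1}/g;
  return $text;
}

# "\x{E9}" is "é"; it passes through unchanged.
my $escaped = escape_html_special("caf\x{E9} <b>R&D</b>");
# $escaped is now "caf\x{E9} &lt;b&gt;R&amp;D&lt;/b&gt;"
```

Because non-ASCII characters are never rewritten, the result is the same whether or not the string has been decoded from UTF-8, as long as the page is served with a matching charset.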
Things get more complicated with this code: https://github.com/ericvaandering/DocDB/blob/4e5b406f9385862fb8f236fed1ecfa04c6ae663c/DocDB/cgi/UntaintHTML.pm#L32-L36

When chained with `SmartHTML`, we get some kind of double encoding and end up with symbols rendered as `&x1234;` in a browser. What is the purpose of `encode_entities_numeric`: is it some sanitizing similar to the regex in `SmartHTML` above? If that's the case, and if `is_valid` is always followed by `SmartHTML`, then `encode_entities_numeric` could probably just be removed. But I'm no expert in Perl or DocDB and need advice on it.

Also, `is_valid` is probably a confusing name, as it doesn't really check that something is valid; it seems to modify things instead.
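
To illustrate how the double encoding described above can arise, here is a small sketch. Both functions are simplified, hypothetical stand-ins rather than the actual DocDB or `HTML::Entities` code: the first turns non-ASCII characters into numeric character references, and the second is a later pass that escapes `&`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a numeric-entity encoder: replaces every
# non-ASCII character with a hexadecimal character reference.
sub numeric_encode {
  my ($text) = @_;
  $text =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord($1))/ge;
  return $text;
}

# Hypothetical stand-in for a later escaping pass that treats "&" as special.
sub escape_ampersand {
  my ($text) = @_;
  $text =~ s/&/&amp;/g;
  return $text;
}

my $once  = numeric_encode("caf\x{E9}");  # "caf&#xE9;"
my $twice = escape_ampersand($once);      # "caf&amp;#xE9;"
```

A browser decodes `&amp;` back to `&` and then shows the rest literally, so the user sees the entity text instead of "é". Under this assumption, running only one of the two passes avoids the problem.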