gbv / bartoc.org

Source code of BARTOC.org user interface
https://bartoc.org/

Harden stats script against potential issues #207

Open stefandesu opened 7 months ago

stefandesu commented 7 months ago

The following part of the stats script can cause issues for certain edge cases in the data:

https://github.com/gbv/bartoc.org/blob/710f202eb557bf73df38a8118dfd64f8e76fae9d/bin/stats.sh#L11

For example, I noticed that some addresses contained a tab character (\t), which caused jq to fail at this point with the following error:

"jq: parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 48, column 21"

I think that, in this particular case, the error occurs because escape sequences such as \t and \n are interpreted by Perl, and so turned into raw control characters, before the output is piped into jq. See also: https://github.com/jqlang/jq/issues/1049#issuecomment-598247299
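To illustrate the suspected failure mode independently of the script, here is a minimal Python sketch. It is not the fix for stats.sh and does not reproduce its pipeline. The record and its "address" field are hypothetical, and the string replacement only simulates a tool that expands \t before a JSON parser sees the text. Python's strict JSON parser rejects the raw tab much like jq does, while strict=False accepts it.

```python
import json

# Hypothetical record: an address containing a literal tab character.
record = {"address": "Main Street 1\tBuilding B"}

# Correct serialization escapes the tab as the two characters "\" and "t".
valid = json.dumps(record)

# Simulate a tool that expands escape sequences before the JSON parser
# sees the text (the behaviour suspected in this issue): the escaped \t
# becomes a raw U+0009 inside the JSON string, which is invalid JSON.
corrupted = valid.replace("\\t", "\t")


def parses(text: str) -> bool:
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def parse_lenient(text: str) -> dict:
    """Parse JSON while tolerating raw control characters in strings."""
    return json.loads(text, strict=False)


if __name__ == "__main__":
    print("valid parses:", parses(valid))
    print("corrupted parses:", parses(corrupted))
    # Re-serializing the leniently parsed data escapes the tab again.
    print("repaired equals valid:", json.dumps(parse_lenient(corrupted)) == valid)
```

If this is what happens in the script, the more robust fix would be to keep the escape sequences intact on the way to jq, rather than to parse leniently afterwards.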

I fixed the affected data at https://bartoc.org, so the script runs fine at the moment, but we should still harden the script itself at some point.