Open jteresco opened 3 years ago
It seems the .log files should be generated with a UTF-8 encoding.
~I tried slapping a UTF-8 byte order mark at the beginning of a .log file, and serving it from yakra.teresco.org. It was a no go.~
> when generated by Python
Python & C++ flavored logs are the same.
diff <(tail -n +2 python-teresco/logs/users/$username.log) <(tail -n +2 cplusplus/logs/users/$username.log)
produces no output. Just to be sure, I'll run a Python-flavored site update on lab2 and edit this post. Edit: I see Baden-Württemberg. Python-flavored logs look good on CentOS.
> or served by Apache on FreeBSD
I think this is where the issue lies.
> It seems the .log files should be generated with a UTF-8 encoding.
If I slap a UTF-8 byte order mark at the beginning of the .log file, things look better, but not perfect:
http://yakra.teresco.org/logs/UTF-8/duke87.orig.log
http://yakra.teresco.org/logs/UTF-8/duke87.utf8.log
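For anyone wanting to repeat the byte order mark experiment, here is a minimal Python sketch. The helper name `add_utf8_bom` and the sample text are made up for illustration; this is not how the .utf8.log file above was actually produced.

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 byte order mark prepended, unless one is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


# Hypothetical log content, encoded as UTF-8.
sample = "Montréal, Bécancour, Québec\n".encode("utf-8")
marked = add_utf8_bom(sample)

print(marked[:3])                  # the three BOM bytes
print(marked.decode("utf-8-sig"))  # "utf-8-sig" strips the BOM again on decode
```

The BOM only hints at the encoding to whatever reads the file. It does not repair bytes that were not valid UTF-8 to begin with, which may be why the result looks better but not perfect.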
I'm getting Montréal, Bécancour and Québec on CentOS as well. Note also that we have Montréal on A-20.
Hopefully once I have a go at canqca_con.csv ~(also canqca.csv?)~ with a hex editor this will clear up too.
Edit: See https://github.com/TravelMapping/HighwayData/pull/4377
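As an alternative to a hex editor, a short Python sketch can point at the lines of a file that are not valid UTF-8. The function name `find_invalid_utf8_lines` and the sample bytes are hypothetical, not taken from canqca_con.csv. The second sample line uses a lone Latin-1 `0xE9` byte for "é", which is one way such characters can end up in a file.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line number, byte offset, raw line) for each line that fails to decode as UTF-8."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the offset of the first offending byte within the line
            bad.append((number, err.start, line))
    return bad


# Hypothetical data: line 1 is valid UTF-8, line 2 has a Latin-1 encoded "é".
sample = b"A-20;Montr\xc3\xa9al\nA-40;Qu\xe9bec\n"

for number, offset, line in find_invalid_utf8_lines(sample):
    print(number, offset, line)
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes in.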
@jteresco, what do you have for `AddDefaultCharset` in httpd.conf?
Seems sometimes it has problems?
Non-ASCII characters also appear in siteupdate.log when listing the commented-out systems.csv lines.
In #189, the issue has come up about non-ASCII characters from updates.csv entries now getting into user .log files, and not displaying properly either when generated by Python or served by Apache on FreeBSD.
It seems the .log files should be generated with a UTF-8 encoding.
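On the Python side, one way to make sure the encoding does not depend on the platform's locale is to pass it explicitly when opening the file. A minimal sketch, with a hypothetical `write_log` helper and file name rather than the actual siteupdate code:

```python
import os
import tempfile


def write_log(path, lines):
    """Write lines to path as UTF-8, regardless of the system's default encoding."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for line in lines:
            f.write(line + "\n")


# Write a throwaway log and read the raw bytes back to confirm the encoding.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.log")  # hypothetical file name
    write_log(path, ["Baden-Württemberg", "Montréal"])
    with open(path, "rb") as f:
        raw = f.read()

print(raw)
```

Without the `encoding` argument, `open` falls back to a locale-dependent default, so the same script could write different bytes on different machines.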