danielhers closed this issue 5 years ago
truth be told, i would be a bit sad to back out of a perfectly healthy unicode solution 'just' because some of the pages served by CodaLab end up being displayed (in at least some browsers) with the wrong encoding. it almost seems that the public CodaLab instance (hosted in france, for all i can tell) serves its pages with an ISO-8859-1 charset header, while the mtool outputs are encoded as UTF-8.
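A minimal sketch of the suspected mismatch, assuming the output contains curly quotes (the sample string is illustrative, not actual mtool output). It uses `cp1252` because browsers treat a declared ISO-8859-1 charset as windows-1252:

```python
# Curly quotes as they might appear in UTF-8 encoded output (illustrative sample).
text = "\u2018\u2019"

# What actually goes over the wire: three bytes per quote.
utf8_bytes = text.encode("utf-8")

# What a browser shows when it decodes those bytes as windows-1252,
# which is how browsers handle a declared ISO-8859-1 charset.
misread = utf8_bytes.decode("cp1252")

# misread is now the six-character string "‘’".
```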
before we give up on unicode in mtool, could we try to force the right header on CodaLab:
https://www.w3.org/International/questions/qa-htaccess-charset
if that fails, we could still try to generate the files in ISO-8859-1 instead, presumably by setting LANG to something like 'en_US.iso88591' in the CodaLab environment that executes the validator?
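As an alternative to relying on LANG, the encoding could be pinned where the file is written. This is only a sketch: `write_latin1_html` is a hypothetical helper, and it assumes the validator writes its HTML through Python's `open`. Characters outside ISO-8859-1 are emitted as numeric character references, which HTML renders correctly under any charset.

```python
def write_latin1_html(path, text):
    """Write text as ISO-8859-1, independent of the process locale.

    Hypothetical helper. Characters that ISO-8859-1 cannot represent
    (such as curly quotes) become numeric references like &#8216;.
    """
    with open(path, "w", encoding="iso-8859-1", errors="xmlcharrefreplace") as f:
        f.write(text)
```

Numeric references are only appropriate for HTML output; for plain-text files, `errors="replace"` would substitute a question mark instead.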
I don't think we have access to .htaccess on CodaLab. At least the interface I am aware of is only through the HTML menus, not by terminal access to the server. We could try to replace the docker image used for the validator ("scoring program"), but I think that might be overkill.
OK, I added <meta charset="UTF-8"> to the "detailed results" page in the evaluation output (https://github.com/cfmrp/codalab/commit/1413954d71f6ce0c4f3dd3ef5bfb89dab5267d23), so now it shows correctly there at least. The stderr report still seems to use ISO-8859-1?
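For illustration, a page that declares its own encoding might be generated roughly like this. The function `render_page` and the page layout are assumptions for the sketch, not the code from the commit above:

```python
import html


def render_page(title, body_text):
    """Build a small HTML page that declares UTF-8 in its head.

    Hypothetical helper; the real "detailed results" page may be
    structured differently.
    """
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta charset="UTF-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"<pre>{html.escape(body_text)}</pre>\n"
        "</body>\n</html>\n"
    )


page = render_page("Detailed results", "unexpected \u2018label\u2019 <x>")
# The declaration only helps if the bytes really are UTF-8.
page_bytes = page.encode("utf-8")
```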
I'm just replacing the quotes when printing to stderr (https://github.com/cfmrp/codalab/commit/ac37aff006f2a0daccfc010e47ffe9574c1f20b3).
Otherwise they show as ‘’ in HTML.
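Replacing the quotes before printing could look something like the following. This is a sketch, not the code from the commit; `to_ascii_quotes` and the choice of which characters to map are assumptions:

```python
import sys

# Map typographic quotes to their ASCII counterparts (assumed set of characters).
ASCII_QUOTES = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def to_ascii_quotes(message):
    """Return message with curly quotes replaced by plain ASCII quotes."""
    return message.translate(ASCII_QUOTES)


# The message text here is made up for the example.
print(to_ascii_quotes("missing \u2018top\u2019 node"), file=sys.stderr)
```

Since the result is pure ASCII for these characters, it displays the same whether the page is read as UTF-8 or ISO-8859-1.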