Open jakubklimek opened 2 years ago
For what it's worth, the RDF4J documentation for the Workbench mentions this and suggests reconfiguring Tomcat as a workaround (see https://rdf4j.org/documentation/tools/server-workbench/#configuring-rdf4j-workbench-for-utf-8-support).
In short: uncomment the `setCharacterEncodingFilter` filter in `conf/web.xml` of your Tomcat installation, then restart Tomcat.
~I also note that I cannot reproduce the issue locally.~ EDIT: I have now managed to reproduce it on a locally running Tomcat (8.5). My earlier attempt used a recent Docker image, and our Docker image applies the suggested fix for POST requests in its Tomcat config.
The core of the problem seems to be in how the jQuery frontend communicates with the Workbench servlet. It submits POST requests using standard form encoding, and Tomcat by default decodes form-encoded data as ISO-8859-1. The workaround suggested in the documentation is to configure Tomcat to use UTF-8 for form-encoded data, but perhaps we should look into an approach that gives us more control over the chosen character encoding ourselves.
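To illustrate the mechanism described above, here is a minimal Python sketch of what happens when a form body encoded as UTF-8 is decoded as ISO-8859-1. It is not the Workbench or Tomcat code; the query string and variable names are hypothetical and chosen only to show the effect.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical query text containing one non-ASCII character.
query = 'SELECT * WHERE { ?s ?p "é" }'

# Client side: the form body is percent-encoded from the UTF-8 bytes.
body = urlencode({"query": query}, encoding="utf-8")

# Server side, assuming an ISO-8859-1 default: each UTF-8 byte
# is read as a separate character, so "é" becomes two characters.
mangled = parse_qs(body, encoding="iso-8859-1")["query"][0]

# Server side, when the request encoding is set to UTF-8.
intact = parse_qs(body, encoding="utf-8")["query"][0]

print(mangled)
print(intact)
```

The mismatch only affects characters outside ASCII, which would be consistent with plain-ASCII queries working while queries with Unicode characters fail.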
Current Behavior
When entering a SPARQL query containing Unicode characters in Workbench (3.7.4 and 4.0.0M2) running on a localhost Tomcat instance (9.0.56, JDK 17.0.1), e.g.:
I first get a notice that the query will be POSTed; then a lexical error occurs, and the query gets rewritten to:
Expected Behavior
The query should execute correctly and the encoding should not be mangled.
Steps To Reproduce
No response
Version
3.7.4, 4.0.0M2
Are you interested in contributing a solution yourself?
No
Anything else?
No response