Open lucafavatella opened 7 years ago
Solr
A note in the documented changes of Solr 4.1.0 regarding portability of Solr across Web containers points out that "Query strings passed in via the URL need to be properly-%-escaped, UTF-8 encoded bytes, otherwise Solr refuses to handle the request". A note in the documented changes of Solr 4.5.0 mentions that the encoding of query parameters can be set via the `ie` parameter (e.g. `ie=iso-8859-1`), that the encoding of the POST request body can be set via the `Content-Type` header (e.g. `application/x-www-form-urlencoded; charset=iso-8859-1`), and that UTF-8 is the default encoding. As of Solr 4.10.4, UTF-8 is still the default encoding for both query parameters and the POST request body.
Riak Search
The version of yokozuna in riak kv 2.2.3 is 2.1.10, which integrates Solr 4.10.4 (see also https://github.com/basho/yokozuna/pull/709/commits/7f0d464b9190ee6db115aa4bfcd38f6407791e4a), whose documentation is available online.
Yokozuna 2.1.10 depends on riak_kv 2.1.7, which via riak_api 2.1.6 depends on basho/webmachine 1.10.8-basho1 (containing e.g. module `wrq`), which in turn depends on mochiweb v2.9.0p2 (containing e.g. module `mochiweb_util`).
When receiving a search request, yokozuna calls the `search` function, which extracts the query (percent-decoded, but not decoded any further, e.g. as Unicode), appends some parameters related to distributed search, percent-encodes the parameters (again without any further encoding, e.g. as Unicode), and contacts Solr via a POST request with the content type header set to `application/x-www-form-urlencoded`.
As this content type header specifies no charset, Solr interprets the POST body as UTF-8.
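To make the consequence concrete, here is a minimal Python sketch of the byte-level pass-through described above. It is not yokozuna's Erlang code: `pass_through` and `decode_as` are hypothetical helper names, and strict decoding only stands in for a consumer that refuses malformed input, which may differ from what Solr actually does.

```python
from typing import Optional
from urllib.parse import quote, unquote, unquote_to_bytes


def pass_through(percent_encoded_query: str) -> str:
    """Hypothetical stand-in for the pass-through: percent-decode to raw
    bytes, then percent-encode the same bytes, never touching the charset."""
    raw_bytes = unquote_to_bytes(percent_encoded_query)
    return "q=" + quote(raw_bytes, safe="")


def decode_as(form_body: str, charset: str) -> Optional[str]:
    """Hypothetical stand-in for the receiver: decode the q value of a
    form-urlencoded body with the given charset; None if bytes are invalid."""
    value = form_body.split("=", 1)[1]
    try:
        return unquote(value, encoding=charset, errors="strict")
    except UnicodeDecodeError:
        return None


# A client that percent-encodes an ISO-8859-1 query sends the byte 0xE9.
latin1_query = quote("caf\u00e9", encoding="iso-8859-1")

# The bytes survive the pass-through unchanged.
body = pass_through(latin1_query)

# With no charset in the Content-Type header the body is read as UTF-8,
# and a lone 0xE9 byte is not valid UTF-8.
as_utf8 = decode_as(body, "utf-8")

# With charset=iso-8859-1 declared, the original text is recovered.
as_latin1 = decode_as(body, "iso-8859-1")
```

Under these assumptions the query only survives if the client sent UTF-8 bytes in the first place, or if the charset is declared to the receiver through the `Content-Type` header.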