elastic / stream2es

Stream data into ES (Wikipedia, Twitter, stdin, or other ESes)

Wrong Content-Type sent on requests #72

Open damienalexandre opened 6 years ago

damienalexandre commented 6 years ago

This tool sends its requests to Elasticsearch with this Content-Type:

Content-Type: text/plain; charset=UTF-8

This is wrong: it should be application/json; charset=UTF-8. It also makes the tool incompatible with Elasticsearch 6, which rejects the request with a 406:

java -jar stream2es wiki --target 'http://elastic:passwordedited@localhost:9200/wikipedia'

clojure.lang.ExceptionInfo: clj-http: status 406 {:request-time 2, :repeatable? false, :streaming? true, :chunked? false, :headers {"content-type" "application/json; charset=UTF-8"}, :orig-content-encoding nil, :status 406, :length -1, :body "{\"error\":\"Content-Type header [text/plain; charset=UTF-8] is not supported\",\"status\":406}", :trace-redirects ["http://elastic:passwordedited@localhost:9200/wikipedia"]}
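For comparison, here is a minimal Python sketch of a request carrying the header that Elasticsearch 6 accepts. It is not stream2es code. The helper name build_es_request, the PUT method, and the example settings body are assumptions made for illustration. The request is only built, never sent, so no cluster is needed to run it.

```python
import json
import urllib.request


def build_es_request(url, body):
    """Build (but do not send) a request with a JSON Content-Type.

    Hypothetical helper for illustration; stream2es itself uses clj-http.
    """
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",  # assumption: index creation via PUT /<index>
        headers={"Content-Type": "application/json; charset=UTF-8"},
    )


# Example body is an assumption, not what stream2es actually sends.
req = build_es_request(
    "http://localhost:9200/wikipedia",
    {"settings": {"number_of_shards": 1}},
)

# urllib stores header names capitalized as "Content-type".
print(req.get_header("Content-type"))
```

The point is only that the Content-Type is set explicitly to application/json instead of being left at a text/plain default.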

workmanw commented 6 years ago

@damienalexandre did you happen to find a workaround to this?

workmanw commented 6 years ago

For anyone else who runs into this issue: I was able to work around it with an NGINX proxy that overrides the Content-Type header. I know it's a hack, but it's an easy workaround.

Proxy config:

server {
  client_max_body_size 50M;
  listen 9999;

  location / {
    proxy_pass http://localhost:9200;
    proxy_set_header Content-Type application/json;
  }
}

stream2es command:

./stream2es wiki --max-docs 5 --source ~/enwiki-latest-pages-articles.xml.bz2 --target http://localhost:9999/wiki

martin-g commented 5 years ago

Even with the Nginx workaround by @workmanw it fails for me on the next step with:

clojure.lang.ExceptionInfo: clj-http: status 400 {:request-time 21, :repeatable? false, :streaming? true, :chunked? false, :headers {"Server" "nginx/1.14.0 (Ubuntu)", "Date" "Thu, 20 Dec 2018 14:57:34 GMT", "Content-Type" "application/json; charset=UTF-8", "Content-Length" "867", "Connection" "close"}, :orig-content-encoding nil, :status 400, :length 867, :body "{\"error\":{\"root_cause\":[{\"type\":\"illegal_argument_exception\",\"reason\":\"unknown setting [index.creation_date] please check that any required plugins are installed, or check the breaking changes documentation for removed settings\"}],

I have the feeling this tool does not support newer Elasticsearch versions (6.4.2 and 6.5.3 in my case).
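The 400 above suggests that the index-creation request included index.creation_date, which Elasticsearch rejects as an unknown setting. As a rough Python sketch of the kind of filtering that avoids this (not a patch for stream2es), server-managed keys can be dropped from a settings map before it is reused to create an index. The function name strip_read_only and the key list are assumptions. Only creation_date is confirmed by the error message.

```python
# Assumed set of server-managed index settings. Only "creation_date" is
# confirmed by the error above; the other keys are an assumption.
READ_ONLY_KEYS = {"creation_date", "uuid", "provided_name", "version"}


def strip_read_only(settings):
    """Return a copy of a nested settings dict without server-managed keys.

    Hypothetical helper; expects the {"index": {...}} shape.
    """
    index = settings.get("index", {})
    cleaned = {k: v for k, v in index.items() if k not in READ_ONLY_KEYS}
    return {"index": cleaned}


source_settings = {
    "index": {
        "creation_date": "1545317854000",
        "number_of_shards": "5",
        "number_of_replicas": "1",
    }
}

print(strip_read_only(source_settings))
```

Whether this is enough for stream2es to work against Elasticsearch 6 is untested here. It only addresses the one setting named in the error.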