priya299 / outreachy-project


Adding data to elastic search using curl #9

Open priya299 opened 8 years ago

priya299 commented 8 years ago

Following the Elasticsearch documentation, I tried to use curl to add the JSON output of raw Perceval and the threaded info to Elasticsearch. I used the bulk API because the JSON file contains a lot of data. As described in https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html, I added a header line to my JSON output, but when I finally tried to add it to Elasticsearch I got the following errors (screenshot: elastic_error).

jgbarah commented 8 years ago

First of all, please copy and paste the output into the ticket instead of attaching screenshots, so that it is easy to read and cite.

Second, without the contents of @output_perceval.json, the only thing I can tell you is what you see in the error message: the syntax is not ok.

jgbarah commented 8 years ago

OK, I hadn't realized I had the snippet referenced in the log of our last IRC meeting. Just for reference, it is at https://dpaste.de/VNC7/raw.

After looking at it, I've found the error is subtle. The key is in this text from https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html : "Because this format uses literal \n's as delimiters, please be sure that the JSON actions and sources are not pretty printed."

That means that you cannot have, as you do now, line breaks (\n) inside a JSON document in the file when you upload it through the raw interface: each action and each source has to fit on a single line. For example, this doesn't work when uploaded with curl:

{"index": {"_index": "perceval", "_type": "perceval", "_id": "0"}}
{
"property": "D37C879B.2A399%XXX@citrix.com"
}

But this works:

{"index": {"_index": "perceval", "_type": "perceval", "_id": "0"}}
{"property": "D37C879B.2A399%XXX@citrix.com"}

So, if you're using the raw interface, avoid those line breaks inside a document... BTW, this is one of those areas where using the elasticsearch.py will help...
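As a minimal sketch of the point above, the bulk body could be generated from Python so that every action and every source lands on exactly one line. The function name `to_bulk_body` and the sample document list are hypothetical, made up for illustration; only the standard `json` module is used, and the index and type names are taken from the example in this comment.

```python
import json


def to_bulk_body(docs, index="perceval", doc_type="perceval"):
    """Serialize documents as newline-delimited JSON for the bulk API.

    Hypothetical helper: not part of Perceval or of any Elasticsearch client.
    """
    lines = []
    for doc_id, doc in enumerate(docs):
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        # json.dumps without `indent` never emits a raw line break, so the
        # action and the source each stay on a single line.
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk format also expects the last line to end with a newline.
    return "\n".join(lines) + "\n"


docs = [{"property": "D37C879B.2A399%XXX@citrix.com"}]
body = to_bulk_body(docs)
```

Writing `body` to output_perceval.json would give a file with the same shape as the working example above. Line breaks inside string values are not a problem, because `json.dumps` escapes them.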

priya299 commented 8 years ago

I removed the \n and created a new output file, output_perceval.json. Then I executed this curl command:

curl -XPOST localhost:9200/output_perceval/bulk?pretty --data-binary @output_perceval.json

I got an empty file error. But when I executed the cat command I could see the messages in the output file. It is not empty. Error:

Warning: Couldn't read data from file "output_perceval.json", this makes an
Warning: empty POST.
{
  "error" : {
    "root_cause" : [ {
      "type" : "mapper_parsing_exception",
      "reason" : "failed to parse, document is empty"
    } ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse, document is empty"
  },
  "status" : 400
}

jgbarah commented 8 years ago

Sorry, I don't understand. What do you have exactly in the file output_perceval.json? What do you mean by "empty file error" and "when I executed cat command"?

In any case, the message from curl seems clear: it could not read the file that you're using as input, so it sent an empty POST, and the document that Elasticsearch gets is an empty document. Doing some tests, I get the same message if I use a filename which doesn't exist. Please make sure that the file output_perceval.json exists and is in the same directory where you run curl (or use its full path after "@"). Just in case, you can use "@output_perceval.json" instead of @output_perceval.json to be on the safe side...
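The checks suggested here can also be scripted before calling curl. The following is only a sketch: `check_bulk_file` is a hypothetical helper name, and the trailing-newline check is there because the bulk format expects the last line to end with a newline.

```python
import os


def check_bulk_file(path):
    """Return a short diagnosis for a file meant to be sent with --data-binary.

    Hypothetical helper, for illustration only.
    """
    if not os.path.isfile(path):
        # curl warns "Couldn't read data from file" in this case.
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as handle:
        data = handle.read()
    if not data.endswith(b"\n"):
        # The last line of a bulk body has to end with a newline.
        return "no trailing newline"
    return "ok"
```

Calling `check_bulk_file("output_perceval.json")` from the directory where curl is run reports whether the relative path resolves there at all.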