taraslayshchuk / es2csv

Export from an Elasticsearch into a CSV file
Apache License 2.0
510 stars 191 forks source link

timestamp range #4

Closed rdica closed 8 years ago

rdica commented 8 years ago

Thanks for this tool, works for my needs. I'm having a problem trying to specify a range using Lucene syntax on a field called 'timestamp', the format is Jan 20 11:33:40. My query string is

'status: bounced AND timestamp: ["Jan 20 00:00:00" TO "Jan 20th 12:33:00"]'

and this returns 0 results, but I see results in kibana for this range.

Google ssearches on Lucene range syntax state:

field: [value1 TO value2]

What am I doing wrong?

taraslayshchuk commented 8 years ago

Hi!

You should observe in which datetime format your 'timestamp' field stored in Elasticsearch. After that you could simply generate proper string query. The easiest way is run query with arg '-m 1' to get only one doc and look into output file to determine field format. Example:

taraslayshchuk@MBP-Taras ~ $ es2csv -i blog -q '*' -o file.csv -m 1
Found 1000 results
Hit max result limit: 1 records                          ] [0/1000] [  0%] [0:00:00] [ETA:  --:--:--] [  0.00  docs/s]
Write to csv [##############################################] [1/1] [100%] [0:00:00] [Time: 0:00:00] [  2.06 klines/s]

taraslayshchuk@MBP-Taras ~ $ cat file.csv 
email,first_name,gender,id,ip_address,last_name,timestamp
mwashington2@forbes.com,Marie,Female,3,74.73.42.155,Washington,2015-06-28T11:29:23Z

taraslayshchuk@MBP-Taras ~ $ es2csv -i blog -q 'timestamp: [2015-06-28T11:29:23Z TO 2015-07-28T11:29:23Z]' -o file.csvFound 89 results
Run query [################################################] [89/89] [100%] [0:00:00] [Time: 0:00:00] [  3.46 kdocs/s]
Write to csv [############################################] [89/89] [100%] [0:00:00] [Time: 0:00:00] [  9.97 klines/s]

taraslayshchuk@MBP-Taras ~ $ es2csv -i blog -q 'timestamp: [2015-06-28T11:29:23Z TO 2015-06-28T12:29:23Z]' -o file.csvFound 1 results
Run query [##################################################] [1/1] [100%] [0:00:00] [Time: 0:00:00] [123.47  docs/s]
Write to csv [##############################################] [1/1] [100%] [0:00:00] [Time: 0:00:00] [  2.42 klines/s]

taraslayshchuk@MBP-Taras ~ $ es2csv -i blog -q 'timestamp: [2015-06-28T11:29:23Z TO 2015-06-29T12:29:23Z]' -o file.csvFound 6 results
Run query [##################################################] [6/6] [100%] [0:00:00] [Time: 0:00:00] [  1.29 kdocs/s]
Write to csv [##############################################] [6/6] [100%] [0:00:00] [Time: 0:00:00] [  7.90 klines/s]

taraslayshchuk@MBP-Taras ~ $ cat file.csv 
email,first_name,gender,id,ip_address,last_name,timestamp
mwashington2@forbes.com,Marie,Female,3,74.73.42.155,Washington,2015-06-28T11:29:23Z
tortizku@samsung.com,Terry,Male,751,30.9.115.76,Ortiz,2015-06-29T05:27:07Z
spowellfg@cargocollective.com,Scott,Male,557,178.252.107.4,Powell,2015-06-28T23:25:25Z
twebbh4@go.com,Tammy,Female,617,120.196.137.243,Webb,2015-06-29T07:58:39Z
abowmanbj@whitehouse.gov,Aaron,Male,416,223.35.117.96,Bowman,2015-06-28T19:34:46Z
dhanson1b@phpbb.com,Doris,Female,48,106.254.230.45,Hanson,2015-06-28T13:15:30Z
rdica commented 8 years ago

Thanks for responding, per your command, the timestamp format is:

Jan 20 18:41:23

However, I also have a field @timestamp, in the format:

2016-01-20T00:00:23Z

Seems I can query a range using this, so I will do so, thanks for your help!

taraslayshchuk commented 8 years ago

If your field format really is 'Jan 20 18:41:23' you have field type 'string'. You can check it in mapping:

taraslayshchuk@MBP-Taras ~ $ curl -XGET 'localhost:9200/blog/_mapping'
{"blog":{"mappings":{"post":{"properties":{"email":{"type":"string"},"first_name":{"type":"string"},"gender":{"type":"string"},"id":{"type":"string"},"ip_address":{"type":"string"},"last_name":{"type":"string"},"timestamp":{"type":"date","format":"strict_date_optional_time||epoch_millis"}}}}}}

Range cannot be between two strings. You have to create new index with proper mapping for your field, have to upload docs and after that you could run query in date time format as it described in your mapping.

You welcome!