deanhiller / databus

time series data in cassandra with visualization(NREL's opensource databus project)
Other
53 stars 21 forks source link

Timestamp Support #6

Closed hopcroft closed 11 years ago

hopcroft commented 11 years ago

Add support for uploading data with timestamps in ASCII string ISO 8601 UTC basic format, e.g.:

20132406T180830Z

(Ref: http://en.wikipedia.org/wiki/ISO_8601)

Maybe throw in a few other less-common formats while you're at it (extended 8601, UTC offset, UK common (aka "Excel North America", etc).

Edit: changed to medium priority after further discussion [Priority: 4 medium]

jcollinsnrel commented 11 years ago

This feature is added in commit 3b3c343ef40e9a9c224ff1d4247cbb8d57daf01b. One note: I think your example date '20132406T180830Z' has the day and month location reversed. I've implemented it what I interpret to be the correct way: yyyyMMdd'T'HHmmss'Z'

If you dont agree with my interpretation, let me know. (I'm going to use the following as the basis for our documentation on this feature, so please let me know about any corrections...)

The mod is used on the input side like this: curl -X POST -w "\nRESULT CODE:%{http_code}\n" --insecure --header "Content-Type:application/json" --user jcollins:register:15505124196:b1:2361498898283920755 -d @ExampleData1.txt http://localhost:9000/api/dateformatV1/postdataV1 (notice the 'dateformatV1' preceding postdataV1) where the ExampleData1.txt file would look something like this: {"_dataset":[ {"_tableName":"jscteststream", "time":"20121230T070100Z", "value":2}, {"_tableName":"jscteststream", "time":"20121230T070200Z", "value":3}, {"_tableName":"jscteststream", "time":"20121230T070300Z", "value":4} ] }

for a table that has a 'time' column that is defined as a BigInteger (as would be usual for a timeseries) and a 'value' column of type BigInteger.

Retrieving the data in ISO date format looks like this (using my jscteststream data loaded above): http://localhost:9000/api/dateformatV1/rawdataV1/jscteststream/1356850860000/1356850980000 Which gives this output: { "data": [ { "time": "20121230T070100Z", "value": 2 }, { "time": "20121230T070200Z", "value": 3 }, { "time": "20121230T070300Z", "value": 4 } ], "error": "" } By default the output date pattern is the ISO date pattern "yyyyMMdd'T'HHmmss'Z'".

There are some options available with the dateformatV1 mod. dateFormat specifies the java SimpleDateFormat pattern to output: http://localhost:9000/api/dateformatV1(dateFormat=EEE%20MMM%20dd%20HH:mm:ss%20zzz%20yyyy)/rawdataV1/jscteststream/1356850860000/1356850980000 which will give output like this: { "data": [ { "time": "Sun Dec 30 00:01:00 MST 2012", "value": 2 }, { "time": "Sun Dec 30 00:02:00 MST 2012", "value": 3 }, { "time": "Sun Dec 30 00:03:00 MST 2012", "value": 4 } ], "error": "" } (unencoded that pattern is EEE MMM dd HH:mm:ss zzz yyyy) Notice the timezone here. By default you'll get the timezone of the web server, which is probably not what you want. To specify a different timezone, use the timeZone option:

http://localhost:9000/api/dateformatV1(dateFormat=EEE%20MMM%20dd%20HH:mm:ss%20zzz%20yyyy,timeZone=GMT)/rawdataV1/jscteststream/1356850860000/1356850980000

{ "data": [ { "time": "Sun Dec 30 07:01:00 GMT 2012", "value": 2 }, { "time": "Sun Dec 30 07:02:00 GMT 2012", "value": 3 }, { "time": "Sun Dec 30 07:03:00 GMT 2012", "value": 4 } ], "error": "" }

The (default) ISO date pattern is a special case since it ends with 'Z' it forces our output to GMT. If the ISO pattern is specified AND a timezone is specified, the specified timezone wins.

Finally, if you are operating on a relational (wide) table and the name of the column you want to format is not 'date' you can override the column that this mod operates on by specifying the 'columnName' option. In this case, my column is really named 'time' which is boring, so we'll use the 'relational' mod to create a column with a different name to operate on:

http://localhost:9000/api/dateformatV1(dateFormat=EEE%20MMM%20dd%20HH:mm:ss%20zzz%20yyyy,timeZone=GMT,columnName=datetime)/relational/datetime=time/rawdataV1/jscteststream/1356850860000/1356850980000

{ "data": [ { "time": 1356850860000, "value": 2, "datetime": "Sun Dec 30 07:01:00 GMT 2012" }, { "time": 1356850920000, "value": 3, "datetime": "Sun Dec 30 07:02:00 GMT 2012" }, { "time": 1356850980000, "value": 4, "datetime": "Sun Dec 30 07:03:00 GMT 2012" } ], "error": "" }

hopcroft commented 11 years ago

Great, thanks!

Let me repeat to be sure I understand- can you please correct me:

1) The table definition uses type "BigInteger" for date, even if the data that will be uploaded uses a date string, because the module "dateformatV1" will convert the string to an epoch time for storage.

2) When downloading data from the Databus storage, you can use the "dateformatV1" module to specify the format of the datestring that is downloaded. This datestring is converted from the stored value of epoch seconds.

3) Question: can the user specify the date string format during upload?