influxdata / kapacitor

Open source framework for processing, monitoring, and alerting on time series data
MIT License

Kapacitor record query converts integer to float #1338

Open hraftery opened 7 years ago

hraftery commented 7 years ago

Consider data in measurement m like so:

time  data
  1    5
  2    6
  3    4

If data has an integer field type, then the following TICKscript creates corresponding entries with an integer field type when enabled and running on live data:

stream
    |from()
        .measurement('m')
    |shift(1ns)
    |influxDBOut()
        .measurement('future_m')

However, to run on historical data, one must make and replay a recording like so:

RID=$(kapacitor record query -query $'SELECT data FROM "d"."autogen"."m" WHERE time >= \'2017-04-23T00:00:00Z\' AND time < \'2017-04-24T00:00:00Z\' GROUP BY * ' -type stream)
kapacitor replay -task my_task -recording $RID -rec-time

Maddeningly, this does not work, because the replay converts the integer field into a float, which is not compatible with an existing integer field of the same name. In fact, surprisingly, it creates another field with the same name, forever preventing the live script from working.

No amount of reordering replays and live data, or even casting with ::integer in the query, makes any difference. I'm tempted to change everything to float because I understand integer is a bit of a hangover, but I have quite a bit of architecture built around these integer measurements. I just wish kapacitor would return query results the same way influx does. Every time I go to do this I'm caught out by the differences.

In the end I found this nasty workaround:

stream
    |from()
        .measurement('m')
    |shift(1ns)
    |eval(lambda: int("data"))
        .as('data')
    |influxDBOut()
        .measurement('future_m')

which seems to work for now.

nathanielc commented 7 years ago

@hraftery I feel you on this one, it has been a pain point for a while. Unfortunately the workaround of using eval to force the type is the only solution we can provide until InfluxDB supports a serialization format other than JSON.