fsanaulla / chronicler-spark

InfluxDB connector to Apache Spark on top of Chronicler
Apache License 2.0

How to explicitly specify a time for each data point? #26

Closed (rabejens closed 2 years ago)

rabejens commented 3 years ago

I have files containing measured data with explicit timestamps. How can I keep those timestamps when saving the data points?

fsanaulla commented 3 years ago

Hey @rabejens,

Did you try using the @timestamp annotation?

rabejens commented 3 years ago

How do I use this with Spark?

Do I simply have to generate a case class like this:

case class Foo(@timestamp time: Long, @tag channelid: String, @field value: Long)

and then something like this:

val df = (...) // get some DF/DS
df.as[Foo].saveToInfluxDBMeas("mydb", "mymeasurement")

or do I have to do something else?

fsanaulla commented 3 years ago

Yes, you're completely right. A few notes:

Try it and let me know if you run into any issues.

rabejens commented 3 years ago

I currently have other things to do, but I'll try it as soon as I get to it. Our timestamps are already epoch timestamps in nanoseconds.

fsanaulla commented 3 years ago

Explicitly marking the timestamp with @epoch will generally give you faster writers.

rabejens commented 3 years ago

I am getting a lot of errors like this:

unable to parse 'MyMeasurement,measuringSystem=MyMeasuringSystem,subSystem=MySubSystem,channelId=SOMEID,channelName=SomeChannel,unit=°C ,value=26.100000381469727 1626870870017000000': invalid field format

My case class looks like this:

case class DataRecord(@epoch @timestamp time: Long,
                      @escape @tag measuringSystem: String,
                      @escape @tag subSystem      : String,
                      @escape @tag channelId      : String,
                      @escape @tag channelName    : String,
                      @escape @tag unit           : String,
                      @field value: Double)

The log of my server says:

[httpd] 10.244.3.0 - username [21/Jul/2021:12:40:06 +0000] "POST /write?db=mydb&p=%5BREDACTED%5D&u=username HTTP/1.1 " 400 822 "-" "requests-scala" c73adac3-ea20-11eb-8079-0ef32a0feffd 91

(much more like this)

What am I missing or doing wrong?

Follow-up

I tried sending this record with cURL and got the same error from my InfluxDB server. When I removed the comma before value, it worked. So how can I prevent Chronicler from adding the comma?
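For reference, the separator layout the server expects can be shown with a small hand-rolled sketch. This is not Chronicler's implementation: LineProtocolSketch, escapeTag and toLine are hypothetical names used only for illustration, and measurement-name escaping is left out. In InfluxDB line protocol, every tag is preceded by a comma, the tag set and the field set are separated by a single unescaped space, and commas appear only between fields, never before the first one.

```scala
// Hypothetical illustration of InfluxDB line protocol layout; not Chronicler's code.
object LineProtocolSketch {

  // Escape commas, equals signs and spaces in tag keys and values.
  def escapeTag(s: String): String =
    s.flatMap {
      case c @ (',' | '=' | ' ') => "\\" + c
      case c                     => c.toString
    }

  // Build one line: measurement, comma-prefixed tags, a space,
  // comma-separated fields, a space, then the epoch timestamp in nanoseconds.
  def toLine(
      measurement: String,
      tags: Seq[(String, String)],
      fields: Seq[(String, Double)],
      epochNanos: Long
  ): String = {
    val tagPart   = tags.map { case (k, v) => s",${escapeTag(k)}=${escapeTag(v)}" }.mkString
    val fieldPart = fields.map { case (k, v) => s"${escapeTag(k)}=$v" }.mkString(",")
    s"$measurement$tagPart $fieldPart $epochNanos"
  }
}
```

With a single field, the field set starts directly after the space, so a record shaped like the one above would end in "value=26.1 1626870870017000000" with no comma in front of value.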

Follow-up 2

I posted this as a bug report in the main Chronicler repository.

fsanaulla commented 2 years ago

Solved here