I am struggling with what really seems to be a bug.
In my situation I have two streams:
stream A holds the energy used by a machine, sampled 2880 times per day;
stream B holds the energy cost, sampled once per day.
In order to compute the cost at any of the 2880 timings per day (I need a time series for it for other purposes) what I do is:
I create a '_day' column in both stream A and stream B that stores the day timestamp, and I then group both streams by this new '_day' column;
I then take a LEFT join of stream A and stream B on the '_day' column to obtain a single stream in which I can compute the energy cost for any of the timings.
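For reference, a minimal sketch of the approach described above. It assumes the `join` package is what performs the LEFT join; the bucket names (`energy`, `prices`) and measurement names (`machine_energy`, `energy_cost`) are hypothetical placeholders, not the real ones from my setup.

```flux
import "date"
import "join"

// Stream A: energy used by the machine (2880 samples per day).
// Bucket and measurement names are hypothetical.
usage =
    from(bucket: "energy")
        |> range(start: -1d)
        |> filter(fn: (r) => r._measurement == "machine_energy")
        // Add the day timestamp, then group by it.
        |> map(fn: (r) => ({r with _day: date.truncate(t: r._time, unit: 1d)}))
        |> group(columns: ["_day"])

// Stream B: energy cost (1 sample per day).
price =
    from(bucket: "prices")
        |> range(start: -1d)
        |> filter(fn: (r) => r._measurement == "energy_cost")
        |> map(fn: (r) => ({r with _day: date.truncate(t: r._time, unit: 1d)}))
        |> group(columns: ["_day"])

// LEFT join on _day: every row of stream A should get that day's price.
join.left(
    left: usage,
    right: price,
    on: (l, r) => l._day == r._day,
    // Keep _day in the output so the group key is preserved.
    as: (l, r) => ({_time: l._time, _day: l._day, energy: l._value, price: r._value}),
)
```

The cost per timing can then be derived from the `energy` and `price` columns in a following `map()` step.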
This approach does work. The very subtle problem is that if stream A has N > 2000 rows, for some reason the joined table returns only N - 1000 rows.
So for a single day the joined table returns 1880 rows instead of 2880! If I limit stream A to 2000 rows I get 2000 rows back, but if I use 2001 rows I get back 1001 rows.
In the attached image, _value is the count of rows.
Thanks to anyone helping me.
Steps to reproduce:
Create a stream A with at least 2001 rows per day;
Add the _day column to stream A containing the day timestamp -> date.truncate(t: r._time, unit: 1d)
Create a stream B with 1 row per day;
Add the _day column to stream B containing the day timestamp -> date.truncate(t: r._time, unit: 1d)
Take a left/full join on the _day column
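The steps above can be sketched as a self-contained query on synthetic data. This is an untested sketch, not the exact query I ran: it assumes `generate.from` and `array.from` are acceptable stand-ins for the real streams, and the dates and the cost value 0.25 are made up for illustration.

```flux
import "array"
import "date"
import "generate"
import "join"

// Stream A: 2001 synthetic rows, all within a single day.
streamA =
    generate.from(
        count: 2001,
        fn: (n) => n,
        start: 2023-01-01T00:00:00Z,
        stop: 2023-01-02T00:00:00Z,
    )
        |> map(fn: (r) => ({r with _day: date.truncate(t: r._time, unit: 1d)}))
        |> group(columns: ["_day"])

// Stream B: a single row for the same day (illustrative cost value).
streamB =
    array.from(rows: [{_time: 2023-01-01T00:00:00Z, _value: 0.25}])
        |> map(fn: (r) => ({r with _day: date.truncate(t: r._time, unit: 1d)}))
        |> group(columns: ["_day"])

// Left join on _day, then count the joined rows per day.
join.left(
    left: streamA,
    right: streamB,
    on: (l, r) => l._day == r._day,
    as: (l, r) => ({_time: l._time, _day: l._day, _value: l._value, cost: r._value}),
)
    |> count()
```

Changing `count` to 2000 and then to 2001 should make it possible to compare the resulting row count on either side of the threshold.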
Expected behavior:
Get back a table with the same number of rows as stream A.
Actual behavior:
The returned table has the same number of rows as stream A only if stream A has 2000 rows or fewer.
Environment info:
System info: InfluxDB OSS running in a Docker container