Closed ghost closed 4 years ago
+1
GROUP BY time(1Y)
GROUP BY time(1M)
It was supposed to be included in the new flux language? Anyone can confirm that flux 0.12 that is included with influx_db 1.7.x has that capability ? I need to compare month by month 2016-2018 solar production but that can't be done easily with current query language.
I haven't tried it yet, but Flux seems to support months according to these docs
It is not yet supported in any version of Flux. We will update this issue once it does. But, this is still in plan for us to incorporate support for months, years and other human scale time ranges. We are working hard to move this forward.
+1
Hi All, I am new to influx, What if we create 2 columns(tags) with month, year? Then we can do monthly and yearly aggregation?
@pantlavanya, you could. But it will (endlessly) increase your series cardinality. Depending on your current tags, it could be a drastic or even unworkable option. As a time-series database, it's the job of Influx to handle time related functionality, at least IMO.
@pantlavanya, another limitation to this is that you would need to use tags for your month
and year
fields in order to do the grouping, and tags are strings only. So if you wanted to a month-to-date for some day in the month (say the 15th) for years passed 2015, you wouldn't be able to do something like select sum("some_field") where day_of_month < 15 and year > 2015 group by month, year
because of the string vs int comparison.
I've written some seriously hacky code on NodeJS to have aggregation windows of months and years, you can find it here. Would not recommend you run this in production, it's terrible.
+1 GROUP BY time(1M) GROUP BY time(1Y) GROUP BY time(1q) "quarterly"
order by priority
+1 GROUP BY time(1M) GROUP BY time(1Y) GROUP BY time(1q) "quarterly"
+1 GROUP BY time(1M) GROUP BY time(1Y)
It was supposed to be included in the new flux language? Anyone can confirm that flux 0.12 that is included with influx_db 1.7.x has that capability ? I need to compare month by month 2016-2018 solar production but that can't be done easily with current query language.
Tried it two days ago, and it worked for me.
Can you share a flux query that does that?
On April 28, 2019 00:25:34 eraelpeha notifications@github.com wrote:
It was supposed to be included in the new flux language? Anyone can confirm that flux 0.12 that is included with influx_db 1.7.x has that capability ? I need to compare month by month 2016-2018 solar production but that can't be done easily with current query language.Tried it two days ago, and it worked for me. — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.
Can you share a flux query that does that? … On April 28, 2019 00:25:34 eraelpeha @.***> wrote: It was supposed to be included in the new flux language? Anyone can confirm that flux 0.12 that is included with influx_db 1.7.x has that capability ? I need to compare month by month 2016-2018 solar production but that can't be done easily with current query language.Tried it two days ago, and it worked for me. — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.
Of course. For example, I have a counter value and this gives me the total amount of bytes received by month in a grafana table:
from(bucket: "sensordata/autogen")
|> filter(fn: (r) => r._measurement == "monitoring" and r.device == "fb01" and r.network == "TotalBytesReceived" and r._field == "bytes")
|> range($range)
|> window(every: 1mo)
|> increase()
We can't replace every influxql query with flux, it appears to be more than 10 times slower:
# time influx -database db -execute $'SELECT max("OIL_PRESSURE") FROM "db"."autogen"."main_plc" WHERE time > now()-7d AND "deviceId"=\'xx\' GROUP BY time(1d) FILL(null)'
output omitted
real 0m2.097s user 0m1.591s sys 0m0.093s
# time influx -database db -type 'flux' -execute 'from(bucket: "db")
> |> range(start: -7d)
> |> filter(fn: (r) => r._measurement == "main_plc" and r._field == "OIL_PRESSURE" and r.deviceId == "xx")
> |> window(every: 1d)
> |> max()'
output omitted
real 0m27.127s user 0m1.587s sys 0m0.084s
Update: if the implementation does what I think it does, the influxql query will read data tagged with deviceId xx from disk, whilst the flux query will read all data from disk and filter it in memory. This also means this will get progressively slower as more different deviceId tags are added. Something you need to keep in mind for your dataset.
Holy shit, that's slow.... how granular is your oil_pressure data? What hardware? My infuxdb holds now 3y of solar power power generation @ 1Hz and still grafana will pull the year graph within 5s for 2 solar generators. But at 10x that would put a lot of stress on my VM.
On 29/04/2019 09:59, Glenn Matthys wrote:
We can't replace every influxql query with flux, it appears to be more than 10 times slower:
|# time influx -database db -execute $'SELECT max("OIL_PRESSURE") FROM "db"."autogen"."main_plc" WHERE time > now()-7d AND "deviceId"=\'xx\' GROUP BY time(1d) FILL(null)'| /output omitted/
real 0m2.097s user 0m1.591s sys 0m0.093s
|# time influx -database db -type 'flux' -execute 'from(bucket: "db")
|> range(start: -7d) > |> filter(fn: (r) => r._measurement == "main_plc" and r._field == "OIL_PRESSURE" and r.deviceId == "xx") > |> window(every: 1d) > |> max()' |
/output omitted/
real 0m27.127s user 0m1.587s sys 0m0.084s
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/influxdata/influxdb/issues/3991#issuecomment-487483553, or mute the thread https://github.com/notifications/unsubscribe-auth/AFU2DKB6TD4KWC77HBKVU6TPS2TEXANCNFSM4BO7KISA.
Holy shit, that's slow.... how granular is your oil_pressure data? What hardware? My infuxdb holds now 3y of solar power power generation @ 1Hz and still grafana will pull the year graph within 5s for 2 solar generators. But at 10x that would put a lot of stress on my VM.
Hardware: 2 x Intel Xeon E5-2643v2 (6C/12T, 3.5 GHz), 128 GB RAM, data stored on Intel D3-S4510 SSD's in RAID 10 Database size: 84 GB Granularity is 10 Hz (one datapoint every 100 ms).
We haven't been focused on performance at this time. We are focused driving towards feature complete at the moment. Performance work is planned to begin soon.
I am trying the new possibility to "group by month" with flux.
I have values of a cumulated energy meter for every day and want to get the monthly energy consumption. Thus I would like to take the value at the end of the month and subtract the value of the beginning of the month.
I managed to setup the following script but face still a problem:
from(bucket: "smartfarm/autogen")
|> range(start: -1y)
|> filter(fn: (r) => r._measurement == "E_real_con")
|> window(every: 1mo, period: 1mo)
|> last()
|> group()
|> difference()
The resulting data is not grouped by calendaric month but different time intervals which you see in the screenshot (I could not copy the data from chronograf).
Is it possible to group by calendaric month (e.g. 2019-01-01 to 2019-01-31)?
Thanks!
same here, does the new version allow to group by month? (calendar)
No. This hasn't been implemented yet. It is on our backlog...and the plan is to include this with Flux (as we've stated in this thread). We absolutely understand the value and importance of getting this done.
Thanks @timhallinflux
Is there any workaround to this issue? or do we know when is going to be implemented (aprox... a year, 6 months ) ?
Thanks @timhallinflux for the clarification.
Furthermore: Is the intervals feature already implemented? I always get an error message if I try the following:
from(bucket: "smartfarm/autogen")
|> range(start: -1y)
|> filter(fn: (r) => r._measurement == "E_real_con")
|> window(intervals: intervals(every: 1mo, period: 1mo))
|> last()
|> group()
|> difference()
I get the following error message:
Two things to look at: https://docs.influxdata.com/flux/v0.24/functions/built-in/misc/intervals/ and https://docs.influxdata.com/flux/v0.24/functions/built-in/transformations/window/
I think you can simplify the window()
function as:
|> window(every: 1mo, period: 1mo)
I am trying the new possibility to "group by month" with flux.
I have values of a cumulated energy meter for every day and want to get the monthly energy consumption. Thus I would like to take the value at the end of the month and subtract the value of the beginning of the month.
I managed to setup the following script but face still a problem:
from(bucket: "smartfarm/autogen") |> range(start: -1y) |> filter(fn: (r) => r._measurement == "E_real_con") |> window(every: 1mo, period: 1mo) |> last() |> group() |> difference()
The resulting data is not grouped by calendaric month but different time intervals which you see in the screenshot (I could not copy the data from chronograf).
Is it possible to group by calendaric month (e.g. 2019-01-01 to 2019-01-31)?
Thanks!
What if you floor the start time down to the beginning of the first month instead of specifying a relative time description?
Any updates on this? It's been like four years since the initial request.
No. This hasn't been implemented yet. It is on our backlog...and the plan is to include this with Flux (as we've stated in this thread). We absolutely understand the value and importance of getting this done.
Does this mean that InfluxQL is deprecated as it seems you are no longer adding functionality to it?
@Pruxis don't count on it being fixed in a reasonable amount of time... You'll have to use the influx-fixes I posted earlier or decompose the aggregation by months into days, query influx, and then recompose the data into months.
@RedShift1 What about the changes in @Anaisdg's PR ? Will those allow to group by month and aggregate using e.g. sum ?
@konstantinblaesi That's the idea.
We can't replace every influxql query with flux, it appears to be more than 10 times slower:
I'm seeing something similar.. flux is 6 times slower doing a sum of 150k records, unworkable. 6x more hardware to finally get group by time (1mo)
and group by time (1y)
in a time series database (in which the group by time(1w)
week starts on Thursday 😕)..
Tried tagging months myself in the data, I think similar to the workaround of @Anaisdg . But then in Grafana I have to use X-Axis mode Series and stacking no longer works (https://github.com/grafana/grafana/issues/7949), so that doesn't work for me.
+1 for this feature request.
+1
@Anaisdg How can i use your Plugin? Is it in 2.0 alpha? Is it possible to update past measurements, so that all data has this tags?
@s4ndst0rm https://github.com/influxdata/telegraf/commit/3e5cfad2b051b454c50e248e544791d1aecd6586 says it's in
right?
+1 GROUP BY time(1M) GROUP BY time(1Y)
:sad:
@RedShift1 I tried flux too. When i window by month and aggregate with max function the results are little bit strange:
This is my query:
from(bucket: "openhab/autogen")
|> range(start: dashboardTime)
|> filter(fn: (r) => r._measurement == "Heizung_Total" and (r._field == "value"))
|> aggregateWindow(every: 1mo, fn: max)
|> group(columns: ["_time", "_start", "_stop", "_value"], mode: "except")
result:
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string,string |
-- | --
#group,false,false,false,false,false,false,true,true | | | | |
#default,_result,,,,,,, | | | | | | |
,result,table,_start,_stop,_time,_value,_field,_measurement | | | |
,,0,2019-08-01T06:19:00Z,2019-10-26T06:36:41.0575593Z,2019-08-08T00:00:00Z,20402.2945,value,Heizung_Total
,,0,2019-08-01T06:19:00Z,2019-10-26T06:36:41.0575593Z,2019-09-05T00:00:00Z,20567.5373,value,Heizung_Total
,,0,2019-08-01T06:19:00Z,2019-10-26T06:36:41.0575593Z,2019-10-03T00:00:00Z,20851.7778,value,Heizung_Total
,,0,2019-08-01T06:19:00Z,2019-10-26T06:36:41.0575593Z,2019-10-26T06:36:41.0575593Z,21150.6156,value,Heizung_Total
These are values of an energy meter so that the max value is usually at last day of month. Why is the time field not at last day of month?
@s4ndst0rm The current implementation of Flux months is equivalent to 30d which is why its not behaving as expected. That said we are currently updating the implementation to use real calendar months. See https://github.com/influxdata/flux/pull/2067 to follow our progress.
With regard to the months and years not always being the same what I would expect would be to knock x off the year or month and take the first existing date BEFORE that. so: 2019-03-30 - 1M = 2019-02-28 and similarly: 2020-02-29 - 1y = 2019-02-28
@OFark this is ruby. he does really good with dates. everybody wants to be like ruby.
a = Date.parse("2019/03/30") - 1.month
=> Thu, 28 Feb 2019
Date.parse("2020/02/29") - 1.year
=> Thu, 28 Feb 2019
+1 GROUP BY time(1Y) GROUP BY time(1M) GROUP BY time(1q) "quarterly"
Thanks!
5 years ago this was opened...
This is available in Flux now.
@s4ndst0rm The current implementation of Flux months is equivalent to 30d which is why its not behaving as expected. That said we are currently updating the implementation to use real calendar months. See influxdata/flux#2067 to follow our progress.
Hey @nathanielc
I saw that you merged your code to master last year. I've updated to InfluxDB 1.7.10 but the results are the same. Is your change not available in this version?
@s4ndst0rm I assume this is only in influxdb 2.x
Yes, it's available in InfluxDB 2.0 OSS Beta and InfluxDB Cloud. This will ALSO be in the 1.8 InfluxDB release. Testing is wrapping up now. Expect an update next week.
+1, hope it will be delivered soon with 1.8 GROUP BY time(1Y) GROUP BY time(1M)
+1, same thing here to see it be delivered in v1.8
For example:
Its useful for me when i want to query the data last month and year...