Open jpacik opened 6 years ago
Tasks could really use some of the other joins for retrieving task run logs. I think we can work around it in Go, but it would be a lot simpler if we could do it in straight Flux.
There is a discrepancy between the documentation and the actual product.
The documentation says:
method
The method used to join. Defaults to "inner". Data type: String Possible Values:
- inner
- cross
- left
- right
- full
But when using anything other than inner
, we get an error, such as
error calling function "join": left is not a valid join type
I think the documentation should be updated to reflect the fact that only inner
is currently supported.
Any chance this issue will get some attention in the near future? A left join would be very handy for my use case.
@legomind I'll try and get this prioritized for you. We currently have some other things in progress at the moment, but I'll let you know when we start work on this.
@legomind Could you share your use case? Seeing an example Flux script of what you are trying to do along with the data schema you are using would be very helpful. Even if the Flux script doesn't work, just seeing how you would expect it to would is great.
@jlapacik Thanks that would be awesome. @nathanielc My use case is pretty simple, but highly specific. I have two energy meters one at the circuit breakers and one attached to a dehumidifier. The circuit breaker energy monitor reports a measurement for most of the breakers as well as the total house use. It also reports the difference between all the monitored individual breakers (all combined) and the total usage, leaving me with the combined energy use of the remaining, unmonitored breakers (called 'other' in this example).
My second energy monitor is attached directly to the dehumidifier and the measurements are recorded to another db.
My goal is to take subtract the dehumidifiers energy use from the 'other' measurement. The following script works fine for me until the dehumidifier in turned off, then of course, there is nothing to match to, and the rest of the graph data is omitted.
I am very new influxdb and any suggestions or improvements will be welcome.
other_total = () => {
other = from(bucket: "iotawatt/autogen")
|> range($range)
|> filter(fn: (r) => r._measurement == "other_total" and r._field == "value")
|> aggregateWindow(every: $__interval, fn: mean)
|> keep(columns: ["_value", "_time"])
d = from(bucket: "home_assistant")
|> range($range)
|> filter(fn: (r) => r._measurement == "switch.living_room_dehumidifier" and r._field == "power_consumption")
|> aggregateWindow(every: $__interval, fn: mean)
|> keep(columns: ["_value", "_time"])
return join(tables: {o:other, d:d}, on: ["_time"])
|> map(fn: (r) => ({
_time: r._time,
_value: r._value_o - r._value_d,
_measurement: "Other", }))
}
other_total() |> yield(name: "Other")
@legomind so if the dehumidifier is turned off for a period of time, currently join
would not produce any rows for that time range. But you would like it to still produce rows for the the other
measurement, even if the dehumidifier is turned off? Am I understanding that correctly?
Correct.
I also got tripped up by this one trying to use left join with a very similar use case to @legomind - in my case I am trying to drop one series to 0 when another 'cutter' series is > 100, and am getting infrequent data from the 'cutter' series and would like to try with a left join.
// join(tables: {tt2:tt2, pt:pt}, method: "left") // error in evaluating AST while starting program: error calling function ""map"": error calling function ""join"": left is not a valid join type
join(tables: {tt2:tt2, cut:cut}, on: ["_time", "_stop", "_start"])
|> map(fn: (r) => ({
_time: r._time,
_value: if r._value_cut >= 100.0 then 0.0 else r._value_tt2
}))
NOTE: I am also fighting with combinations of
|> aggregateWindow(every: 1m, fn: mean, createEmpty: false)
vs
|> aggregateWindow(every: 1m, fn: mean, createEmpty: true)
and using |> fill(column: "_value", usePrevious: true)
to try and fill in the cutter
series so I can use the inner join but seems like that is not working as expected yet.
@metareason Here is to hoping this is addressed soon with the addition of left outer joins.
Just wondering if there was any movement on this. I am more than willing to test if needed.
We will be addressing join's deficiencies at some point in the future. In the meantime we have added a new join
function to the internal/promql
package with a type signature that more closely reflects what we want for the current join.
Note, promql.join
is a specialized inner join function that can also perform left and right semi joins as well. Here's the doc https://github.com/influxdata/flux/blob/master/stdlib/internal/promql/join.md.
Any chance for left
joins in the near future? This lack of functionality is basically showstopper for some of my usecases.Since when there is really no data in requested range, even fill
function can't help
I am also not able to use promql.join
the API says unknown import path: ""internal/promql""" "
I recently had a use case where there are two similar streams, each containing at least a URL as a key. Some of the URLs are in both streams and other are in only one.
I tried to join()
these and the resulting stream contained only the URLs that appeared in both streams, an inner join
.
I needed the list to contain all of the URLs from both streams, an outer join
.
Hi there, I've recently elaborated further on this topic concerning Joining in the Flux query language. In the article below you can find a tutorial on how to perform a (full) outer join in flux applied on a realistic use case accompanied with some side notes about the previous lack of a decent join implementation: https://dzone.com/articles/how-to-pivot-and-join-time-series-data-in-flux
This is for Flux (InfluxDB V2)
In the UI, documentation on join() function includes:
method:String
The method used to join. Possible values are: inner
, cross
, left
, right
, or full
. Defaults to "inner"
.
However when I set method to left, I receive an error that left is not a valid join type
"error calling function "join" @20:1-23:2: left is not a valid join type"
Seems this has been an issue for a couple years now. Will this ever be resolved? I really need to be able to perform a left join - part of my POC of using InfluxDB.
Hi @DLCrosby, Checkout my comment above referring to the article on Dzone. At the end of the article, you can find a solution on how to perform an outer join (which will give you a similar result as you are looking for when using a left join).
@lavensj I will look at that again, however the issue remains - Documentation shows other joins besides "inner", however only "inner" works. When will the product be fixed to support additional join types? Given this was first brought up years ago, it is supremely disappointing that the only answer is a complex workaround.
Hi, any plans to implement outer joins short termed? Using unions, sorts, limits, aso are very ressource consuming operations to get a result. Thanks in advance.
This would solve so many problems im having
4 years after and still not implemented?
Are there any plans to implement other types of joins? When will it be done? I'm looking very much forward to this feature.
@Loeni01 Work is happening on adding new join types right now. I don’t have an exact timeframe for when they’ll be available, but it will be soon. They will be a part of a new join
package. Here’s the Initial spec for the package: https://github.com/influxdata/flux/issues/4629
Thanks @sanderson for the update, you can follow our progress on this issue here where we have broken out the various steps https://github.com/influxdata/flux/issues/3459
Until the full join functionality is implemented, a good workaround is to use the combination of the union() and pivot() functions.
Add the following code to final output stream for any table to be used in the downstream union
operation:
table_1 = from(bucket: "some_bucket_1") |> range(start: -1d) |> map(fn: (r) => ({ join_column_name: r.join_column_name, _field: "choose_new_column_name_1", _value: column_value_1 })) |> group()
table_2 = from(bucket: "some_bucket_2") |> range(start: -1d) |> map(fn: (r) => ({ join_column_name: r.join_column_name, _field: "choose_new_column_name_2", _value: column_value_2 })) |> group()
union(tables: [table_1, table_2]) |> group() |> pivot(rowKey:["join_column_name"], columnKey: ["_field"], valueColumn: "_value") |> group()
Currently join implements the inner join operation. Join needs to support the following types of joins as well: