Closed bramtayl closed 5 years ago
This is an interesting one. I had left these off for now because I can see two good choices here:
1. `identity`
2. `propertynames`, like you have.
The thing with 1. is that it basically makes `innerjoin` behave a bit like `intersect` by default (at least for inputs with distinct values). I'm trying to imagine a use case!
The thing with 2. is that it's quite a lot less generic than the other functions here. It's more Tables.jl centric. But, in that particular use case, it is quite compelling. I could use `naturaljoin` and the operator `⨝` for this, so as not to affect `innerjoin`.
Thoughts?
What about a match function called `same`, which tests for equality only at shared `propertynames`?
Ok, so here's a start:
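import Base: diff_names  # internal Base helper: names in the first tuple that are not in the second
function intersect_names(data1, data2)
    data1_names = propertynames(data1)
    data2_names = propertynames(data2)
    # data1 names minus (data1 names minus data2 names) leaves the names common to both
    Names{diff_names(data1_names, diff_names(data1_names, data2_names))}()
end
# the default key is inferred from the first row of each input
default_key(rows1, rows2) = intersect_names(first(rows1), first(rows2))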
Ok, I added `in_common` back to LightQuery, and if it's useful it might make it into the next version. So you could add `innerjoin(left, right) = innerjoin(in_common(first(left), first(right)), left, right)`. Of course, the default would only work for non-empty iterators.
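To make the idea concrete, here is a minimal sketch in plain Base Julia, assuming the rows are NamedTuples. The names `shared_names`, `same`, and `default_names` are hypothetical and used only for illustration; they are not the LightQuery or SplitApplyCombine API. The `isempty` check spells out the non-empty caveat.

```julia
# Hypothetical helper: property names present in both rows, in the order of `a`.
shared_names(a, b) = Tuple(n for n in propertynames(a) if n in propertynames(b))

# Hypothetical match function: rows match when they agree on every shared name.
same(a, b) = all(isequal(getproperty(a, n), getproperty(b, n)) for n in shared_names(a, b))

# The default can only be inferred from a first row, so empty inputs are rejected.
function default_names(left, right)
    if isempty(left) || isempty(right)
        throw(ArgumentError("cannot infer shared names from an empty input"))
    end
    shared_names(first(left), first(right))
end

left = [(id = 1, x = "a"), (id = 2, x = "b")]
right = [(id = 2, y = 10.0), (id = 3, y = 20.0)]

default_names(left, right)   # (:id,)
same(left[2], right[1])      # true
same(left[1], right[1])      # false
```

Note that with this definition two rows that share no names at all always match, which may or may not be what you want from a default.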
PS the argument order of inner_join is a bit off. Why not pass all of these functions as keyword arguments instead?
It’s a good idea. The signature was prototyped in Julia v0.6 so the keyword arguments weren’t fast yet.
I also wonder if we have the same problem with too many positional function arguments in `mapreduce`...
Of course `group` and so on might also benefit from keyword arguments.