Open goretkin opened 1 year ago
I think we can do this for Iterators.map
before 2.0?
That search link no longer works: here is an updated link: https://juliahub.com/ui/Search?q=map%5C%28%5B%5E,%5D%2a,%5Cs%2a%5C%28.%2afor.%2ain&type=code&r=true
An example true positive of the search: https://github.com/zgornel/EmbeddingsAnalysis.jl/blob/e045d5c9169a29a68d66aad15297348de1863e8e/test/vocab_reduction.jl#L5
The current behavior (illustrated below) shows why changing the behavior of map
on Generator
to not also collect
can be problematic:
julia> gen1 = (fill(i, (3, )) for i = 1:4)
Base.Generator{UnitRange{Int64}, var"#21#22"}(var"#21#22"(), 1:4)
julia> gen2 = (sum(vec) for vec in gen1)
Base.Generator{Base.Generator{UnitRange{Int64}, var"#21#22"}, var"#23#24"}(var"#23#24"(), Base.Generator{UnitRange{Int64}, var"#21#22"}(var"#21#22"(), 1:4))
julia> arr2 = map(sum, gen1)
4-element Vector{Int64}:
3
6
9
12
julia> @show typeof(arr2) typeof(gen2)
typeof(arr2) = Vector{Int64}
typeof(gen2) = Base.Generator{Base.Generator{UnitRange{Int64}, var"#21#22"}, var"#23#24"}
Base.Generator{Base.Generator{UnitRange{Int64}, var"#21#22"}, var"#23#24"}
julia> eltype(arr2)
Int64
julia> eltype(gen2)
Any
An example true positive of the search:
It seems like that specific example could be fixed by introducing a method of permutedims
that takes Generator
s? After that, the ./=
operator works with generators.
I wonder if there's an efficient way to ask the compiler what the return type of a function will be, given certain argument types? If so, then eltype
could be made to work on generators.
If those two issues could be addressed, then is there any code that this fix would break?
@uniment said
I wonder if there's an efficient way to ask the compiler what the return type of a function will be, given certain argument types?
Core.Compiler.return_type
is undocumented and not totally reliable, but that's more or less what it does.
Seems to work:
julia> Base.eltype(::Type{Base.Generator{I,F}}) where {I,F} =
Core.Compiler.return_type(Tuple{F,eltype(I)})
julia> @btime eltype(i/j for i=1:4, j=1:4)
1.000 ns (0 allocations: 0 bytes)
Float64
julia> @btime eltype(ch+1 for ch='a':'z')
1.000 ns (0 allocations: 0 bytes)
Char
There's a discrepancy though, which can be fixed by special-casing Tuple
:
julia> eltype(x+1 for x=(2,2.0))
Any
julia> Base.eltype(::Type{Base.Generator{T,F}}) where {T<:Tuple,F} =
typejoin(map(P->Core.Compiler.return_type(Tuple{F,P}),Tuple(T.parameters))...)
julia> @btime eltype(x+1 for x=(2,2.0))
1.000 ns (0 allocations: 0 bytes)
Real
(I had to convert T.parameters
in a Tuple
because calling map
on a Core.svec
currently collect
s into a Vector
.)
Dunno if a specialization would be helpful or not:
julia> Base.eltype(::Type{Base.Generator{T,F}}) where {T<:NTuple{N,E},F} where {N,E} =
Core.Compiler.return_type(Tuple{F,E})
Making adjoint
work with generators:
julia> Base.reverse(itr::Iterators.ProductIterator) = Iterators.ProductIterator(reverse(itr.iterators))
julia> Base.adjoint(g::Base.Generator{I,F}) where {I<:Iterators.ProductIterator{Tuple{Any,Any}},F} =
Base.Generator(adjoint ∘ g.f ∘ reverse, reverse(g.iter))
julia> Base.adjoint(g::Base.Generator) = # for vectors
Base.Generator(adjoint ∘ g.f ∘ (((_,x),)->x), Iterators.ProductIterator((1:1,g.iter)))
Test:
julia> collect((exp(i*im) for i=1:3)') == collect((exp(i*im) for i=1:3))'
true
julia> collect((exp(i+j*im) for i=1:3, j=1:4)') == collect((exp(i+j*im) for i=1:3, j=1:4))'
true
Making permutedims
work with generators:
julia> Base.permutedims(g::Base.Generator{<:Iterators.ProductIterator}, perm) =
Base.Generator(g.f ∘ (args -> map(p -> args[p], perm)), Iterators.ProductIterator(map(p -> g.iter.iterators[p], perm)))
julia> Base.permutedims(g::Base.Generator{<:Iterators.ProductIterator}) = permutedims(g, (2, 1))
julia> Base.permutedims(g::Base.Generator) = permutedims(Base.Generator(g.f ∘ ((x,_),)->x, Iterators.ProductIterator((g.iter, 1:1))))
Test:
julia> collect(permutedims(x for x=1:3)) == permutedims(collect(x for x=1:3))
true
julia> collect(permutedims(i+j for i=1:3, j=1:4)) == permutedims(collect(i+j for i=1:3, j=1:4))
true
julia> collect(permutedims((i+j+k for i=1:3, j=1:4, k=1:5), (3,1,2))) == permutedims(collect(i+j+k for i=1:3, j=1:4, k=1:5), (3,1,2))
true
An example true positive of the search:
It seems like that specific example could be fixed by introducing a method of
permutedims
that takesGenerator
s? After that, the./=
operator works with generators.I wonder if there's an efficient way to ask the compiler what the return type of a function will be, given certain argument types? If so, then
eltype
could be made to work on generators.If those two issues could be addressed, then is there any code that this fix would break?
This has been discussed before, but I'll try to summarize here. There is a considerable extent to which the functional/behavioral semantics of Julia code should not depend on type inference, which exists almost exclusively in service of generating efficient code.
There has been a deliberate choice not to rely on Core.Compiler.return_type
in many situations that are tempting to use it. I believe map
does use it, however.
julia> Core.Compiler.return_type(sin, Tuple{Int64})
Float64
julia> Core.Compiler.return_type(sin, Tuple{Int64})
Float64
julia> sum(Vector{Float64}())
0.0
julia> map(sin, 1:0)
Float64[]
julia> sum(Base.Generator(sin, 1:0))
ERROR: MethodError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer
[...]
julia> sum(Base.Generator(identity, 1:0))
0
collect
is actually fairly complicated, and I think going down the approach of aligning the interface to Base.Generator
to look like the interface to the return value of collect
(e.g. Vector{Any}
) is more involved than it may appear.
There has been a deliberate choice not to rely on
Core.Compiler.return_type
in many situations that are tempting to use it. I believemap
does use it, however.
An argument can be made that Generator
s are exceptional enough to be granted special consideration. In #34674 it is said,
generators are said to be the lazy variant of
map
this is consistent with that view.
I think going down the approach of aligning the interface to
Base.Generator
to look like the interface to the return value ofcollect
(e.g.Vector{Any}
) is more involved than it may appear.
"The journey of a thousand miles begins with one step." - Lao Tzu
When I originally opened https://github.com/JuliaLang/julia/issues/39628 , I realized I should have made two issues, and didn't This is the non-doc portion of that issue.
Where possible, it is desirable to have properties such as
See https://github.com/JuliaLang/julia/pull/38150#issuecomment-726304995 . As that thread indicates, that property doesn't currently hold, because
map
is conflated withcollect
. To relate the behavior ofmap
andcollect
, I expect the following propertyOne example where the property doesn't hold is with
Generator
s:That property could hold with the following definitions:
This will break code that was using
map
for itscollect
behavior. Some examples (with some false positives from splatting arguments from a generator intomap
) can be found with the regexmap\([^,]*,\s*\(.*for.*in
.@jakobnissen recently made similar comments on Slack
(For v2.0)