Open FelixBenning opened 1 year ago
ah right, the doctests need to be updated. That brings me to a question: the signature has suffered a bit. Should I implement a repr (is that how this works?) to make this cleaner, or do you prefer it to be explicit? I mean, IdxType, ToutIndices<:AbstractVector{IdxType} could maybe be shortened to just ToutIndices<:AbstractVector{IdxType}. Or leave it out completely; I am not sure about the pros and cons.
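If the goal is just a cleaner printed signature, overriding Base.summary would be the usual Julia mechanism rather than repr. A minimal sketch, borrowing the MOInputIsotopicByFeatures name from the examples below; this is an illustration, not part of the PR:

```julia
# Hypothetical: shorten the header line the REPL prints for the array,
# hiding the full type-parameter list while leaving element display as-is.
function Base.summary(io::IO, inp::MOInputIsotopicByFeatures)
    print(io, length(inp), "-element MOInputIsotopicByFeatures with eltype ", eltype(inp))
end
```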
Patch coverage: 100.00% and project coverage change: -3.69% :warning:
Comparison is base (ef6d459) 94.16% compared to head (12a1e05) 90.47%.
Funnily enough I managed to break the Transform tests in exactly the same way as #510, even though this PR has nothing to do with the other PR.
I can't immediately assess how useful this generalization is beyond this one specific use case (which could maybe just use Ints as well?).
Whenever the output index is not linear, you are asking people to convert from a complicated index to a linear index and then back. And since you only want to pass that index on to the (custom) kernel, I don't see a reason why you would want indices to be necessarily integers.
EDIT: Also, maybe you have a model with 4 output dimensions and are only interested in dimensions 1, 2, and 4, so you always want those 3 dimensions. At the moment you are out of luck. But with this implementation:
julia> KernelFunctions.MOInputIsotopicByFeatures(randn(3), [1,2,4])
9-element KernelFunctions.MOInputIsotopicByFeatures{Float64, Vector{Float64}, Int64, Vector{Int64}}:
(-1.4941865719611502, 1)
(-1.4941865719611502, 2)
(-1.4941865719611502, 4)
(0.11666968472865669, 1)
(0.11666968472865669, 2)
(0.11666968472865669, 4)
(-0.8766600791223175, 1)
(-0.8766600791223175, 2)
(-0.8766600791223175, 4)
and come on this is cool:
KernelFunctions.MOInputIsotopicByFeatures(rand(3), Partial.([1,2]))
6-element KernelFunctions.MOInputIsotopicByFeatures{Float64, Vector{Float64}, Partial{1}, Vector{Partial{1}}}:
(0.5167325185126124, ∂₁)
(0.5167325185126124, ∂₂)
(0.8288121708373749, ∂₁)
(0.8288121708373749, ∂₂)
(0.10051090052303036, ∂₁)
(0.10051090052303036, ∂₂)
Thinking about this more: what we are essentially implementing here is a lazy version of
collect(Iterators.product(x, 1:n))
but with random access, in contrast to Iterators.product. Of course Iterators.product also returns a matrix, but with vec() this would provide the ByFeatures version. Using vec(permutedims()) would result in the ByOutput variant.
I am thinking that the best idea might be to implement such a lazy random-access product. Lazy.Product?
using MappedArrays: mappedarray

# Check that every input supports linear indexing, i.e. getindex(::T, ::Int).
function ensure_all_linear_indexed(vecs::T) where {T<:Tuple}
    linear_indexed = ntuple(
        n -> hasmethod(Base.getindex, (fieldtype(T, n), Int)),
        Base._counttuple(T),
    )
    all(linear_indexed) || throw(ArgumentError(
        "$(vecs[findfirst(x -> !x, linear_indexed)]) cannot be linearly accessed. All inputs need to implement `Base.getindex(::T, ::Int)`",
    ))
end

# Lazy random-access cartesian product: a CartesianIndices grid is mapped
# back onto the input vectors, so nothing is materialized.
function product(vectors...)
    ensure_all_linear_indexed(vectors)
    indices = CartesianIndices(ntuple(n -> axes(vec(vectors[n]), 1), length(vectors)))
    return mappedarray(indices) do idx
        return ntuple(n -> vec(vectors[n])[idx[n]], length(vectors))
    end
end

# All outputs of one feature first (the output index varies fastest).
function multi_out_byFeatures(positions, out_dims)
    return vec(PermutedDimsArray(product(positions, 1:out_dims), (2, 1)))
end

# All features for one output first (the feature index varies fastest).
function multi_out_byOutput(positions, out_dims)
    return vec(product(positions, 1:out_dims))
end
Using this we get:
julia> multi_out_byFeatures(rand(3), 4)
12-element reshape(PermutedDimsArray(mappedarray(var"#18#21"{Tuple{Vector{Float64}, UnitRange{Int64}}}(([0.5128788077506103, 0.9930607101821788, 0.026060857585934016], 1:4)), ::CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}), (2, 1)), 12)
with eltype Tuple{Float64, Int64}:
(0.5128788077506103, 1)
(0.5128788077506103, 2)
(0.5128788077506103, 3)
(0.5128788077506103, 4)
(0.9930607101821788, 1)
(0.9930607101821788, 2)
(0.9930607101821788, 3)
(0.9930607101821788, 4)
(0.026060857585934016, 1)
(0.026060857585934016, 2)
(0.026060857585934016, 3)
(0.026060857585934016, 4)
julia> dump(ans)
Base.ReshapedArray{Tuple{Float64, Int64}, 1, PermutedDimsArray{Tuple{Float64, Int64}, 2, (2, 1), (2, 1), MappedArrays.ReadonlyMappedArray{Tuple{Float64, Int64}, 2, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#18#21"{Tuple{Vector{Float64}, UnitRange{Int64}}}}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}
parent: PermutedDimsArray{Tuple{Float64, Int64}, 2, (2, 1), (2, 1), MappedArrays.ReadonlyMappedArray{Tuple{Float64, Int64}, 2, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#18#21"{Tuple{Vector{Float64}, UnitRange{Int64}}}}}
parent: MappedArrays.ReadonlyMappedArray{Tuple{Float64, Int64}, 2, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, var"#18#21"{Tuple{Vector{Float64}, UnitRange{Int64}}}}
f: #18 (function of type var"#18#21"{Tuple{Vector{Float64}, UnitRange{Int64}}})
vectors: Tuple{Vector{Float64}, UnitRange{Int64}}
1: Array{Float64}((3,)) [0.5128788077506103, 0.9930607101821788, 0.026060857585934016]
2: UnitRange{Int64}
start: Int64 1
stop: Int64 4
data: CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}
indices: Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}
1: Base.OneTo{Int64}
stop: Int64 3
2: Base.OneTo{Int64}
stop: Int64 4
dims: Tuple{Int64}
1: Int64 12
mi: Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}
1: Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}
divisor: Int64 4
multiplier: Int64 -9223372036854775807
addmul: Int8 1
shift: UInt8 0x01
but product is much more general:
julia> vec(product(rand(3), 4:5, [:a,:b]))
12-element reshape(mappedarray(var"#18#21"{Tuple{Vector{Float64}, UnitRange{Int64}, Vector{Symbol}}}(([0.20611586943058768, 0.9207289691001196, 0.8953528265724665], 4:5, [:a, :b])), ::CartesianIndices{3, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}}), 12) with eltype Tuple{Float64, Int64, Symbol}:
(0.20611586943058768, 4, :a)
(0.9207289691001196, 4, :a)
(0.8953528265724665, 4, :a)
(0.20611586943058768, 5, :a)
(0.9207289691001196, 5, :a)
(0.8953528265724665, 5, :a)
(0.20611586943058768, 4, :b)
(0.9207289691001196, 4, :b)
(0.8953528265724665, 4, :b)
(0.20611586943058768, 5, :b)
(0.9207289691001196, 5, :b)
(0.8953528265724665, 5, :b)
what do you think about this approach?
I really like the idea in this PR. Questions that arise for me are:
How does this compare in terms of access (compute time) cost when iterating through this, and in storage size, to the current MOInputs, which are very efficient?
From the snippet above with mappedarrays it seems you generate the entire 3x4 vector? This could also be implemented lazily, right?
As a dependency, MappedArrays seems very solid and minimal.
From the snippet above with mappedarrays it seems you generate the entire 3x4 vector? This could also be implemented lazily, right?
As a dependency, MappedArrays seems very solid and minimal.
No, MappedArrays are lazy, i.e. they only apply the function to the underlying array once you access it. The fundamental idea here (which is not from me, btw) is to use the fact that CartesianIndices is already such a lazy array:
julia> CartesianIndices((1:3, 2:5))
CartesianIndices((1:3, 2:5))
julia> CartesianIndices((1:3, 2:5)) |> collect
3×4 Matrix{CartesianIndex{2}}:
CartesianIndex(1, 2) CartesianIndex(1, 3) CartesianIndex(1, 4) CartesianIndex(1, 5)
CartesianIndex(2, 2) CartesianIndex(2, 3) CartesianIndex(2, 4) CartesianIndex(2, 5)
CartesianIndex(3, 2) CartesianIndex(3, 3) CartesianIndex(3, 4) CartesianIndex(3, 5)
only that CartesianIndices((rand(3), 2:5)) is not allowed. So the product function extracts the indices from the vector-like objects, stores them in a CartesianIndices object, and uses mapped arrays to (lazily) map the indices back to the vectors.
The dump shows you the storage requirements. At the moment MOInputs would only store the vectors part. MappedArrays additionally stores the indices (most of the time they will be OneTo(n), so they only cost you an integer per dimension, i.e. in the special case of MOInputs, 2 integers). Base.PermutedDimsArray has no further storage requirements, as it uses its type parameters to store the permutation. Finally, Base.ReshapedArray stores the dimension (one integer) and a SignedMultiplicativeInverse object (don't ask me why, I have no idea).
So in essence this adds an O(1) storage cost of 3 integers plus the SignedMultiplicativeInverse object, which is going to be negligible in applications. At the same time you reuse existing code, mostly Base code (except for MappedArrays), and gain a lot of flexibility.
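If one wants to sanity-check that overhead claim, Base.summarysize can be used (a sketch reusing the multi_out_byFeatures draft above; exact byte counts depend on the Julia version):

```julia
x = rand(1000)
lazy = multi_out_byFeatures(x, 4)  # wraps x, stores no product entries

# The difference is the constant wrapper overhead (index ranges, dims, the
# multiplicative-inverse helper), independent of length(x) and out_dims.
Base.summarysize(lazy) - Base.summarysize(x)
```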
I assume multi-element inputs work seamlessly as well? @Crown421
yup
julia> v
4-element ColVecs{Float64, Matrix{Float64}, SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}:
[0.19579198021872568, 0.9198508840591844, 0.23464204616834705]
[0.8852398907666711, 0.8745129006689075, 0.009698185376639912]
[0.638932589563104, 0.0116582676328858, 0.20495759713303108]
[0.853294669961563, 0.9731349258823672, 0.8970732821094968]
julia> multi_out_byOutput(v, 4)
16-element reshape(mappedarray(var"#6#9"{Tuple{ColVecs{Float64, Matrix{Float64}, SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}, UnitRange{Int64}}}((SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}[[0.19579198021872568, 0.9198508840591844, 0.23464204616834705], [0.8852398907666711, 0.8745129006689075, 0.009698185376639912], [0.638932589563104, 0.0116582676328858, 0.20495759713303108], [0.853294669961563, 0.9731349258823672, 0.8970732821094968]], 1:4)), ::CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}), 16) with eltype Tuple{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, Int64}:
([0.19579198021872568, 0.9198508840591844, 0.23464204616834705], 1)
([0.8852398907666711, 0.8745129006689075, 0.009698185376639912], 1)
([0.638932589563104, 0.0116582676328858, 0.20495759713303108], 1)
([0.853294669961563, 0.9731349258823672, 0.8970732821094968], 1)
([0.19579198021872568, 0.9198508840591844, 0.23464204616834705], 2)
([0.8852398907666711, 0.8745129006689075, 0.009698185376639912], 2)
([0.638932589563104, 0.0116582676328858, 0.20495759713303108], 2)
([0.853294669961563, 0.9731349258823672, 0.8970732821094968], 2)
([0.19579198021872568, 0.9198508840591844, 0.23464204616834705], 3)
([0.8852398907666711, 0.8745129006689075, 0.009698185376639912], 3)
([0.638932589563104, 0.0116582676328858, 0.20495759713303108], 3)
([0.853294669961563, 0.9731349258823672, 0.8970732821094968], 3)
([0.19579198021872568, 0.9198508840591844, 0.23464204616834705], 4)
([0.8852398907666711, 0.8745129006689075, 0.009698185376639912], 4)
([0.638932589563104, 0.0116582676328858, 0.20495759713303108], 4)
([0.853294669961563, 0.9731349258823672, 0.8970732821094968], 4)
How does this compare in terms of access (compute time) cost when iterating through this, and in storage size, to the current MOInputs, which are very efficient? @Crown421
Since I already addressed storage size (an O(1) increase of less than 6*64 bit), now to the access performance: MOInput is implemented using AbstractArrays and getindex; since there aren't any bells and whistles with manually coded broadcasting, comparing plain getindex is the worst case (the types we use might implement these bells and whistles).
So whenever MOInput calls getindex, the index needs to be converted into a cartesian index first (you do that with custom modulo code). Since we have a reshaped MappedArray of CartesianIndices, the same happens here when we find the right entry of the CartesianIndices. Then you take the (two) parts of your cartesian index and look them up:
feature = @inbounds inp.x[feature_index]
return feature, output_index
but since output_index would be in 1:out_dim, you do not need to look that one up. The same happens here:
return ntuple(n -> vec(vectors[n])[idx[n]], length(vectors))
inside of product. ntuple is specialized for the first 10 values of n, so since length(vectors) == 2, this is essentially reduced by the compiler to
return (vec(vectors[1])[idx[1]], vec(vectors[2])[idx[2]])
Additionally, vec will be a no-op in the cases we consider; idx[1] represents the feature_index and idx[2] the output_index.
So essentially the same happens, with the caveat that you need to look up your output axis if it can be more general than 1:out_dim. Similarly, you need to look up your axes if the vectors you are allowed to pass do not have to be 1-based. Except for these lookups (which will be in the cache if you call getindex repeatedly), the code is the same.
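To make the comparison concrete, the modulo-style conversion that an MOInput-like getindex performs can be sketched like this (a hypothetical helper, not the actual KernelFunctions code):

```julia
# ByFeatures ordering: all out_dim outputs of feature 1 come first, then
# feature 2, etc., so fldmod1 splits the linear index into both parts.
function getindex_byfeatures(x::AbstractVector, out_dim::Int, ind::Int)
    feature_index, output_index = fldmod1(ind, out_dim)
    # output_index is already a value in 1:out_dim, so no second lookup is needed
    return x[feature_index], output_index
end
```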
Continuing to go through this and thinking about the original motivation for this: How would this combine derivatives (encoded by symbols corresponding to differentiating by the n-th component of the first/second argument) and multi-outputs (currently encoded with integers)? How would I encode that I want, for example, $\partial_1$ for the first output?
Continuing to go through this and thinking about the original motivation for this: How would this combine derivatives (encoded by symbols corresponding to differentiating by the n-th component of the first/second argument) and multi-outputs (currently encoded with integers)? How would I encode that I want for example ∂1 for the first output?
I would encode it as
julia> vec(product(v, 1:out_dim, Partial.([1,2])))
i.e. a third coordinate:
([0.19579198021872568, 0.9198508840591844, 0.23464204616834705], 1, ∂1)
([0.8852398907666711, 0.8745129006689075, 0.009698185376639912], 1, ∂1)
([0.638932589563104, 0.0116582676328858, 0.20495759713303108], 1, ∂1)
([0.853294669961563, 0.9731349258823672, 0.8970732821094968], 1, ∂1)
([0.19579198021872568, 0.9198508840591844, 0.23464204616834705], 2, ∂1)
([0.8852398907666711, 0.8745129006689075, 0.009698185376639912], 2, ∂1)
([0.638932589563104, 0.0116582676328858, 0.20495759713303108], 2, ∂1)
⋮
([0.8852398907666711, 0.8745129006689075, 0.009698185376639912], 1, ∂2)
([0.638932589563104, 0.0116582676328858, 0.20495759713303108], 1, ∂2)
([0.853294669961563, 0.9731349258823672, 0.8970732821094968], 1, ∂2)
([0.19579198021872568, 0.9198508840591844, 0.23464204616834705], 2, ∂2)
([0.8852398907666711, 0.8745129006689075, 0.009698185376639912], 2, ∂2)
([0.638932589563104, 0.0116582676328858, 0.20495759713303108], 2, ∂2)
([0.853294669961563, 0.9731349258823672, 0.8970732821094968], 2, ∂2)
EDIT: the combination of outputs and Partial derivatives was actually the motivation to think beyond two coordinates and realizing that this is related to Iterators.product
Since I am still in the prototyping phase, ideas change quite a lot, so to avoid spamming this repository with PRs I created a new repository for prototyping: https://github.com/FelixBenning/DifferentiableKernelFunctions.jl
I would still be interested in integrating this into KernelFunctions.jl eventually, but given the unenthusiastic response to all suggestions in this direction here, it might also just stay that way. @Crown421 I sent you an invite as a collaborator since you seem interested.
I think a generalization of these indices is fine as long as the code complexity does not increase too much (maybe it becomes even simpler?) and performance + AD regressions are avoided. We should also not introduce (new) type piracy and should avoid too generic names (KernelFunctions should not own something as generic as product; if the functionality is so generic, it might better be included in a different, more lightweight package, so it can be reused more easily).
(I also assume that the draft above could (should?) be improved, it contains a few non-idiomatic parts.)
Apologies, I am definitely quite interested. I will check out your repo.
I think a generalization of these indices is fine as long as the code complexity does not increase too much (maybe it becomes even simpler?) and performance + AD regressions are avoided. We should also not introduce (new) type piracy and avoid too generic names (KernelFunctions should not own something as generic as product; if the functionality is so generic, it might better be included in a different, more lightweight package, so it can be reused more easily). (I also assume that the draft above could (should?) be improved, it contains a few non-idiomatic parts.)
Since mappedarrays caused issues too at some point, I created
https://github.com/lazyLibraries/ProductArrays.jl
With this you can do:
julia> A = productArray(1:3, (:a,:b))
3×2 ProductArrays.ProductArray{Tuple{UnitRange{Int64}, Tuple{Symbol, Symbol}}, Tuple{Int64, Symbol}, 2}:
(1, :a) (1, :b)
(2, :a) (2, :b)
(3, :a) (3, :b)
julia> vec(A)
6-element reshape(::ProductArrays.ProductArray{Tuple{UnitRange{Int64}, Tuple{Symbol, Symbol}}, Tuple{Int64, Symbol}, 2}, 6) with eltype Tuple{Int64, Symbol}:
(1, :a)
(2, :a)
(3, :a)
(1, :b)
(2, :b)
(3, :b)
With an application of PermutedDimsArray this does everything that the current MOInput helpers do. You could still write a wrapper to avoid this chain of three commands, but it will be much shorter.
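Such a wrapper could look like this (a sketch with assumed names, using the productArray function shown above):

```julia
using ProductArrays: productArray

# ByOutput: the feature index varies fastest, which is productArray's
# natural column-major order, so vec alone suffices.
mo_by_output(x, out_dim) = vec(productArray(x, 1:out_dim))

# ByFeatures: the output index has to vary fastest, so lazily permute the
# two dimensions before flattening.
mo_by_features(x, out_dim) = vec(PermutedDimsArray(productArray(x, 1:out_dim), (2, 1)))
```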
The package should be registered soon (https://github.com/JuliaRegistries/General/pull/84683). I'll also try to add more tests using OffsetArrays and more weird types, but the standard cases are covered.
Apologies, I am definitely quite interested. I will check out your repo. @Crown421
my comment was not at all directed at you and I was probably unfair to @devmotion too. Thank you for your time. I was a bit emotional about my pet project, sorry.
Summary
Enable general output indices, e.g. symbols:
The use case for this is going to be derivative operators [:none, $\partial_1$, ..., $\partial_2$, $\partial_{11}$, ...], cf. #508
Proposed changes
Replace
by
To ensure backwards compatibility, introduce a constructor taking indices.
The use of Base.OneTo ensures equivalent performance.
Breaking changes
None
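A sketch of what the generalized type with such a backwards-compatible constructor could look like (field and type-parameter names follow the REPL output above, but this is an illustration, not the actual diff):

```julia
# Generalized MO input: out_indices can be any AbstractVector of index-like
# objects (integers, symbols, Partial derivatives, ...).
struct MOInputIsotopicByFeatures{S,T<:AbstractVector{S},Tout,ToutV<:AbstractVector{Tout}} <: AbstractVector{Tuple{S,Tout}}
    x::T
    out_indices::ToutV
end

# Backwards compatibility: an integer out_dim becomes Base.OneTo(out_dim),
# which stores a single integer and indexes as cheaply as the old field.
function MOInputIsotopicByFeatures(x::AbstractVector, out_dim::Integer)
    return MOInputIsotopicByFeatures(x, Base.OneTo(out_dim))
end

Base.size(inp::MOInputIsotopicByFeatures) = (length(inp.x) * length(inp.out_indices),)

function Base.getindex(inp::MOInputIsotopicByFeatures, ind::Integer)
    @boundscheck checkbounds(inp, ind)
    # ByFeatures ordering: all output indices of feature 1 first, etc.
    feature_index, output_index = fldmod1(ind, length(inp.out_indices))
    return inp.x[feature_index], inp.out_indices[output_index]
end
```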