emmt / LazyAlgebra.jl

A Julia package to extend the notion of vectors and matrices

Type `Floats` cannot support Unitful vectors #4

Open JeffFessler opened 2 years ago

JeffFessler commented 2 years ago

The `Floats` type used here is a union of `AbstractFloat` and `Complex` and is a very sensible type for the goals of this package. Unfortunately, that union is not quite general enough to include Unitful vectors, because such vectors' elements are subtypes of `Number` but not of `AbstractFloat`. So simple methods like `vscale!` do not apply as currently written. MWE:

using LazyAlgebra
using Unitful: m
x = ones(5) * 1m
vscale!(x, 2)

ERROR: MethodError: no method matching vscale!(::Vector{Quantity{Float64, 𝐋, Unitful.FreeUnits{(m,), 𝐋, nothing}}}, ::Int64, ::Vector{Quantity{Float64, 𝐋, Unitful.FreeUnits{(m,), 𝐋, nothing}}})

It probably would be a major undertaking to support Unitful vectors here, so I am not really advocating for that. But I am just reporting the issue here because it is relevant to another PR elsewhere:

https://github.com/emmt/LinearInterpolators.jl/pull/9

emmt commented 2 years ago

I'd be very happy to enlarge the possible types for the "values" manipulated in LazyAlgebra (and thus in related packages). Perhaps we can elaborate on the rationale of this extension. LazyAlgebra was mostly designed for encoding and solving inverse problems, so there is probably some bias in the points listed below. If we consider values with units, the following notions are central to LazyAlgebra:

For efficiency reasons (large scale problems), value types should be homogeneous (the same concrete type for all values in a given variable or mapping). Values may be real/complex, but Unitful values are just `Number`s regardless of the numerical type of the value, so we could have unions like the ones below to cope with that (I still have to check whether this could work, but you get the idea):

using Unitful: AbstractQuantity  # AbstractQuantity comes from Unitful
const LazyReals = Union{AbstractFloat,AbstractQuantity{<:AbstractFloat}}
const LazyComplexes = Complex{<:LazyReals}

Maybe something like that is sufficient (after all, Unitful implements assertions to make sure one does not add "apples" to "bananas"). My fear is that I do not see an easy way to cope with unit multipliers (kilo-, micro-, etc.); indeed:

using Unitful
x = 1.0u"kg"
y = 1.0u"g"
x.val  # yields 1.0
y.val  # yields 1.0
(x/y).val # yields 1.0
uconvert(Unitful.NoUnits, x/y) # correctly yields 1000.0
uconvert(Unitful.NoUnits, y/x) # correctly yields 0.001

which (for me) means that we may have to explicitly call uconvert. But maybe I am wrong; for example:

H = alpha*A + beta*B

with A and B two mappings, is consistent even though A and B may have different output units, provided the multipliers alpha and beta have the correct units. This is only possible if units are homogeneous in arrays.
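A quick check of that claim (a sketch assuming the Unitful package; the arrays and multipliers below are made up for illustration): if alpha and beta carry compensating units, the sum of the two terms is well defined even though A and B produce outputs with different units:

```julia
using Unitful

# Hypothetical outputs of two mappings A and B, with different units:
Ax = [1.0, 2.0] .* u"m"    # output of A, in meters
Bx = [3.0, 4.0] .* u"kg"   # output of B, in kilograms

# Multipliers with compensating units so that both terms are in seconds:
alpha = 2.0u"s/m"
beta  = 0.5u"s/kg"

Hx = alpha .* Ax .+ beta .* Bx  # homogeneous result, in seconds
```

Unitful does the dimensional bookkeeping here; trying to add the two terms with mismatched multiplier units would raise a DimensionError instead.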

So perhaps we can just give a try to allow for values with units. Methods promote_multiplier, etc. will have to be adapted.

JeffFessler commented 2 years ago

Thanks for the detailed notes. I too am interested in inverse problems and I agree with nearly everything you wrote above for the same reasons. (The only exception is that in MRI work I truly use complex numbers rather than thinking of them as pairs of reals and I use the usual complex-valued inner product most of the time. But that is a minor point here.) I especially agree that all values in an array should have the same eltype including both units and precision!

I have never used x.val for Unitful variables - I didn't even know it exists. When I work with such variables I go "all in" with units and keep them all the way through, even to the point of plotting, thanks to UnitfulRecipes. See for example this sinogram with units for both the axes and the values: https://juliaimagerecon.github.io/ImagePhantoms.jl/stable/examples/2-ellipse/#Radon-transform-using-radon Now I suppose that to write the final output to a file, or to pass it to Python or Matlab, one would likely need to use uconvert, but I would hope that uconvert calls would not be needed internally in LazyAlgebra.

The Union approach you describe is where I started with some of my packages, but it has the drawback of adding another package dependency on Unitful. If that extra overhead does not worry you then certainly it is fine with me. I also don't know if Unitful will reign indefinitely as the go-to Units package... What I've done instead mostly is a kind of "duck typing" where I define const RealU = Number and then I type a lot of function arguments like Array{<:RealU} to serve as a reminder to myself (and users) that the code is expecting either an array of Real numbers or an array of real numbers with units, without actually enforcing that requirement. Users will get errors if they pass arrays of, say, complex numbers to many such routines. Anyway, totally up to you and I'm happy to help if you want.
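That duck-typing style can be sketched as follows (the names RealU and vnorm2 are illustrative here, not actual LazyAlgebra API):

```julia
# "Duck typing": RealU documents the intent (real numbers, possibly with
# units) without restricting the element type beyond Number.
const RealU = Number

# Works for plain reals and, when Unitful is loaded, for unitful arrays
# too; unsuitable element types fail naturally inside the method.
vnorm2(x::AbstractArray{<:RealU}) = sqrt(sum(abs2, x))
```

The annotation serves mainly as documentation for readers; dispatch only rejects arguments that are not arrays of Numbers, and everything else is left to generic code.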

emmt commented 2 years ago

You are right; it is only in optimization code that complex values should be considered as pairs of floats. The inner product (vdot in LazyAlgebra) should thus be implemented differently in LazyAlgebra, which is designed for direct modeling, and in, say, OptimPackNextGen, which is dedicated to numerical optimization of large scale problems.

I really like your duck-typing approach; type assertions should be relaxed to exploit one of the powers of Julia: that the same code can serve many different argument types (I was probably too influenced by my experience in C coding). I agree that it makes sense to let Julia complain when argument types are incompatible.

Below is a demonstration that relaxing types to any Number (including Unitful ones) as you suggested is easily doable:

function vdot1(x::AbstractArray{Tx,N}, y::AbstractArray{Ty,N}) where {Tx,Ty,N}
    s = zero(promote_type(Tx, Ty)) # fails for Unitful element types
    @inbounds @simd for i in eachindex(x, y); s += x[i]*y[i]; end
    return s
end

function vdot2(x::AbstractArray{Tx,N}, y::AbstractArray{Ty,N}) where {Tx,Ty,N}
    s = zero(typeof(oneunit(Tx)*oneunit(Ty))) # there are probably simpler expressions that yield this
    @inbounds @simd for i in eachindex(x, y); s += x[i]*y[i]; end
    return s
end

using Unitful
T, n = Float32, 5_000
x = rand(T, n) .* 1u"kg"
y = rand(T, n) .* 1u"m"
vdot1(x, y) # throws "ERROR: ArgumentError: zero(Quantity{Float32, D, U} where {D, U}) not defined."
vdot2(x, y) # yields a correct result in "kg⋅m"

The only thing to adapt is the initialization of the result (here the accumulation variable s) to the correct type. The promote_type method fails here, but the replacement is easy to write. Besides, the vdot2 code does not even know about the Unitful package, which avoids the dependency issue that you mentioned.

I benchmarked that the @simd directive was really effective here: 306 ns for 5,000 entries, which is a pretty nice 32.7 Gflops. This means that my "type enforcing" policy is not needed to get highly optimized code (I also checked that vdot2 works for non-standard abstract arrays like view(x, 1:3:n) without complaining).

The only remaining issue I can anticipate is the case of multipliers. To not penalize optimizations, these factors are supposed to not enforce their precision but instead follow that of the quantities they multiply. We have to design a new implementation of promote_multiplier that works with an extended range of scalar types. Using Requires, the method can be automatically extended for, say, Unitful types when that package is loaded, so this could be done without depending on Unitful or an equivalent (just Requires). Checks that multipliers have special values (0, ±1, etc.) have to be modified, but I guess that this should be as simple as the replacement for promote_type above. For example, α == 0 and α == 1 should be replaced by α == zero(α) and α == oneunit(α). Other impacted methods are probably vcreate and output_type. But that may not be a big deal...
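Those special-value checks can be sketched in plain Julia; since they rely only on zero and oneunit, the same code would also work for Unitful quantities without depending on Unitful (the helper names here are made up for illustration):

```julia
# Unit-safe checks for special multiplier values: comparing against
# zero(α) and oneunit(α) stays within α's own type (and units), unlike
# the literal comparisons α == 0 and α == 1.
is_zero_multiplier(α) = α == zero(α)
is_one_multiplier(α)  = α == oneunit(α)
```

For a unitful α, zero(α) and oneunit(α) carry α's units, so the comparison is dimensionally consistent by construction.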

JeffFessler commented 2 years ago

All looks good!