SymbolicML / DynamicQuantities.jl

Efficient and type-stable physical quantities in Julia
https://ai.damtp.cam.ac.uk/dynamicquantities
Apache License 2.0
133 stars 19 forks source link

Add a scale factor dimension? #10

Closed odow closed 1 year ago

odow commented 1 year ago

We're looking into adding this to JuMP: https://github.com/trulsf/UnitJuMP.jl/issues/18

One kicker is that it'd be nice to be able to talk about micro/milli/kilo/mega etc.

Have you thought about adding a scale::Int dimension? Hopefully it'd support something like:

km = Dimension(length = 1, scale = 3)
us = Dimension(time = -1, scale = -6)
km / us == Dimension(length = 1, time = -2, scale = 9)
j-fu commented 1 year ago

I think one could use Unitful as a comprehensive database for defining units and provide an API like:

@dynquantity km us 

etc. I set up something like this in LessUnitful.jl

j-fu commented 1 year ago

After thinking more about your proposal - may be this problem can be solved differently. After seeing some discussions around the pacakge, I think for performance reasons, is critical to keep the size of a quantity small. So the way forward may be to define the basic units m, kg, s etc. as constants and define km=1000*m etc.

MilesCranmer commented 1 year ago

Thanks for the interest @odow, I'm happy to hear you are considering using it!

Just to help me understand the feature request, why couldn't you implicitly include the scale parameter inside the value of the quantity? Since the conversion from Unitful to DynamicQuantities uses upreferred, it will already propagate scales from units (milli, kilo, etc) into the value. For example:

julia> x_km = 1.5u"km"
1.5 km

julia> x = 1.5u"m"
1.5 m

julia> x_km_dyn = convert(DynamicQuantities.Quantity, x_km)
1500.0 𝐋 ¹

julia> x_dyn = convert(DynamicQuantities.Quantity, x)
1.5 𝐋 ¹
MilesCranmer commented 1 year ago

One other tricky thing about having a separate scale factor is that it is technically redundant information. You could have two quantities, say

x = Quantity(0.2, length=1, scale=2)
y = Quantity(0.02, length=1, scale=3)

and checking equality is no longer comparing x.val == y.val, but it instead would require x.val * 10^(x.scale) == y.val*10^(y.scale).

If you have a Quantity{Float64}, then the scale factor would be like if you had extra bits for the exponent. But why not just use Float128 instead? Or define another type that includes more bits for scaling?

odow commented 1 year ago

Since the conversion from Unitful to DynamicQuantities

I don't want to use Unitful. I want DynamicQuantities to be a complete replacement.

Our problem is a JuMP user writing something like (hypothetical syntax):

using JuMP, DynamicQuantities
m = Dimension(length = 1)
model = Model()
@variable(model, x, unit = m)
# User meant to type distance = 1000 * m, but actually types:
distance = 100 * m
@constraint(model, x >= distance)  # No error. Unit typo not caught

I'd like instead:

using JuMP, DynamicQuantities
m = Dimension(length = 1)
km = Dimension(length = 1, scale = 3)
model = Model()
@variable(model, x, unit = m)
distance = 1km
@constraint(model, x >= distance)  # Units don't match

as constants and define km=1000*m etc.

This still permits typos. We want to check that the scale of each variable is the same at the type level.

MilesCranmer commented 1 year ago

The key thing to note is that m is not a dimension, but rather a quantity. It should actually be defined like:

m = Quantity(1.0, length = 1)
km = 1e3 * m

But you could similarly define these with any unit system, so long as you are consistent that "kilo" is 1000x the base unit for each dimension, and so forth.

odow commented 1 year ago

and checking equality is no longer comparing x.val == y.val, but it instead would require x.val 10^(x.scale) == y.val10^(y.scale).

Correct. I guess you could have some normalization thing that always ensured the numeric value of the quantity was -10 < q < 10.

Or define another type that includes more bits for scaling?

I guess this is an open design question. It's useful to explore the space. There probably needs to be a few different attempts to see what works and what doesn't.

MilesCranmer commented 1 year ago

This still permits typos. We want to check that the scale of each variable is the same at the type level.

I think this might not be a problem since >= is not defined for the Dimension type. So this would immediately result in a type error. I think it makes sense from a physics perspective to only define >= on Quantity, because dimensions are like a coordinate system, and quantities describe the length of a vector in that coordinate system.

And, >= would trigger the result to have .valid=false if the dimensions in each quantity are incompatible. This is because it isn't possible to compare the area of a rectangle with the volume of a cube.

odow commented 1 year ago

I think this might not be a problem since >= is not defined for the Dimension type

This would be handled by JuMP. We'd want to make sure that the left-hand side had the same dimensions as the right-hand side, and if the scale was different, we'd automatically scale to correct it.

MilesCranmer commented 1 year ago

Thanks, I understand a bit better now. It still feels like a fairly niche use-case though, so I'm not sure if it should be added to the struct in DynamicQuantities or not. It would incur ~15% slowdown (7->8) for all users (if not more, due to the extra == calculation), even though it is redundant information in a Quantity.

Why not just use a Quantity object for units though? Quantity is just a Dimension plus a field for scale. You can use an arbitrary value field for Quantity, so you could put a Rational{Int} there as well if needed.

Maybe we could even have an AbstractQuantity with Quantity<:AbstractQuantity for values, and Unit<:AbstractQuantity for units. I don't see the need to put it into Dimension though. Thoughts?

odow commented 1 year ago

Thoughts?

Yeah I guess I haven't thought through the practical implementation details.

At a high level, I want the difference between mm, m, and km to be retained at the type level. So perhaps I don't even want 1m + 2km to promote to a common type. I want the user to explicitly choose. More friction, but less room for bugs.

In the near term, I'd encourage you to explore what you think is best for this package. It's useful to have a few opinionated attempts at different approaches.

odow commented 1 year ago

I thought about this a bit more. Maybe I'm asking for an orthogonal feature. I'm really asking for:

struct ScaledValue{T}
    value::T
    scale::Int8
end

and this could be completely independent of the unit/dimension/quantity discussion. It just happens to most commonly arise when discussing units.

So yes, it could go in the Quantity struct, or even a ScaledQuantity struct that we could build in JuMP outside of DynamicQuantities.

j-fu commented 1 year ago

At a high level, I want the difference between mm, m, and km to be retained at the type level. So perhaps I don't even want 1m + 2km to promote to a common type. I want the user to explicitly choose. More friction, but less room for bugs.

Hm - isn't this already what Unitful does ? I personally find this too rigid.

I think what may be missing here is what Unitful supports with upreferred. The current design of DynamicQuantities assumes that all quantities are defined relative to the SI base quantities. I personally like this, but I am sure there will be people who see this differently.

This is essentially parametrized by reference values for all base quantities, maybe something could be done with value types....

odow commented 1 year ago

isn't this already what Unitful does ?

Yeah. But Unitful has a nasty type signature, which is the reason to move to DynamicQuantities.

MilesCranmer commented 1 year ago

I think your proposed solution of:

struct ScaledValue{T}
    value::T
    scale::Int8
end

is a good idea and the best compromise to this. Unitful would indeed let you detect 1m + 1km, but DynamicQuantities makes the assumption that all quantities are in a unified representation - i.e., 1km would be eagerly converted to 1000m. This is so we can have everything be type stable and fast, but the downside is you lose information about the user's original representation.

Alternatively as @j-fu requested on #18, we could have an AbstractQuantity and AbstractDimensions so you could add custom behavior like this without resorting to type piracy. For example you could create a ScaledDimensions <: AbstractDimensions with this built-in.

j-fu commented 1 year ago

Just got the following Idea (not sure if it is good though): put the scaling factor(s) into value type parameters. This would not affect "normal" arithmetics, but allow addition of quantites with different scaling factors only with explicit conversion:

Some basic experiment:

julia> struct Y{T}
        a
        end

julia> Base.show(io::IO,y::Y{Type{Val{T}}}) where T = show(io,y.a*T)

julia> yy=Y{Type{Val{2.0}}}(3)
6.0

Base.:+(a::Y{Type{Val{T}}}, b::Y{Type{Val{T}}}) where T= Y{Type{Val{T}}}(a.a+b.a)

julia> yy+yy
12.0

julia> zz=Y{Type{Val{1.0}}}(3)
3.0
julia> yy+zz
ERROR: MethodError: no method matching +(::Y{Type{Val{2.0}}}, ::Y{Type{Val{1.0}}})

I think one would need a scaling factor for each dimension, and this could be some equivalent of Unitful.upreferred but with the advantage that it is not in global scope.

Please note that this so far is a quite spontaneous idea. Realization probably would put quite some burden on the compiler.

MilesCranmer commented 1 year ago

Closing this since #32 was merged. Now you can work directly on the units of your choice:

julia> q = 100us"cm * kPa"
100.0 cm kPa

julia> q^2
10000.0 cm² kPa²

This is ~5x slower than the regular dimensions type, but this is still much faster than Unitful, as it stores the dimensions in a type-stable sparse vector. So I recommend this symbolic form be used for user interfaces, while the expanded version used for calculations.

You can convert to regular dimensions from this symbolic dimensions with expand_units:

julia> expand_units(q^2)
1.0e6 kg² s⁻⁴

This works for constants as well. For example, the speed of light:

julia> x = us"Constants.c * Hz"
1.0 Hz c

julia> x^2
1.0 Hz² c²

julia> expand_units(x^2)
8.987551787368176e16 m² s⁻⁴