SymbolicML / DynamicQuantities.jl

Efficient and type-stable physical quantities in Julia
https://symbolicml.org/DynamicQuantities.jl/dev/
Apache License 2.0
120 stars 15 forks source link

Allow External Unit Registration #107

Closed ven-k closed 4 months ago

ven-k commented 6 months ago

Fixes:

julia> DynamicQuantities.Units.@register_unit Wb "m^2*kg*s^-2*A^-1"
ERROR: UndefVarError: `_UNIT_SYMBOLS` not defined
Stacktrace:
 [1] top-level scope
   @ C:\Users\user\.julia\packages\DynamicQuantities\5QflN\src\units.jl:25
github-actions[bot] commented 6 months ago

Benchmark Results

main 92df2f428564fb... t[main]/t[92df2f428564fb...]
Quantity/creation/Quantity(x) 2.79 ± 0.009 ns 3.11 ± 0.01 ns 0.9
Quantity/creation/Quantity(x, length=y) 3.41 ± 0.01 ns 3.11 ± 0.001 ns 1.1
Quantity/with_numbers/*real 3.11 ± 0.01 ns 3.11 ± 0.01 ns 1
Quantity/with_numbers/^int 8.37 ± 2.2 ns 8.67 ± 2.2 ns 0.965
Quantity/with_numbers/^int * real 8.05 ± 1.9 ns 8.37 ± 2.2 ns 0.963
Quantity/with_quantity/+y 4.04 ± 0.001 ns 4.04 ± 0.001 ns 1
Quantity/with_quantity//y 3.11 ± 0.001 ns 3.42 ± 0.01 ns 0.909
Quantity/with_self/dimension 2.79 ± 0.009 ns 3.1 ± 0.01 ns 0.9
Quantity/with_self/inv 3.11 ± 0.001 ns 3.41 ± 0.01 ns 0.912
Quantity/with_self/ustrip 2.79 ± 0.01 ns 3.1 ± 0.01 ns 0.903
QuantityArray/broadcasting/multi_array_of_quantities 0.151 ± 0.0011 ms 0.149 ± 0.0011 ms 1.01
QuantityArray/broadcasting/multi_normal_array 0.0499 ± 0.00018 ms 0.0499 ± 0.00018 ms 1
QuantityArray/broadcasting/multi_quantity_array 0.155 ± 0.00072 ms 0.155 ± 0.00071 ms 1
QuantityArray/broadcasting/x^2_array_of_quantities 25.3 ± 1.9 μs 25.9 ± 1.7 μs 0.978
QuantityArray/broadcasting/x^2_normal_array 5.54 ± 0.91 μs 5.86 ± 0.48 μs 0.945
QuantityArray/broadcasting/x^2_quantity_array 7 ± 0.25 μs 7.01 ± 0.27 μs 0.999
QuantityArray/broadcasting/x^4_array_of_quantities 0.0787 ± 0.00052 ms 0.0788 ± 0.00054 ms 0.999
QuantityArray/broadcasting/x^4_normal_array 0.0499 ± 0.00019 ms 0.0499 ± 0.00017 ms 1
QuantityArray/broadcasting/x^4_quantity_array 0.05 ± 0.00017 ms 0.05 ± 0.00017 ms 1
time_to_load 0.131 ± 0.00075 s 0.137 ± 0.0015 s 0.958

Benchmark Plots

A plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR. Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

MilesCranmer commented 6 months ago

Great idea! Can you rebase to #106? As there are changes which might affect the behavior.

MilesCranmer commented 5 months ago

Thanks! Quick initial comment is that this will need to update symbolic_dimensions.jl's ALL_VALUES to be vectors as well, so that new units are also included in symbolic units (like us"km/h").

MilesCranmer commented 5 months ago

One other quick thought is that we will probably need to change INDEX_TYPE to UInt16 in symbolic_dimensions.jl. I could see that if we are to allow users to add units and constants, the 8-bit indexing will probably be closer to overflow. However this also probably means things will be a bit slower.

MilesCranmer commented 5 months ago

Let me know when you're ready and I can give more comments or help out

ChrisRackauckas commented 5 months ago

@ven-k ?

ven-k commented 5 months ago

Now works:

julia> using ModelingToolkit, DynamicQuantities

julia> @register_unit Wb u"kg*m^2*s^−2*A^−1"

julia> @variables k [unit = u"Wb"]
1-element Vector{Num}:
 k
MilesCranmer commented 5 months ago

One other comment: trying to register the same unit twice should have no effect. That way you could register units within a method scope without fear of creating duplicates. Also it would probably make things safer for Revise.jl

ven-k commented 5 months ago

trying to register the same unit twice should have no effect

Just comparing UNIT_VALUE[index].value == value, here, (aka user input value) will not work as latter is an Expr. For running this check, is there an internal helper which can expand it to its intended form?

MilesCranmer commented 5 months ago

Thanks for implementing all of those changes. Sorry to go back on what @devmotion said but do you think we could avoid the OrderedSet use? Since ALL_VALUES is a vector I think ALL_SYMBOLS should be as well – otherwise I get worried there could be an indexing bug introduced somewhere. If not now, maybe later. But I think this is actually already buggy because ALL_SYMBOLS could be left unchanged (since its a set) but ALL_VALUES could have another element added to it – since the function doesn't return early.

Basically I think those constants should just be write once read many arrays. When you push! to the ALL_SYMBOLS within the register units part, you could simply have a check to see if the symbol is already in it – if so, return early.

(Also, the fewer deps the better)

I think the struct could just be something like

struct WriteOnceReadManyVector{T}
    _raw_data::Vector{T}
end

and just define Base.push! and only the methods needed but nothing else (so the array can't be removed from). You could also do the same for WriteOnceReadManyDict for ALL_MAPPING also I think.

MilesCranmer commented 5 months ago

Now that I think about it, trying to register a unit should trigger an error if the symbol is already used AND the value assigned to that symbol is different than the one being registered.

But if the symbol is used already and the value is the same, it could simply be ignored and the unit registration should return early (before pushing anything).

MilesCranmer commented 5 months ago

One option is to simply have a single array storing a tuple of (1) the symbol, (2) the value in Dimensions, and (3) the value in SymbolicDimensionsSingleton. Then we don’t have to worry about issues from expanding one array and not the other.

ven-k commented 5 months ago

One option is to simply have a single array storing a tuple of (1) the symbol, (2) the value in Dimensions, and (3) the value in SymbolicDimensionsSingleton

The SymbolicDimensionsSingleton is defined in SymbolicUnits, which is executed after Units is done updating UNIT_*. Updating one array with all bundled in would be neat; however, would need some major changes in code placement.

MilesCranmer commented 5 months ago

Good point. Okay let’s table that for now and consider in a future PR.

MilesCranmer commented 5 months ago

Also I guess

struct WriteOnceReadMany{A}
    _raw_data::A
end

Might make more sense, as you could put dicts and vectors in the same type.

ven-k commented 5 months ago

The DynamicQuantitiesCollection is similar to WriteOnceReadMany; except that one of its instance is an OrderedSet. I'll rename the former and drop the latter.

MilesCranmer commented 5 months ago

Thanks! I think we are almost done :)

A few more comments:

  1. Could we do a test for whether this works for pre-compilation? For example, if a downstream package wants to register units like:
module MyLibrary

using DynamicQuantities: @register_unit, @u_str

@register_unit Wb u"m^2*kg*s^-2*A^-1"

end

and then, elsewhere, you use this like:

using DynamicQuantities
using MyLibrary

x = u"Wb"
y = us"Wb"

So it would be good to know whether pre-compilation works in this sort of context (not sure how to test this).

2.

I'm not sure whether we could actually unit test this, but I'm just curious how this would all work in the context of Revise.jl. If you are working on a package, would Revise.jl correctly register/un-register units as you add/remove them to a package?

ven-k commented 4 months ago

I've added tests for _precompilation of registerunit in a closed module. And updating https://github.com/SciML/ModelingToolkitStandardLibrary.jl/pull/214 with new units^1, works too.

ven-k commented 4 months ago

I expected types.jl to house all types; now I've moved WriteOnceReadMany to its own file with this https://github.com/SymbolicML/DynamicQuantities.jl/pull/107/commits/5c00b031ee32ae1820d094404a7277ac13a0b9b0.

MilesCranmer commented 4 months ago

Thanks!

(Currently types.jl is supposed to be “exported types” while internal types live in separate files. However I haven’t been completely consistent on this and a future refactor would certainly improve readability.)

MilesCranmer commented 4 months ago

Cool. We're back at 100% test coverage. I did a deep review + some edits and everything is looking good now! Making some final tweaks and then will merge.

Thanks so much for the great contribution!!

MilesCranmer commented 4 months ago

One thing I'm confused about is that the @register_unit does not seem to propagate to other modules. e.g., if I register a unit in ExternalUnitRegistration.jl, and load it with using ExternalUnitRegistration, I can't use it in my current module via u"Wb"... Why is that? Is that intended?

Super weird:

julia> ExternalUnitRegistration.ALL_SYMBOLS
DynamicQuantities.WriteOnceReadMany{Vector{Symbol}}([:m, :g, :s, :A, :K, :cd, :mol, :fm, :pm, :nm  …  :L_sun, :L_bol0, :sigma_T, :au, :pc, :ly, :atm, :kpc, :Mpc, :Gpc])

Where is the :Wb??

Is this some weird thing about precompilation or maybe LOAD_PATH? If I define a module in my REPL it works just fine...

ven-k commented 4 months ago

That's an expected behavior (even without Revise).

So currently the externally defined units can be used with u"unitname" within the scope of the module (and its sub-modules by importing). Outside they should be treated as a var. However it can be used, divided, multiplied as any other unit (but without u"").

Ex: @test Wb/u"m^2*kg*s^-2*A^-1" == 1.0 here^1

MilesCranmer commented 4 months ago

Got it! I like that better actually, feels safer.