SymbolicML / DynamicQuantities.jl

Efficient and type-stable physical quantities in Julia
https://symbolicml.org/DynamicQuantities.jl/dev/
Apache License 2.0

Feature Suggestion: SymbolicAffineDimensions #144

Open Deduction42 opened 2 weeks ago

Deduction42 commented 2 weeks ago

I've worked with numerous unit packages (like Python's Pint and Unitful.jl) and really love the fundamental approach of this package, which is really the fundamental approach of the SI unit system. Unfortunately, real-world units are weird, and I find that in engineering applications, affine units are everywhere (pressure units are gauge, so zero at atmospheric, and temperatures are always either Fahrenheit or Celsius), and it's difficult for this package to handle them. I basically have to fall back to Unitful.jl first and then apply conversions from Unitful to DynamicQuantities, which means that if I have to set up unit tables for data sources, there's a lot of dynamic dispatch happening.

Would it be possible to extend AbstractSymbolicDimensions to include a SymbolicAffineDimensions? It's basically a SymbolicDimensions with an offset property. Moreover, a SymbolicDimensions is easy to promote to a SymbolicAffineDimensions, as you just set offset=0. Also, any function that uses a Quantity{T,SymbolicAffineDimensions} simply subtracts the offset first, and then treats it as a Quantity{T,SymbolicDimensions}. This would allow me to be rid of Unitful.jl dependencies altogether, as well as make custom unit tables that perform much better.

MilesCranmer commented 2 weeks ago

Hi @Deduction42,

Just to ask, why not just define a converter to SI units for your own work?

So, my sense is that adding this capability will complicate everything about the package a lot, and, being the maintainer, I need to make some compromises on features. DynamicQuantities.jl really tries to be as lightweight as possible. At its core, the Dimensions object is just a 7-field struct, with Quantity adding a single field for the value. I suppose I find it hard to justify the significant complexity associated with these types of units, especially considering I never use them in my own work (I just convert affine units whenever I encounter them).

This increased complexity means more maintenance work for me, longer package load times, and slower code for all downstream users. If you are really eager to have these, perhaps you can try to make a package (or fork), DynamicAffineQuantities.jl, to try it out? All of the abstract types should enable you to define custom dimensional objects. There is an example of defining custom dimensions here: https://symbolicml.org/DynamicQuantities.jl/dev/examples/#3.-Using-dimensional-angles.

Cheers, Miles

Deduction42 commented 2 weeks ago

I did define a converter for SI units for my own work. This is why I submitted Pull Request #141. I'm currently using Unitful.jl as a "unit conversion funnel" to get everything into a form that DynamicQuantities will accept. The problem is, getting the units converted in a way that minimizes performance hits from type instability is a lot of work, and I'm trying to bring more people in my organization to use more first principles in our ML models, including engineering unit packages and Julia. Anything that reduces this friction is huge for me.

The dream would be to have something like

unit_string = "°F"
x = 5*uparse(unit_string)
>> 258.15000000000003 K

I mean it could be displayed as 5 °F but at this point, I don't even care; 5 °F should just be represented as 258.15 K under the hood. All I want is an easy way for anyone on my team to convert any garbage units a customer provides us into proper SI units in a way that's type-stable and easy to code. Maybe this is mostly a problem for American/Canadian engineers because the U.S. still uses the horrible English system (why does "brake horsepower" even exist?), and Canadian adoption of SI is halfhearted at best.
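As a rough sketch of the kind of import step described above, a lookup table from unit strings to a scale and offset can return a plain Float64 in kelvin no matter which string comes in at runtime. The names `TEMPERATURE_UNITS` and `to_kelvin` are hypothetical and are not part of DynamicQuantities.jl; the table only covers temperature and is meant to illustrate the idea, not a real parser.

```julia
# Hypothetical lookup table: unit string => (scale, offset) such that
#   kelvin = value * scale + offset
# These names are illustrative and are not part of DynamicQuantities.jl.
const TEMPERATURE_UNITS = Dict{String,Tuple{Float64,Float64}}(
    "K"  => (1.0, 0.0),
    "°C" => (1.0, 273.15),
    "°F" => (5 / 9, 273.15 - 32 * 5 / 9),
)

# Always returns a Float64 in kelvin, so the return type does not depend
# on which unit string was supplied at runtime.
function to_kelvin(value::Real, unit::AbstractString)::Float64
    scale, offset = TEMPERATURE_UNITS[unit]
    return value * scale + offset
end

x = to_kelvin(5, "°F")   # approximately 258.15 (kelvin)
```

Because the result is always a bare SI number, everything downstream can use ordinary non-affine quantities.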

I'm still trying to work through how SymbolicUnits are defined and what their use is. My best understanding is that it's a convenience form of re-displaying the fundamental units as different numbers? They don't seem to have Dimensions under the hood. Is there anything else you're using SymbolicUnits for?

MilesCranmer commented 2 weeks ago

I see, thanks. Yeah, if it is just the import and export, I definitely think that's easier and could be included. My main objection is to actually doing processing directly on affine units, which seems like a can of worms. Though note I still don't have time to write something for this, as I never use them in my own stuff. If you can write a PR, I can take a look though.

SymbolicDimensions and Dimensions are both subtypes of AbstractDimensions and behave in a similar way. Basically SymbolicDimensions is like one very massive Dimensions object with a separate field for every single unit. e.g., km and m are different fields.

It's written as a sparse vector to speed things up though, as having 100+ fields in a struct, many of which are 0, would be impractical. The various custom behavior is just dealing with that form of it.
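To make the "one field per unit, stored sparsely" idea concrete, here is a toy version in which a Dict stands in for the sparse exponent vector. This is only an illustration under that assumption: `UNIT_SCALES`, `SparseUnits`, and `si_scale` are hypothetical names, and the package's actual storage and unit table differ.

```julia
# Illustrative only: a Dict stands in for the sparse exponent vector.
# UNIT_SCALES and SparseUnits are hypothetical names, not package API.
const UNIT_SCALES = Dict(:m => 1.0, :km => 1000.0, :s => 1.0, :hr => 3600.0)

struct SparseUnits
    exponents::Dict{Symbol,Int}   # only nonzero exponents are stored
end

# Multiplying units adds exponents; entries that cancel to zero are dropped.
function Base.:*(a::SparseUnits, b::SparseUnits)
    merged = mergewith(+, a.exponents, b.exponents)
    return SparseUnits(filter(p -> p.second != 0, merged))
end

Base.inv(a::SparseUnits) = SparseUnits(Dict(k => -v for (k, v) in a.exponents))

# Factor that converts a value in these units to the SI base units.
si_scale(u::SparseUnits) =
    prod(UNIT_SCALES[k]^v for (k, v) in u.exponents; init=1.0)

km_per_hr = SparseUnits(Dict(:km => 1)) * inv(SparseUnits(Dict(:hr => 1)))
si_scale(km_per_hr)   # 1000 / 3600
```

Here km and hr stay as separate entries until the value is expanded, which is what lets the original symbols be printed back out.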

RGonTheNoble commented 2 weeks ago

You are absolutely right about trying to do processing directly in affine units. The typical response to this idea seems to be "AWW HECK NO" because other unit packages don't even try. Both Pint (for Python) and Unitful.jl will scream at you if division by affine units is attempted. As an example, consider the responses to the two semantically identical expressions below:

5.0u"m" / 298u"K"
0.016778523489932886 m K^-1

5.0u"m" / 24.85u"°C"
ERROR: AffineError: an invalid operation was attempted with affine units: °C

The first works but the second does not, even though we both know that they're essentially the same thing. But if the conversion to SI units happens under the hood, you can actually do this, providing the functionality that other packages can't.

RGonTheNoble commented 2 weeks ago

The way I would approach this is to have different kinds of dimensional transformations, like ScaledDimensions and AffineDimensions.

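using DynamicQuantities: AbstractDimensions, Dimensions

# T is the numeric type of the scale/offset; D is the underlying dimensions type
Base.@kwdef struct ScaledDimensions{T,D<:AbstractDimensions}
    scale :: T
    symbol :: Symbol
    dimensions :: D
end

Base.@kwdef struct AffineDimensions{T,D<:AbstractDimensions}
    scale :: T
    offset :: T
    symbol :: Symbol
    dimensions :: D
end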

This means we could have something like

km = ScaledDimensions(scale=1000, symbol=:km, dimensions=Dimensions(length=1))
degC = AffineDimensions(scale=1.0, offset=273.15, symbol=:°C, dimensions=Dimensions(temperature=1))

Every mathematical operation would first convert all quantities to having regular dimensions and do the math in SI units and overload that way. If you want the original units, you just convert everything afterward. It might take me a while, but I could definitely build this if this is something you think you can support.
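A stripped-down sketch of that "convert first, then do the math" rule might look like the following. It uses plain Float64 values and ignores dimension tracking entirely, and `AffineUnit`, `AffineValue`, `to_si`, and `from_si` are hypothetical names rather than anything in DynamicQuantities.jl. The convention assumed here is si = value * scale + offset, matching the degC example above.

```julia
# Hypothetical types for illustration; not the DynamicQuantities.jl API.
struct AffineUnit
    symbol::Symbol
    scale::Float64    # si = value * scale + offset
    offset::Float64
end

struct AffineValue
    value::Float64
    unit::AffineUnit
end

# Strip the affine unit: convert to a plain number in the SI base unit.
to_si(q::AffineValue) = q.value * q.unit.scale + q.unit.offset

# Convert an SI number back into the requested display unit.
from_si(x::Real, u::AffineUnit) = AffineValue((x - u.offset) / u.scale, u)

# Arithmetic never sees the offset: the operand is expanded first.
Base.:/(a::Real, b::AffineValue) = a / to_si(b)

degC = AffineUnit(Symbol("°C"), 1.0, 273.15)
ratio = 5.0 / AffineValue(24.85, degC)   # same as 5.0 / 298.0, up to rounding
```

Converting back with `from_si` is only needed for display, so the affine type never has to take part in the arithmetic itself.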

Deduction42 commented 1 week ago

Hello again, I'm sorry to bother you like this, but I'm reading through your SymbolicDimensions work and I'm having trouble understanding its design philosophy. I understand that SymbolicDimensions is a really long dimensional space that is a sparse vector of mostly zeros. But wouldn't it be simpler to define a SymbolicDimension as a regular Dimension times a scale? This is what the @register_unit macro input looks like.

It just looks like a lot of functionality I'm planning to write is basically a simplified SymbolicDimensions that allows for affine transformations and encapsulates conversion info internally in a very simple structure.

struct AffineDimensions{D} <: AbstractDimensions{D}
    symbol :: Symbol
    dimensions :: Dimensions{D}
    scale  :: Float64
    offset :: Float64
end

This would allow you to do something like:

C = AffineDimensions(:C, u"K", offset = -273.15)

and then something like

F = AffineDimensions(:F, C, scale=9/5, offset=32)

and finally

qF = Quantity(0,F)
>> 0 F
uexpand(qF)
>> 255.3722222222222 K

If you take away the offset, it's the encapsulated version of SymbolicDimensions. I'm just wondering if there's some hidden pitfall that caused you to avoid taking this encapsulated route.
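The chained definitions above (F defined relative to C, C relative to K) amount to composing affine maps. A minimal sketch of that composition, assuming the convention implied by those constructor calls (unit value = scale * base value + offset), could be written as follows; `AffineMap`, `relative_to`, and `expand_to_kelvin` are hypothetical names.

```julia
# Hypothetical sketch; `AffineMap` is not a DynamicQuantities.jl type.
# Convention assumed from the constructor calls above:
#   unit_value = scale * kelvin + offset
struct AffineMap
    scale::Float64
    offset::Float64
end

# Define a new unit relative to an existing one:
#   new = scale * base_value + offset,
# where base_value = base.scale * kelvin + base.offset.
relative_to(base::AffineMap, scale, offset) =
    AffineMap(scale * base.scale, scale * base.offset + offset)

# Invert the map to recover kelvin from a value in the unit.
expand_to_kelvin(value, u::AffineMap) = (value - u.offset) / u.scale

K = AffineMap(1.0, 0.0)
C = relative_to(K, 1.0, -273.15)
F = relative_to(C, 9 / 5, 32.0)

expand_to_kelvin(0.0, F)   # approximately 255.372 (kelvin)
```

Since the composed map is again just one scale and one offset, the structure stays the same size however many units are chained.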

MilesCranmer commented 1 week ago

> But wouldn't it be simpler to define a SymbolicDimension as a regular Dimension times a scale?

This is similar to how it works. The long sparse vector is needed to reference the symbols that need to get printed out. But those are also stored as regular Dimensions here: https://github.com/SymbolicML/DynamicQuantities.jl/blob/6f4331a41817cc6ed88ff0ac7622d237cfae3b5f/src/symbolic_dimensions.jl#L12-L14

Deduction42 commented 1 week ago

Okay, I think I understand the philosophy now. My underlying philosophy behind units was to convert everything to SI as quickly as possible, but SymbolicDimensions allows you to work directly with compound non-SI units like "mi/hr" and some people may want that, even if I don't. However, the philosophy for affine units is non-negotiable; you must get rid of them as soon as possible because some operations are hopelessly ambiguous (like addition) or unintuitive (like multiplication/division).
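A small worked example of the addition ambiguity, using plain arithmetic and no packages (the helper names are just for illustration): "10 °C + 10 °C" gives two different answers depending on whether the second operand is read as an absolute temperature or as a temperature difference.

```julia
# Plain arithmetic, no packages: two readings of "10 °C + 10 °C".
c_to_k(c) = c + 273.15
k_to_c(k) = k - 273.15

# Reading 1: both operands are absolute temperatures, summed in kelvin.
absolute_sum_c = k_to_c(c_to_k(10.0) + c_to_k(10.0))   # about 293.15 °C

# Reading 2: the second operand is a temperature difference of 10 degrees.
interval_sum_c = 10.0 + 10.0                            # 20 °C
```

Converting to kelvin up front forces the caller to pick one reading explicitly.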