HPInc / HP-Digital-Microfluidics

HP Digital Microfluidics Software Platform and Libraries
MIT License

Add overloads for quantity multiplication and division #22

Closed EvanKirshenbaum closed 10 months ago

EvanKirshenbaum commented 10 months ago

Due to the way Python's type hints work (and don't work), while the Quantity package can correctly type check things like addition and subtraction and infer the result type, it can't really do the same for multiplication and division.

I've got it implemented so that the correct type is, in fact, returned at runtime, but the code doesn't statically check. I added the a() method so you can assert the type (and check it at runtime), e.g.,

d: Distance = ...
t: Time = ...
v: Velocity = (d/t).a(Velocity)

but that's both awkward and a bit error-prone, as you need to remember to put parentheses around the quantity expression.
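A minimal sketch of how an a()-style checked cast can be typed, using scratch classes rather than the package's own. Only the name a() comes from the description above; the constructor, the magnitude attribute, and the reciprocal() helper are assumptions made for illustration.

```python
from typing import Type, TypeVar

Q = TypeVar("Q", bound="Quantity")


class Quantity:
    def __init__(self, magnitude: float) -> None:
        self.magnitude = magnitude

    def a(self, cls: Type[Q]) -> Q:
        # Checked cast: the static type of the result is `cls`,
        # and the claim is verified when the call actually runs.
        if not isinstance(self, cls):
            raise TypeError(
                f"expected {cls.__name__}, got {type(self).__name__}")
        return self


class Time(Quantity): ...
class Frequency(Quantity): ...


def reciprocal(t: Time) -> Quantity:
    # Hypothetical helper: statically only a Quantity,
    # but a Frequency at runtime.
    return Frequency(1 / t.magnitude)


f: Frequency = reciprocal(Time(0.5)).a(Frequency)
```

The parentheses matter because attribute access binds tighter than the arithmetic operators: d/t.a(Velocity) would apply a() to t alone, not to the quotient.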

I put this on hold earlier, since I figured that for DMF I'm going to be mostly using single dimensionalities like Volume and Time, but even there, I'm starting to run into things like

    @property
    def update_interval(self) -> Time:
        return self.clock_thread.update_interval

    @update_interval.setter
    def update_interval(self, interval: Time) -> None:
        self.clock_thread.update_interval = interval

    @property
    def update_rate(self) -> Frequency:
        return (1/self.update_interval).a(Frequency)

    @update_rate.setter
    def update_rate(self, rate: Frequency) -> None:
        self.update_interval = (1/rate).a(Time)

I know how to actually do the inference, at least in common cases:

class Distance(BaseDim['Distance']):
    @overload
    def __mul__(self, rhs: float) -> Distance: ... 
    @overload
    def __mul__(self, rhs: Union[Distance, UnitExpr[Distance]]) -> Area: ...
    @overload
    def __mul__(self, rhs: Quant) -> Quant: ... 
    @overload
    def __mul__(self, rhs: UnitExpr) -> Quant: ...  
    def __mul__(self, rhs): ...

and so on for all the others I know. This doesn't completely solve the problem. For example, it'll handle dividing distance by time to get velocity and then dividing that by time to get acceleration, but it won't handle dividing distance by squared time, since there's no squared time dimension (unless I add it for that, which I probably should).

Note that this will work for multiplying quantities together and by units, but it absolutely won't work for derived unit expressions. As far as I know, even if I know that D1*D2 is D3, there's no way to infer that UnitExpr[D1]*UnitExpr[D2] is UnitExpr[D3].

Migrated from internal repository. Originally created by @EvanKirshenbaum on Jun 15, 2021 at 9:08 AM PDT. Closed on Feb 21, 2023 at 12:06 AM PST.
EvanKirshenbaum commented 10 months ago

This issue was referenced by the following commits before migration:

EvanKirshenbaum commented 10 months ago

Okay, it looks as though this is harder than I thought. Doing the fix for converting between Time and Frequency described above was straightforward, and it's in, but it turns out that MyPy really doesn't like respecifying overloaded operators (specifically operators) in subclasses, even though it looks to me that it should be perfectly safe. It's discussed here.

This isn't that big of an issue, so I'm going to put it on hold for now, but just to capture what I found:

Suppose I start with

D = TypeVar('D', bound='Quant')
class Quant(Generic[D]):
    @overload
    def __mul__(self, rhs: int) -> D: ...
    @overload
    def __mul__(self, rhs: Quant) -> Quant: ...
    def __mul__(self, rhs): ...

    def __rmul__(self, lhs: int) -> D: ...

(Note, these are just scratch classes, not drawn from the package.) If I try to add

class Dist(Quant['Dist']):
    @overload
    def __mul__(self, rhs: int) -> Dist: ...
    @overload
    def __mul__(self, rhs: Dist) -> Area: ...
    @overload
    def __mul__(self, rhs: Quant) -> Quant: ...
    def __mul__(self, rhs): 
        return super().__mul__(rhs)

class Area(Quant['Area']): ...

Mypy complains:

    Mypy: Signature of "__mul__" incompatible with supertype "Quant"

I get the same thing if the generic isn't there.

If the base isn't overloaded (so no multiplying by int), it looks as though it works. I get no errors from

class Quant(Generic[D]):
    def __mul__(self, rhs: Quant) -> Quant: ...

    def __rmul__(self, lhs: int) -> D: ...

class Dist(Quant['Dist']):
    @overload
    def __mul__(self, rhs: Dist) -> Area: ...
    @overload
    def __mul__(self, rhs: Quant) -> Quant: ...
    def __mul__(self, rhs): 
        return super().__mul__(rhs)

class Area(Quant['Area']): ...

but even splitting up a union into two clauses (even without further restriction) breaks it:

class Quant(Generic[D]):
    def __mul__(self, rhs: Union[Quant,int]) -> Quant: ...

    def __rmul__(self, lhs: int) -> D: ...

class Dist(Quant['Dist']):
    @overload
    def __mul__(self, rhs: int) -> Quant: ...
    @overload
    def __mul__(self, rhs: Quant) -> Quant: ...
    def __mul__(self, rhs): 
        return super().__mul__(rhs)

I'll come back to this later.

Migrated from internal repository. Originally created by @EvanKirshenbaum on Jun 15, 2021 at 2:39 PM PDT.
EvanKirshenbaum commented 10 months ago

Coming back to this because I want to at least be able to multiply distances to get areas and volumes, I find that even though I get the error I mentioned above, if I ignore it, everything seems to work. That is, if I say

class Distance(BaseDim): 
    @overload
    def __pow__(self, _rhs: Literal[2]) -> 'Area': ...  
    @overload
    def __pow__(self, _rhs: Literal[3]) -> 'Volume': ... 
    @overload
    def __pow__(self, _rhs: int) -> Quantity: ...
    def __pow__(self, rhs: int) -> Quantity:
        return super().__pow__(rhs)

    @overload   # type: ignore[override]
    def __mul__(self, _rhs: float) -> Distance: ...
    @overload
    def __mul__(self, _rhs: Union[Distance, UnitExpr[Distance]]) -> 'Area': ...
    @overload
    def __mul__(self, _rhs: Union[Area, UnitExpr[Area]]) -> 'Volume': ...
    @overload
    def __mul__(self, _rhs: Quantity) -> Quantity: ...
    @overload
    def __mul__(self, _rhs: UnitExpr) -> Quantity: ...
    def __mul__(self, rhs: Union[float, Quantity, UnitExpr]) ->  Union[Distance, Quantity]:
        return super().__mul__(rhs)

and similarly on Area, then if I later say

x = (3*mm)*(5*mm)
y = 10*acre*ft

x is correctly inferred to be an Area and y is correctly inferred to be a Volume.

And for the Wombat drop size, I can say

pitch = 1.5*mm
def to_vol(height: Distance, gap: Distance) -> Volume:
    return height*(pitch-gap)**2

and it knows that the expression returns a Volume.
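A self-contained toy version of the pattern, to show the static overloads and the runtime result agreeing. These are scratch classes, not the package's: the dist_power field, the make() factory, and the _CLASSES table are assumptions standing in for the real Dimensionality machinery.

```python
from typing import Union, overload


class Quantity:
    def __init__(self, magnitude: float, dist_power: int) -> None:
        self.magnitude = magnitude
        self.dist_power = dist_power  # toy: exponent of distance only

    def __mul__(self, rhs: Union[float, "Quantity"]) -> "Quantity":
        if isinstance(rhs, Quantity):
            return make(self.magnitude * rhs.magnitude,
                        self.dist_power + rhs.dist_power)
        return make(self.magnitude * rhs, self.dist_power)


class Distance(Quantity):
    @overload  # type: ignore[override]
    def __mul__(self, rhs: float) -> "Distance": ...
    @overload
    def __mul__(self, rhs: "Distance") -> "Area": ...
    @overload
    def __mul__(self, rhs: "Area") -> "Volume": ...
    @overload
    def __mul__(self, rhs: Quantity) -> Quantity: ...
    def __mul__(self, rhs):
        # The overloads only inform the type checker; the base class
        # still computes the value and picks the runtime class.
        return super().__mul__(rhs)


class Area(Quantity): ...
class Volume(Quantity): ...


_CLASSES = {1: Distance, 2: Area, 3: Volume}


def make(magnitude: float, dist_power: int) -> Quantity:
    # Pick the most specific class known for this dimensionality.
    cls = _CLASSES.get(dist_power, Quantity)
    return cls(magnitude, dist_power)


x = Distance(3.0, 1) * Distance(5.0, 1)   # overload 2: Area
v = Distance(2.0, 1) * x                  # overload 3: Volume
```

In this toy, Area does not override __mul__, so Area times Distance is still only a Quantity to the checker even though make() returns a Volume at runtime. That is why the real classes need matching overloads on every dimension involved.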

Note that I still can't infer that mm**2 is a UnitExpr[Area] or acre*ft is a UnitExpr[Volume]. To handle that, I may need some sort of AreaUE that derives from UnitExpr[Area] and then override Area.as_unit_expr() to return it. The problem is that I will also need to somehow ensure that, as with Quantity, I always get one when I have an Area, no matter how I got there. To do that, I will probably have to actually go through the Dimensionality, the way I call its make_quantity(), which indirects through its quant_class attribute. I could have a similar unit_class and unit_expr_class.
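The indirection described here might look roughly like the following sketch. make_quantity(), quant_class, and unit_expr_class are the names used above; the constructor signature and the make_unit_expr() method are hypothetical, added only to show the parallel.

```python
from typing import Type


class Quantity:
    def __init__(self, magnitude: float, dim: "Dimensionality") -> None:
        self.magnitude = magnitude
        self.dim = dim


class UnitExpr:
    def __init__(self, dim: "Dimensionality") -> None:
        self.dim = dim


class Area(Quantity): ...
class AreaUnitExpr(UnitExpr): ...


class Dimensionality:
    def __init__(self, name: str,
                 quant_class: Type[Quantity] = Quantity,
                 unit_expr_class: Type[UnitExpr] = UnitExpr) -> None:
        self.name = name
        self.quant_class = quant_class
        self.unit_expr_class = unit_expr_class

    def make_quantity(self, magnitude: float) -> Quantity:
        # Every quantity of this dimensionality goes through here,
        # so it always gets the registered subclass.
        return self.quant_class(magnitude, self)

    def make_unit_expr(self) -> UnitExpr:
        # Hypothetical counterpart for unit expressions.
        return self.unit_expr_class(self)


area_dim = Dimensionality("area", Area, AreaUnitExpr)
unnamed_dim = Dimensionality("dist5")  # falls back to the generic classes
```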

The other bit of trickiness with this approach is that the local Unit subclass will need to be a subclass of the local UnitExpr subclass. I think I can make this work, but it will be a bit tricky.

But in any case, if I can figure it out, then DistanceUnitExpr can override its __mul__(rhs) to say that if rhs is a UnitExpr[Distance], then the result is an AreaUnitExpr. Then, if Distance.base_unit() is defined to return a DistanceUnit (which is a DistanceUnitExpr), mm*mm will become an AreaUnitExpr, which is a UnitExpr[Area].
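A scratch sketch of that class arrangement, assuming a bare-bones UnitExpr that only tracks a name. The class names follow the paragraph above; the name attribute and the string-building in __mul__ are illustrative assumptions, not the package's implementation.

```python
from typing import Generic, TypeVar, overload

D = TypeVar("D")


class Distance: ...
class Area: ...


class UnitExpr(Generic[D]):
    def __init__(self, name: str) -> None:
        self.name = name

    def __mul__(self, rhs: "UnitExpr") -> "UnitExpr":
        return UnitExpr(f"{self.name}*{rhs.name}")


class AreaUnitExpr(UnitExpr[Area]): ...


class DistanceUnitExpr(UnitExpr[Distance]):
    @overload
    def __mul__(self, rhs: "UnitExpr[Distance]") -> AreaUnitExpr: ...
    @overload
    def __mul__(self, rhs: UnitExpr) -> UnitExpr: ...
    def __mul__(self, rhs):
        # distance unit * distance unit is an area unit expression
        if isinstance(rhs, DistanceUnitExpr):
            return AreaUnitExpr(f"{self.name}*{rhs.name}")
        return super().__mul__(rhs)


class DistanceUnit(DistanceUnitExpr):
    """The local Unit subclass is also the local UnitExpr subclass."""


mm = DistanceUnit("mm")
sq_mm = mm * mm   # inferred, and constructed, as an AreaUnitExpr
```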

This will all make dimensions.py really ugly, but it should be confined to that file.

Migrated from internal repository. Originally created by @EvanKirshenbaum on Feb 17, 2023 at 11:57 PM PST.
EvanKirshenbaum commented 10 months ago

Okay, it all works. Yay! And it didn't even take two years to figure out how to do it.

In the end, I cheated. There's now a script, tools/gen_dims.py, that emits all of the BaseDim, DerivedDim, UnitExpr, and Unit classes, with appropriate overloads to correctly infer types for multiplication, division, and exponentiation.

Base dimensions are defined as

distance = Dimensionality.base("dist").named("Distance")

If they have aliases (e.g., "LumFlux" for "LumInt"), they can be added using an alias param to named().

Derived dimensions are declared as

area = distance.derived_power(2, "Area")
velocity = distance.derived_quotient(time, "Velocity")
force = mass.derived_product(acceleration, "Force")
vol_conc = volume["Substance"].derived_quotient(volume, "VolumeConcentration")

Dimensions, such as Time, whose classes need extra methods (such as sleep() or in_HMS()) add them using .extra_code().

Finally, you specify the restriction classes you want and any unnamed dimensions you need in order to handle the types of operations you expect to have and emit the code:

restrictions = ("Substance", "Solution", "Solvent")
extras = (time**2,
          mass*distance)

emitter = Emitter(extras=extras, restrictions=restrictions)
emitter.emit()

time**2 is there so you can talk about s**2, and mass*distance is there so that you can say kg*m/s**2. These dimensions will show up as classes named DIM_time2 and DIM_dist1_mass1. They shouldn't be visible to users, but they're needed to make the logic work.
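The two generated names are consistent with a scheme that sorts the base abbreviations and appends each exponent. The sketch below reproduces only those two examples; the function name is hypothetical, and how the real generator names dimensions with negative exponents is not stated here, so the sketch refuses them rather than guess.

```python
from typing import Dict


def dim_class_name(exponents: Dict[str, int]) -> str:
    """Build a class name such as DIM_dist1_mass1 from base exponents."""
    parts = []
    for abbr, power in sorted(exponents.items()):
        if power == 0:
            continue  # a base that cancels out contributes nothing
        if power < 0:
            # Assumption: the encoding of negative powers is unknown.
            raise ValueError("negative exponents are not covered by this sketch")
        parts.append(f"{abbr}{power}")
    return "DIM_" + "_".join(parts)


time_squared = dim_class_name({"time": 2})
mass_distance = dim_class_name({"mass": 1, "dist": 1})
```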

The emitted classes include documentation comments like

    """
    A :class:`.Quantity` representing voltage (:class:`Work`\ ``/``:class:`Charge`)

    :class:`Emf`, :class:`EMF`, and :class:`ElecPotential` are aliases.
    """

The "representing" text is derived by un-camel-casing the class name and lowercasing it. For names that don't map cleanly, you can pass a description argument, either to named() or, as here, to the call that declares the derived dimension:

ionizing_rad_dose = work.derived_quotient(mass, "IonizingRadDose", description="ionizing radiation dose")
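To see why IonizingRadDose needs an explicit description, here is a minimal sketch of un-camel-casing followed by lowercasing. The describe() function is hypothetical and only approximates whatever the generator actually does.

```python
import re


def describe(class_name: str) -> str:
    """Split a CamelCase name at lower-to-upper boundaries, then lowercase."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", class_name)
    return spaced.lower()


default_text = describe("IonizingRadDose")
simple_text = describe("VolumeConcentration")
```

Here default_text is "ionizing rad dose", which keeps the abbreviation, while simple_text is "volume concentration", which reads fine without an override.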
Migrated from internal repository. Originally created by @EvanKirshenbaum on Feb 21, 2023 at 12:06 AM PST.