fsharp / fslang-suggestions

The place to make suggestions, discuss and vote on F# language and core library features
345 stars 21 forks source link

Ease conversion between Units of Measure (UoM) and undecorated numerals and simplify casting #892

Open abelbraaksma opened 4 years ago

abelbraaksma commented 4 years ago

Ease of use for Units of Measure (UoM)

After being asked 3-4x in just as many weeks on Slack by different newcomers and experienced programmers alike, I propose we introduce a set of intrinsics related to UoM to make it less painful to convert to undecorated numerals and back, and to simplify conversion between numeric types.

I suggest to do that by adding a few sensibly-named functions to the LanguagePrimitives.

Remove UoM

Pair each LanguagePrimitives.XXXWithMeasure with a LanguagePrimitives.XXXWithoutMeasure.

Remove UoM generically

Add LanguagePrimitives.RemoveMeasure: 'T<UoM> -> 'T.

Add UoM generically

It's already possible to add UoM for specific types, but not generically for any type. We could add LanguagePrimitives.WithMeasure: 'T -> 'T<UoM>, where 'T is one of the signed integral types.

Support conversion

If at all possible, I suggest to make all cast operators sensitive to the target, to make this seamless. That means, allowing:

let x: float<cm> = float 45<cm>

See also SO question: https://stackoverflow.com/questions/1894250/f-unit-of-measure-casting-without-losing-the-measure-type

Allow no-op cast on collections for UoM

It's currently impossible to remove Array<int<cm>> from its cm UoM, unless you loop over it. Same is true for any other collection.

EDIT: the retype trick below can actually be adopted for this, thanks to @BillHally (comment).

let myArray = [|3<cm>; 42<cm>|]
let x: int array = myArray |> Array.map LanguagePrimitives.WithoutMeasure  // this should be optimized away

Existing way of doing this

The existing way of approaching this problem in F# by multiplying by 1<_> or 1.0<_>, but this is type-specific. For generics, you can only do so by using unsafe compiler-specific code, like this (courtesy @gusty):

// use compiler specific code
let inline retype (x: 'T) : 'U = (# "" x: 'U #) 

// remove UoM decoration
let inline removeUoM (x: '``T<'M>``) =
    let _ = x * (LanguagePrimitives.GenericOne : 'T)
    retype x :'T

Conversion from float<cm> to int<cm> etc is currently possible by reinterpreting the cast, but hard to discover even for experienced users:

x |> float |> LanguagePrimitives.FloatWithMeasure

Pros and Cons

The advantages of making this adjustment to F# are numerous, just to name a few:

The disadvantages of making this adjustment to F# are:

Extra information

Estimated cost (XS, S, M, L, XL, XXL): S

I think this isn't very hard to do, except for some optimizations perhaps. The compiler mostly already has ways of coding this internally. Ultimately, we should try to keep the balance of type-safety with UoM, but really shouldn't make it too hard to work with them.

Related library is FSharp.UMX: https://github.com/fsprojects/FSharp.UMX

Related language suggestions:

Related suggestions (and SO questions):

Just a bunch of the many roaming questions related to UoM, illustrating the trouble people perceive with it:

Affidavit (please submit!)

Please tick this by placing a cross in the box:

Please tick all that apply:

matthewcrews commented 4 years ago

I would love this. Would greatly simplify some of the work that I am doing. UoM are great but there are some painful edge cases that need to be sanded down and smoothed over.

abelbraaksma commented 4 years ago

@dsyme, there were several discussions on slack recently, which led to this suggestion, and with the awesome work done by @pblasucci on widening the UoM types to all primitives, I'd like to add this as well to the (relatively painless) improvements for UoM's.

Could you consider this "approved-in-principle"? If so, I'll gladly write the RFC for this.

pblasucci commented 4 years ago

So, recapping, there are 5 different suggestions here, each of which improves quality of life for working with units of measure:

  1. Remove UoM
  2. Remove UoM generically
  3. Add UoM generically
  4. Support conversion
  5. Allow no-op cast on collections for UoM

Of the five, 1 is trivial. 2, 3 seem like they'd be straight-forward -- but I've not done much analysis. 4, 5 are more concerning -- but only in the case of ensuring well-optimized code.

As an aside, for my own work, item 4 (conversions) is the one I smack up against most frequently.

@abelbraaksma @dsyme Might it be worth taking decisions on 1, 2+3, 4+5 separately (as I could see the merits of carving up the RFCs, and the associated work, into those three somewhat more digestible chunks)?

abelbraaksma commented 4 years ago

I've no problem splitting them in two RFCs, but ideally, I (or you ;) would implement the chunk in one go. I think only (5) is slightly problematic from an implementation standpoint.

The easiest ones are those that only need library functions.

pblasucci commented 4 years ago

This is maybe not the best place to ask, but... I have a process question:

Would we "feature gate" these behind multiple different "language features"? Would we reuse one of the existing gates? Does it really matter?

abelbraaksma commented 4 years ago

A bunch of these will end up in FSharp.Core, which complicates "feature gates" due to it being a compiled lib, and afaik, apart from complex features like string interpolation, a new function is just a new (isolated) function in FSharp.Core, meaning that no "gate" is necessary to enable it: people can opt-in by calling that function.

For the part that requires changes in the compiler, I'd assume a single feature gate for like uom-improvements or something would be more than sufficient, but ultimately that's up to @cartermp and @dsyme, to whom I'd like to kindly ask:

Could you consider this "approved-in-principle"? If so, I'll gladly write the RFC(s) for this.

pblasucci commented 3 years ago

@abelbraaksma wrote:

I've no problem splitting them in two RFCs, but ideally, I (or you ;) would implement the chunk in one go.

I'd be happy to work on this (with Abel or solo)... assuming @cartermp or @dsyme will go ahead an mark it as approved :wink:

BillHally commented 3 years ago

@abelbraaksma wrote:

  1. Allow no-op cast on collections for UoM

and:

I think only (5) is slightly problematic from an implementation standpoint.

I may be misunderstanding something, but the retype function you provide will also work for collections.

We've been using essentially the same code for a couple of years to convert e.g. int<a>[] to int<b>[] without any problems.

Here's some example code in sharplab.

abelbraaksma commented 3 years ago

@BillHally, thanks for pointing that out. Retyping indeed works, but we may need to ensure only primitives are targeted to prevent illegal conversions. I haven't looked that deep in potential corner cases, but as you said, it should be a noop as long as the primitive types match.

pblasucci commented 3 years ago

I'd be happy to work on this (with Abel or solo)... assuming @cartermp or @dsyme will go ahead an mark it as approved :wink:

Ping?

dsyme commented 2 years ago

I apologise for missing this despite being prompted several times. Put it down to the pandemic or something.

I can see the issues here. When UoM were originally designed, the aim was for "unit soundness" that is quite a strict criteria. Over time this has been weakened allowing conversions, integers etc.

Concretely I agree with addressing each of the above in LanguagePrimitives but am concerned about the actual details. Specifically anything involving the conversion operators may be problematic, since these are integrated with constraint solving, but perhaps it's doable if a unitized target type is known.

(2) and (3) require 'T<'m> as a construct, or a new kind of constraint, or a special type-checking rule. Thus they look complex.

I'll belatedly mark this as approved but I'm concerned that the details won't satisfy when it comes to an RFC

abelbraaksma commented 2 years ago

Thanks @dsyme to approve this in principle. I'll set out to write an RFC for this, or perhaps two, so we can split the feature (i.e. to have the complexities that 'T<'m> brings in only one of them).

pblasucci commented 2 years ago

For what it's worth, I was recently playing around with this for something that came up at my day job. I now fully understand what @dsyme means about items 2 and 3. I would definitely split the two of them into a separate RFC.

But stepping back a moment, and thinking about the larger "request", which I understand as "make it easier to do conversions involving units of measure", I'd like to suggest that part of adding items 1, 4, and 5 could be to surface them somewhere more discoverable than LanguagePrimitives. In fact, it probably makes sense to also alias (move?) the existing ___WithMeasure functions. Everything could be neatly packaged into a static type called Units or UnitsOfMeasure or Whatever-is-deemed-most-appropriate. But why a static type? Because something like Units.clear(number : float<'M>) : float, with ~13 overloads (one per numeric primitive) has arguably better "developer ergonomics" than 13 separate ___WithoutMeasure functions.

At least, that's what's come out of the things I've been doing at work lately.

abelbraaksma commented 2 years ago

@pblasucci, I like those suggestions. Ease-of-discovery would certainly help. But I'm not sure what @dsyme, @KevinRansom et. al. would think about adding "another way of doing the same thing". Perhaps LanguagePrimitives in hind-sight was a bad location for any of this, and we should fix that, or we are just stuck with it.