finos / morphir-elm

Tools to work with the Morphir IR in Elm.
https://package.elm-lang.org/packages/finos/morphir-elm/latest
Apache License 2.0
46 stars 65 forks source link

Consider creating a Quantity type to replace Int in business use-cases. #299

Open DamianAtWork opened 3 years ago

DamianAtWork commented 3 years ago

Is your feature request related to a problem? Please describe. Mapping Int to a bigger Integer type on target platforms leads to ambiguity when actual Ints are needed.

Describe the solution you'd like We should instead make a Quantity type to represent business cases for an Int.

Additional context

Looking at Decimal and the fact that decimal's make more business sense has me considering whether morphir should also have a Quantity type. If we had that type then we could map Morphir.SDK.Basics.Int directly to Int. One of the reasons not having Int map to int is problematic is when you want to do something that actually needs an int. An example of this that matters from the business side is setting the scale of a decimal. The scale is expected to be an integer, but when we pass in a Morphir.SDK.Basics.Int we are essentially passing in a Long which means to be safe I have to finagle things (in this case I take the min of scala.Int.MaxValue and the passed in scale.

 package morphir.sdk
 object Int {
   type Int = scala.Long
 }
object Decimal {
  type Decimal = java.math.Decimal

  def truncate(n:morphir.sdk.Int.Int)(d:Decimal):Decimal = {
    // Below we take the minimum of Int.MaxValue and n (which is actually a scala.Long)
    val scale = math.min(scala.Int.MaxValue, n).toInt
    d.setScale(scale, java.math.RoundingMode.DOWN)
  }
}

This is just one example of the kind of possible weirdness.

Now if instead for business cases where we are interested in quantities then a Quantity type can always map to a Long or even a BigInt (though for compatibility with Spark and DBs I wouldn't do that).

AttilaMihaly commented 3 years ago

Mapping Int to Int probably makes sense based on this: https://package.elm-lang.org/packages/elm/core/latest/Basics#Int

Note: Int math is well-defined in the range -2^31 to 2^31 - 1. Outside of that range, the behavior is determined by the compilation target. When generating JavaScript, the safe range expands to -2^53 to 2^53 - 1 for some operations, but if we generate WebAssembly some day, we would do the traditional integer overflow. This quirk is necessary to get good performance on quirky compilation targets.

If we do that though we definitely need a bigger integer type. I'm a bit torn on Quantity. Once we get into that should we start thinking about units of measure? https://package.elm-lang.org/packages/ianmackenzie/elm-units/latest/Quantity

stephengoldbaum commented 3 years ago

I have to say that from a business perspective, Int vs Long seems arbitrary. Perhaps we look at something more like Whole number with optional min/max constraints. Then when code generating to other languages, those constraints can guide whether to use Int, Long, or BigInteger, etc.

DamianAtWork commented 3 years ago

I like the whole number idea. I was also thinking of this thing of using size to guide the generated number size, but it seems like a lot of work. Like I can think that places where numbers are literals and we don't expose modification, then you can derive a size constraint, but yeah while useful it is complex.

DamianAtWork commented 3 years ago

Can't we use Phantom types as a way to do units of measure?

stephengoldbaum commented 3 years ago

I think that's how we'd implement the Elm side if we choose to support it in Morphir. We'd still need to figure out the Morphir and backend aspects. How do you see supporting units of measure in Scala? Or do we just not, and dictate using modeling if you want that support?

stephengoldbaum commented 3 years ago

Sorry, i clicked the "Close" button instead of canceling my comment.

DamianAtWork commented 3 years ago

Well Scala has Phantom Types and type refinements as ways of doing units of measure.

AttilaMihaly commented 3 years ago

Why don't we try to solve the original problem before we get into more complex topics.

I have to say that from a business perspective, Int vs Long seems arbitrary.

Int has a small enough range that there is a real risk of overflowing it with financial data. The biggest Int number is only around 2 billion. While the biggest Long is around 9 quintillion which is very unlikely to be exhausted. So Long vs Int actually makes a huge difference.

Perhaps we look at something more like Whole number with optional min/max constraints. Then when code generating to other languages, those constraints can guide whether to use Int, Long, or BigInteger, etc.

As @DamianAtWork mentioned before ranges are very hard from a type system perspective. Just imagine multiplying two such numbers, what's the range of that? How fast is that increasing as you multiply?

In my opinion for financial calculations we either go with Long to keep things simple or an arbitrary-precision integer with no range constraints to begin with. We can always add constraints later when research comes up with a good solution for it :)

DamianAtWork commented 3 years ago

So what if we do Int to Int and WholeNumber (with no min/max) to Long?

AttilaMihaly commented 3 years ago

So what if we do Int to Int and WholeNumber (with no min/max) to Long?

That's fine. I just don't like the WholeNumber name because it deviates from the mathematical naming which is simply integer: https://en.wikipedia.org/wiki/Integer

Why don't we just call it Integer and put it in it's own module outside of Basics. So you would have Basics.Int and Integer.Integer. I think that will work nicely with Decimal.Decimal.

DamianAtWork commented 3 years ago

So what if we do Int to Int and WholeNumber (with no min/max) to Long?

That's fine. I just don't like the WholeNumber name because it deviates from the mathematical naming which is simply integer: https://en.wikipedia.org/wiki/Integer

Why don't we just call it Integer and put it in it's own module outside of Basics. So you would have Basics.Int and Integer.Integer. I think that will work nicely with Decimal.Decimal.

I love this idea. Naming things is hard and I wasn't sold on WholeNumber either.

stephengoldbaum commented 3 years ago

How do modelers know when to use Int vs Integer?

AttilaMihaly commented 3 years ago

How do modelers know when to use Int vs Integer?

The simple answer is that they should read the docs but I assume that's not your point. If you mean that the names should be less similar then I don't see how that would help. . Say we make it WholeNumber. I'm almost certain that modelers won't know what's the purpose of that and fall back to Int to represent integers.

If we did want to differentiate them more I would rename Int to Int32 to make it clear that it's fixed precision. We have that option in our SDK and IR but cannot change Elm's Basics.Int.

I think this gets into the area of tooling around model quality and warning people about potential issues. I don't think we achieve much by coming up with unusual naming conventions. I think we actually have the risk of alienating people with math background if we do that.

DamianAtWork commented 3 years ago

The simple answer is that they should read the docs

Yeah, I think this is an opportunity for a teachable moment. Business modelers really should not default to the primitive types anyway (if we are being honest). For business modeling the following is infinitely more meaningful:

module Quantity exposing(Quantity, fromInt)

type Quantity = Quantity Int

fromInt: Int -> Quantity
fromInt value = Quantity value

Just as for Strings

module Cusip exposing (Cusip, fromString)

type Cusip = Cusip String

fromString: String -> Maybe Cusip
fromString text =
 {|-- Code to parse -}
DamianAtWork commented 3 years ago

Ideally tooling would allow us maybe to have some of these "templates" available out of the box, i.e.:

morphir-elm new module --template ValueType
stephengoldbaum commented 3 years ago

I'm less concerned with the naming than the business conceptualization. I anticipate that there will be a whole lot of inconsistency in modelers choosing Int vs Integer (or Int32 or whatever). It seems like Integer and Decimal should be the norm and Int and Float when modelers know the constraints. We'll need to make that clear.

DamianAtWork commented 3 years ago

Yeah, I think most libs that do some sort of type mapping publish a table in their docs. We should add such a table.

AttilaMihaly commented 3 years ago

It seems like Integer and Decimal should be the norm and Int and Float when modelers know the constraints. We'll need to make that clear.

I completely agree with that. I think we should go beyond just documenting it. I like how the Elm tooling enforces a lot of best-practices and simply doesn't allow you to do it in any other way. Not using primitive types is a good rule for business models.

But I guess the conclusion is that we should:

AttilaMihaly commented 3 years ago

I guess that conclusion still has choices so it's not a conclusion :)

DamianAtWork commented 3 years ago

So concerning the first task -

Add an Morphir.SDK.Basics.Int32 type to our SDK that simply maps to Basics.Int.

Are we talking adding Int32 here:

https://github.com/finos/morphir-elm/blob/c95c17aa90c30ee2f754b8cc1b7ed08c7a1b8d71/src/Morphir/IR/SDK/Basics.elm#L42-L43

AttilaMihaly commented 3 years ago

Are we talking adding Int32 here:

Yes. Or more accurately: replacing Int with Int32 and mapping it properly in the frontend. Optionally we can also add a type alias in Morphir.SDK.Basics so that modelers can use Int32 if they want to be specific. And we should put that in our trainings instead of Elm's Basics.Int.

At some point we may even reject Basics.Int in the frontend but that might be too much for now.