Make type representing a number of time units passed parametrizable

serokell / o-clock

:hourglass: Type-safe time units in Haskell

Mozilla Public License 2.0

49 stars 6 forks source link

Make type representing a number of time units passed parametrizable #105

Closed artemohanjanyan closed 6 years ago

artemohanjanyan commented 6 years ago

Just like in C++ http://ru.cppreference.com/w/cpp/chrono/duration Time is often serialized in real world, let's make it possible!

chshersh commented 6 years ago

@artemohanjanyan I don't see the difference from current implementation. Time data type already stores number of time units as type level rational number:

https://github.com/serokell/o-clock/blob/03bed1c7b7ea7b2c20c803e9aaca1e1c8d26a5ee/src/Time/Units.hs#L140

artemohanjanyan commented 6 years ago

@ChShersh type level rational number (Rat) represents a time unit itself, I was referring to field unTime which is of type RatioNat which I may want to change (there was a discussion about that).

chshersh commented 6 years ago

@artemohanjanyan That's an interesting proposal. But I don't think that it's a good idea. Parametrising Time data type over type of unTime field has the following downsides:

Possibility to break toUnit @to @from . toUnit @from @to ≡ id law associated with data type (which was the initial and main idea of this package).
Can affect performance (not sure, requires benchmarks).
toUnit function can behave very badly when type level rational is something like 5 / 7, conversion can lose information.
Type and implementation of every function inside package will become much scarier.
Read and Show instances very deeply rely on the fact that unTime is rational. It's a huge pain in the ass to write those instances properly in order to save semantic for previous behaviour and support any numeric time for unTime as well.

So personally I don't think it worth it.

artemohanjanyan commented 6 years ago

@ChShersh

Some rules for conversion can be established which will imply toUnit @to @from . toUnit @from @to ≡ id providing unTime is rational.
It sure requires benchmarks because this package already provides some polymorphism (which can affect performance sometimes, not sure about this package in particular) and relies on rational numbers based on arbitrary-precision naturals. Which makes me believe that performance is not a priority in this package and makes me seriously doubt that performance will be affected by my proposal and that it should be an argument against this change at all.
Same as 1.
I don't think it will become scarier at all. Almost all of the implementation is much scarier than some type parameter already, and I had no problems understanding it. The concept of parameterizing unTime is very simple and shouldn't scare anybody. I don't see how this is a problem at all.
One way to do this is to create two classes for reading and showing unTime which will be used in the actual implementation of Read and Show (instance (KnownUnitName unit, CustomShow rep) => Show (Time unit rep) ...). Instances for RatioNat will preserve current behavior.
I'm sure there are other (better) ways to solve this issue. Don't see a problem with this one as well.

I'm not saying that this is something trivial or quick to do. I'm also not proposing any concrete design for this change (yet). But I believe that it is something worth doing. Wanting different type for unTime is completely natural.

With RatioNat I can control nor performance neither precision.
It is painful to create any sensible implementation of serializing and deserializing for data with Time if I interact with some external system which doesn't use arbitrary-precision rational numbers for measuring time (I haven't seen such system yet). The only way to do this is to hide the structure of my data, manually control the precision of Time inside it and provide some extra interface for time inside my data, which sounds like a lot of boilerplate which could be avoided.
It is arguable that the current design is not very intuitive. We had a discussion about that. In short, a programmer can mistakenly assume that Time Minute can't store anything of less precision than a minute and then write some buggy code. Something like Time Minute Int would describe that fact perfectly, on the other hand.
If some thing is misused (like it was in our case) some blame should go to the design of that thing sometimes.
I also believe that it will be absolutely clear what Int means in Time Minute Int and it won't confuse anyone. Data.Ratio does the same trick. Programmers constantly pick types for numbers in general.
Also, std::chrono::duration from C++ is designed in the same way. I don't regard any part of std as an example of a perfect design, but std::chrono seems to be accepted by C++ community rather good.
Theoretical properties are nice, but practice is often not as good as pure theory. Ignoring practice makes this library much less valuable. Time is a complicated thing which often comes in various forms in practice. There should be a good way to deal with time in a good language like Haskell. You are just ignoring a big portion of use-cases right now.

vrom911 commented 6 years ago

I guess these changes are too radical to the base idea of o-clock package. Instead you can create the separate library with suitable tests (specific for laws for your data type) for your needs explicitly saying how that different from the concept of this library.

Parametrisation by the type of unTime will decrease value of type level rational multiplier. The trick with Rat makes sense only for rational, but for Int it's useless, it will lead to redundant work at runtime.

In Integer case you can use time-units library, or any other alternative, because as you see from the benchmarks for this kind of work o-clock is less performant.

You're highlighting that it could help in resolving performance issues, but the goal of our library is the correctness and type safety. Implementing your proposal will sacrifice correctness of laws which is not acceptable by design.

Also, I'm personally using o-clock in my projects and I can see that migration to the new design is going to be painful.

artemohanjanyan commented 6 years ago

@vrom911 I think you misunderstood me. I explicitly stated that performance is not an issue and made an emphasis on correctness and type-safety of my proposal.

chshersh commented 6 years ago

@artemohanjanyan Making type of unTime polymorphic has benefits only in terms of performance (in case you want better performance you can switch to another internal representation of Time). Otherwise there's no sense in it at all. This is the only benefit of this proposal. So if performance is not an issue for you, I don't see any reasons to implement this.

With RatioNat I can control nor performance neither precision.

Performance is not a goal of o-clock. If you want better performance — just use another library. Rational numbers has infinite precision. Just do rounding to any digit you want in your application. I don't see any problems.

It is painful to create any sensible implementation of serializing and deserializing for data with Time

I don't see any pain. Just use floor function if you want integer. Create newtype over Time MyTime.

which sounds like a lot of boilerplate which could be avoided.

All boilerplate is three lines. I don't think it's too much:

newtype MyTime = MyTime { unMyTime :: Time Minute }
instance ToJSON Duration where toJSON = toJSON . floor @_ @Natural . unMyTime
instance FromJSON Duration where parseJSON = withScientific "duration" (pure . MyTime . minute . realToFrac)

If you're not satisfied with this behavior of serialisation, just create:

newtype MyTime = MyTime Natural

It is arguable that the current design is not very intuitive.

This is subjective. For people who can spend 5 minutes of their invaluable time and read the README everything is clear. Or, in another words, it's arguable that Time Minute Int is more intuitive than Time Minute.

Theoretical properties are nice, but practice is often not as good as pure theory.

Yeah, but why sacrifice theory if you actually can make practical implementation of theory? 🤔

o-clock just doesn't fit in model you want. It's a package with another idea.

Once you have Time Minute Int you realize that there's no much sense for Minute to be type level rational number as @vrom911 said. You can't have a cake and eat it. So newtype Time (t :: Rat) and newtype Time (t :: Type) (a :: Type) are very different types and shouldn't be supported in one library. These are two different use cases. I don't say that your approach is bad (even if I don't like it, but I don't say that it's bad). But it's just doesn't fit to o-clock. It won't make o-clock strictly better.

vrom911 commented 6 years ago

I agree with @ChShersh here. I think that your solution is out of the picture of this library, so feel free to implement your own approach in separate library (if you'd like you can base on o-clock), but probably what you're suggesting doesn't fit in here and won't bring a lot of benefits if implemented the way it's done in o-clock from my point of view.

artemohanjanyan commented 6 years ago

@ChShersh

I mentioned performance only because you brought it up. You said it would make it worse, I said it wouldn't, case closed.

All boilerplate is three lines.

Your example is broken because parseJSON . toJSON is not id. You'll have to export MyTime without constructor. All the code complementary to MyTime will be the boilerplate I was talking about. It would make this library useless in that case (and many more cases).

Yeah, but why sacrifice theory if you actually can make practical implementation of theory?

I'm also not proposing to sacrifice theory at all, I'm proposing to have an option to drop it.

Theory is made up, and practice is what programming is about and what we have to deal with. It's nice when you can apply theory as-is. But often you can't do it.

Once you have Time Minute Int you realize that there's no much sense for Minute to be type level rational number as @vrom911 said

That's just another obstacle you are claiming to be hard to overcome. Type-level rationals make just as much sense with integers as with rationals. They represent the notion of a ratio which is inherent in time units no matter what precision or number type you want for the unTime.

vrom911 commented 6 years ago

It just seems like o-clock library is not the right choice for your use-case, nothing else. You're ignoring the advice given more than once (also discussed in Slack) that newtype MyTime = MyTime Natural is the better solution for you. Also changing the logic of the whole library is not the sane thing to do only if it's not suitable for a single use-case. Again, if you somehow need Rat on type level for your situation (:man_shrugging:) you can create your own Time type with two parameters for your issue.

I will repeat myself, but I don't see these changes to be useful. It is usually not the case when you need to use different kinds of time measures (like in Int and like it's in o-clock library) in one library. You're not going to use Time t Int and Time t Double (whatever you want) at the same time at least because you're not able to operate between them smoothly. So, it's one time choice. If you need Int case, then o-clock is not the best decision as other libraries can perform better in that. No need to shoot yourself in the leg sacrificing maintainability without any benefits.

This is my opinion. And I see @ChShersh thoughts on this. I would also like to hear @int-index opinion before we make final decision for o-clock library future.

int-index commented 6 years ago

I would also like to hear @int-index opinion before we make final decision for o-clock library future.

I think parametrization by internal representation is an extremely leaky abstraction. Right now the library provides the user with a type for Time measured in specified units. There's no precision loss, we have nice properties for it, etc. "Internal representation" is an engineering concept, not a semantic concept, so maybe it looks like a sensible thing to parametrize over in an engineering-focused language like C++, but in Haskell I tend to rely on semantics and laws to write my code.

Let's say we do parametrize by different representations, now we have deal with different instances:

Time t (Ratio Natural) — arbitrary precision, easy to reason about, we already know the laws.
Time t Rational — sort of like Time t (Ratio Natural) but can be negative?? Some of our operations assume non-negative time.
Time t Natural — no fractional part, completely different laws, different operations.
Time t Integer — sort of like Time t Natural but can be negative.
Time t Double or Time t Float — sort of like Time t Rational, except there is loss of precision. Need to specify when, where, how much, etc. Also can be negative...
Time t (Fixed e) — limited precision, each operation does rounding. What sort of rounding (ceiling, flooring, approximation)? What laws hold?
Time t Word or Time t Int — sort of like Time t Natural and Time t Integer but can overflow? What operations become unsafe?

Every possible internal representation has different implications, different laws, and most of available operations in o-clock do not apply to them out of the box.

I think parametrization by internal representation will be detrimental in multiple ways:

complexity of type signatures will increase
safety of operations becomes questionable
reasoning with laws becomes mostly impossible or infeasible

If we want to go into the direction of representing time with limited precision (e.g. only whole units) then we should avoid talking about "internal representations" and start talking about "precision".

artemohanjanyan commented 6 years ago

newtype MyTime = MyTime Natural

@vrom911 My specific use-case is the one I heired from you in yt-utilities. It would solve that specific bug (providing I get rid of all the dependencies on o-clock which I didn't put there, thanks for the advice again), but it has nothing to do with the leaked abstraction in this library which led to the bug in the first place. I think I made a clear point when I generalized the issue I see in the library. I'm not writing here so that I don't need to fix the bug I found. You are just ignoring all my reasoning about why the design of o-clock is not good.

If we want to go into the direction of representing time with limited precision (e.g. only whole units) then we should avoid talking about "internal representations" and start talking about "precision".

@int-index This is the first time I read something helpful I can agree with.

vrom911 commented 6 years ago

@artemohanjanyan you're again mixing up the bug in your project with the lack of flexibility in o-clock.

If you would like to look more professional I would advise you not to behave so disrespectful to people who have opinion that differs from yours and also not to point with finger on people in making something wrong as we are all grown ups and understand that it's not one person's decision to put or not put something in the library. Speaking about the bug you're blaming me in, it didn't make any harm as there were no such functional back then which is leading to bug behaviour now. I'm not saying that it's good decision I made. But the fact that o-clock doesn't fit into some situations doesn't mean that it should be radically changed to cover more use cases if it loses more than it gains.

but it has nothing to do with the leaked abstraction in this library

I guess it's not right to say about "leaked abstraction" in o-clock library since this library has very small amount of abstraction. On the contrary, adding more polymorphism could lead to introducing leaked abstraction.

You are just ignoring all my reasoning

I also made a clear point of what I'm thinking about what you're proposing. I tried to explain why I think that way. If you think that I didn't make reasonable answer you can ask me to elaborate more.

the design of o-clock is not good.

And also I don't think that any library would agree to change all that is laying in the library's core idea just because of the some unsuitable use-cases (which the library is not trying to hide but says explicitly that it's not the right library for your approach). If you are saying that o-clock is bad designed library (I assume that you read the docs, code or at least something) then I can't do anything with this fact. Unfortunately, the library can't make everyone happy (if you look at benchmarks you can see that we understand this fact completely).

The goal of o-clock is not to become the universal standard for time units but to implement exactly the idea with clear and straight laws and semantics described in the documentation of the package. Implementing your proposal conflicts with this goal.

I want to emphasise that everything above is my own opinion and doesn't necessary represent the thoughts of all the authors of o-clock.

artemohanjanyan commented 6 years ago

@vrom911 I brought it up only because you kept telling me that fixing the bug had something to do with my reasoning about o-clock in general. I also don't blame anyone and don't think that the fact that some mistake was made by somebody makes anyone a bad programmer or something like that. But I think that it's not fair to reply with advice about my specific use-case (your words: "You're ignoring the advice given more than once", "Also changing the logic of the whole library is not the sane thing to do only if it's not suitable for a single use-case") when I use my single use-case only as a premise to this argument and don't mention it as much as you do.

If you are saying that o-clock is bad designed library

I'm not saying that it is bad. o-clock is a nice library, but I believe that it has flaws. I also would like not to have my arguments about the design of o-clock and my proposal (I already know why you don't like it) mixed. The former is more important, I don't care as much for the latter.

int-index commented 6 years ago

This is the first time I read something helpful I can agree with.

Alright, so we agree that paramerization by internal representation isn't going to happen. In this case I'm closing the issue.

We can discuss parametrization by precision in another issue, but I would prefer if it had a concrete design to discuss (at least an idea of how to represent precision and how it would affect operations).