Open ariebovenberg opened 2 months ago
While I have not encountered the usecases where a delta needs to be the component distinguishing whether it's months or years (this seems to me like something that is defined by the context, not the data object), I am not sure that optimizing for memory at this level and this point is really necessary.
That said (devil's advocating here a little bit), it looks like Python's timedelta
class does normalize. Is this usecase useful enough to deviate from what is "accepted"?:
>>> timedelta(hours=24)
datetime.timedelta(days=1)
>>> timedelta(hours=25)
datetime.timedelta(days=1, seconds=3600)
I would like to chime in: I also think that normalised deltas are expected/assumed coming from the stdlib.
Having different behaviours for e.g. TimeDelta
(normalised) and DateDelta
(denormalised) is surprising, at least. (And also means DateTimeDelta
has a bit of an identity crisis serving both styles of API.)
Maybe maintaining the denormalised form is out-of-scope for this component? (And could be part of a different module, if necessary?) My uninformed point-of-view is that this library is an improvement over the behaviour of the stdlib when it comes to date and time related structures. Maintaining and exposing a more "advanced" delta API seems like a different thing altogether. That being said, my main gripe here is the departure from the normalised API in the stdlib.
Rough suggestions to move forward:
datetime.timedelta
), the smaller delta components roll over to the big ones.datetime.timedelta
), but normalised internally and exposed with a new property/method (e.g. DateDelta.canonical_format()
will not be normalised unless called after explicit normalisation DateDelta.normalize().canonical_format()
).(If that helps, I'd be happy to try my hand with a few test cases to cover an option I suggested.)
@bibz thanks for taking the time to chime in. Agree with many of your points. I'm AFK at the moment so will go more in depth in a week or so
@bibz Agree that it's good to default to 'expected behavior' (i.e. normalization) for users coming from the standard library. However: this only makes sense so far it's possible. The issue here is that while hours/minutes/seconds/milliseconds/etc can be normalized, calendar days and months cannot.
To summarize: you always end up with at least 3 "buckets" that can be normalized internally, but not between them:
I'm leaning towards minimizing the 'number of buckets'. The 0.5 release DateTimeDelta
currently has 5 buckets: years, months, weeks, days, and hours/minutes/seconds/etc.
Note that pendulum attempts to put everything into one bucket (using general rules like month=30 days), but that results in strange inconsistencies (see their issue tracker for numerous examples)
The issue here is that while hours/minutes/seconds/milliseconds/etc can be normalized, calendar days and months cannot.
To summarize: you always end up with at least 3 "buckets" that can be normalized internally, but not between them
Yes, this makes complete sense. Thank you for the detailed explanation. I agree, minimising the number of buckets while maintaining (strict) equality sounds great.
Note that pendulum attempts to put everything into one bucket (using general rules like month=30 days), but that results in strange inconsistencies (see their issue tracker for numerous examples)
That sounds like a very bad decision for anything in production :scream:
There are two design questions for
DateDelta
:DateDelta
stores years, months, weeks, and days separately. However, in the Gregorian calandar, years are always 12 months, and weeks are always 7 days. It's save on storage to just storemonths
anddays
. The downside is that users may want to deliberately store unnormalized deltas, such that "24 months" isn't the same as "2 years".DateDelta
components may have different signs (P4Y-3M+1W0D
). While this is certainly flexible, it's unclear whether this functionality is actually used. Having one single sign would simplify things—although it may result in exceptions when doing arithmetic (whetheryears(a) + months(b) - years(c)
would raise an exception depending on whether the result would have mixed signs).