rickb777 / date

A Go package for working with dates
https://godoc.org/github.com/rickb777/date
BSD 3-Clause "New" or "Revised" License
136 stars 24 forks source link

period: P7D is normalised to P1W even when normalisation is not set #12

Closed khtufteland closed 1 year ago

khtufteland commented 4 years ago

I'm trying to parse the period P7D and I don't want any normalization.

p, _ := period.Parse("P7D", false)

However, when I print this I get P1W.

It seems like this happens due to the code in the String() function: https://github.com/rickb777/date/blob/d2bc4afd7cd27d07a25daa88f0ed92822a1ec19e/period/format.go#L107-L113

To me, this doesn't seem right. I believe that since normalize is deactivated should the result be P7D.

rickb777 commented 4 years ago

This is working as designed (and I hope as documented). It doesn't help you, but weeks are treated as 7-day multiples. The Period type does not store the number of weeks, only the number of days, so there is no way to avoid this apparent normalisation. (At least, not without a redesign; I have tried other approaches but the current one persists for now.)

smyrman commented 4 years ago

Well, as week is not stored, would it make more sense to output 7D instead of 1W and only accept weeks when parsing (not when printing)?

smyrman commented 4 years ago

I mean 8D is not transferred to 1W1D. Using only days in the output seams more consistent.

rickb777 commented 4 years ago

Fair point. It's an easy enough change and probably won't affect people adversely because both forms are equivalent.

rickb777 commented 4 years ago

The formatting code never outputs weeks unless there are no remaining days. So you won't (/shouldn't) ever see 1W1D or anything like it, but you would see 52W rather than 364D, for example.

rickb777 commented 4 years ago

Because weeks are handled synthetically in Period (there is a days field but not one for weeks), it might be argued that the 'correct' solution is to add a weeks field and let normalisation handle the situation as it does with all the other fields.

I haven't decided whether I think this is the right choice yet.

quenbyako commented 4 years ago

@rickb777 i agree with you, that 52W is more readable, but i think that code must be predicted, so when i going to parse 364D i expect 364D, and not anything else) and, you know, this equal to 12W7D, i expect to see 12W7D (under the hood it can be literaly everything honestly, i care only for benchmark maybe, and i'm even though not sure)

Also, thanks for this package, i don't know why it has so few stars, it's sooooo useful

smyrman commented 4 years ago

FYI, from Wikipedia, the ISO 8601 is documented as eihter containing weeks, or containing the other compoenents (years, months, days, hours, minutes, seconds):

P[n]Y[n]M[n]DT[n]H[n]M[n]S
P[n]W
P<date>T<time>

https://en.wikipedia.org/wiki/ISO_8601

In other words, mixing weeks with anything is disallowed.

The ISO standard is also stating this: https://www.freestandardsdownload.com/iso-8601-1-2019-pdf.html

The complete representation of the expression of a duration should be one of: a) ["P"][i]["Y"][i]["M"][i]["D"]["T"][i]["H"][i]["M"][i]["S"] b) ["P"][i]["W"]

rickb777 commented 3 years ago

I have been working on a solution; it's not yet ready. Sorry.

quenbyako commented 3 years ago

@rickb777 oh, you're alive! That's a good sign :laughing:

rickb777 commented 3 years ago

The branch https://github.com/rickb777/date/tree/direct-week-support carries the work done so far. There probably need to be API changes (implies v2 release).

  1. Parse() and MustParse() have been altered to give clearer normalisation choices. Whilst this is not strictly necessary, it reduces the number of data conversions that would potentially be needed.
  2. Days() and Weeks() now behave differently. This is a breaking change.
  3. New method DaysAndWeeks() is a non-breaking change but behaves like Days() used to.
  4. New function NewYMWD() is a non-breaking change.
  5. New function New2() is rather rubbish but needed to avoid changing New(). Comments are requested on this @quenbyako
  6. New internal weeks field implements full support for weeks.
  7. New internal denormal flag reduces the work needed if Normalise() is called repeatedly.
quenbyako commented 3 years ago

@rickb777 Thanks a lot! You rock!

So i looked up into New2 implementation and... Well... Am i able to make PR with refactoring this func? Also i don't agree with New2 naming, honestly. It could be NewV2 or, even better, NewWithWeeks which is, well, explicitly describing differents between these two funcs

quenbyako commented 3 years ago

@rickb777 want to ask, why all values in Period time multiplied by 10? Looks pretty strange honestly

Also, i'm not agree with denormal parameter name, cuz, i think that it could be named something like improper (like improper fractions)

smyrman commented 3 years ago

New method DaysAndWeeks() is a non-breaking change but behaves like Days() used to.

This is invalid according to the spec though... Both ISO and RFC. They only allow, as far as I can see, either weeks, or days, months, years etc.

UPDATE.. Or maybe it is valid according to the RFC. The spec is a bit unclear with their use of /:

https://tools.ietf.org/html/rfc3339

Durations:

   dur-second        = 1*DIGIT "S"
   dur-minute        = 1*DIGIT "M" [dur-second]
   dur-hour          = 1*DIGIT "H" [dur-minute]
   dur-time          = "T" (dur-hour / dur-minute / dur-second)
   dur-day           = 1*DIGIT "D"
   dur-week          = 1*DIGIT "W"
   dur-month         = 1*DIGIT "M" [dur-day]
   dur-year          = 1*DIGIT "Y" [dur-month]
   dur-date          = (dur-day / dur-month / dur-year) [dur-time]

   duration          = "P" (dur-date / dur-time / dur-week)

UPDATE 2: It's not valid according to the RPC spec neither... / means OR, [] means optional (suffix). but as one can se dur-month include an optional dur-day and the end, and dur-year include an optional dur-month at the end (which again include dur-day) etc...

Therefore there is no valid format that combines weeks with anything else. Which is why I would suggest supporting parse only.

smyrman commented 3 years ago

Anyway, if there is a use case for the mixed format with weeks, keeping it is probably fine :-)

smyrman commented 3 years ago

NewWithWeeks which is, well, explicitly describing differents

What about instead doing:

func New(year int, month time.Month, day int) Date
func NewW(w int) Period

Drop NewYMWD and New2 as they are initializers for an illegal format.

type PeriodOfWeeks int
func (p PeriodOfWeeks) Date() Date

PS! As far as we are concerned (me and @khtufteland), the only purpose of this issue is to prevent automatic normalization into weeks. Direct week support is not something we care about. We have switched to our own implementation now that fits our exact needs though.

As for everyone else proper direct week support would probably only allow weeks to be non-zero if you have zero date type and zero time components. E.g. imagine:

type Period{
   // Allow either date or weeks
   date Date
   weeks int16
   // Time component only allowed when weeks is 0.
   hours, minutes, seconds int16
}

@rickb777 want to ask, why all values in Period time multiplied by 10? Looks pretty strange honestly

I agree this looks very strange. Perhaps to support sub-second precision without a nsec parameter @rickb777?


Of topic, but have to mention it. IsPositive appear to ignore the case where the period IsZero. Would it not be easier to return !p.IsNegative()? Then it should also cover the IsZero case.

quenbyako commented 3 years ago

@smyrman: the only purpose of this issue is to prevent automatic normalization into weeks.

If so, why do you need so complicated PeriodOfWeeks type? This is worse than just renaming New to NewWithWeeks and set New as deprecated. I don't agree with this crutch, honestly

smyrman commented 3 years ago

If so, why do you need so complicated PeriodOfWeeks type?

Just saw there was a PeriodOfDays and copy-pasted the same logic into PeriodOfWeeks. It's not an important addition; just one that match the existing code for PeriodOfDays.

My only point is that New that accepts both weeks and months, days etc, is encouraging a combination of parameters that is not supported by the spec, which again can causes serialized formats not to work with other implementations (e.g. databases, parsers written in other languages etc). Which again increase bugs and incompatibilities in peoples programs.

Ideally, all output from period serialization, should be guaranteed valid according to RFC3999.

quenbyako commented 3 years ago

@smyrman main problem of this package is that it has ambiguous New method, which doesn't fit even in stdlib time package (time has Date constructor, which is more clear naming)

I think that we have single problem for now and it is a problem with weeks. This package working amazingly well on production already (our team use it), so i don't think that it doesn't fit in RFC. One year in a row and we don't have any problems with this problem.

smyrman commented 3 years ago

@quenbyako - I talking about the interface of the proposed New2 (not the name).

The RFC allows a format P1W, but does not allow P1WT1H for instance. Neither does it allow P1Y1W or P4W1D.

quenbyako commented 3 years ago

@smyrman i mean that there is no problem if the function takes in deliberately incorrect data if it formats according to the specification. what is the problem? i don't understand

smyrman commented 3 years ago

@smyrman i mean that there is no problem if the function takes in deliberately incorrect data if it formats according to the specification. what is the problem?

There is no problem with that. But that's the whole point, it's (the output format) is not valid according to the spec.

The way I undertand the RFC spec (which is very brief), we got three allowed formats for the period type.

 period = "P" (dur-date / dur-time / dur-week)

Which I read as one of:

  1. "P" + dur-date
  2. "P" + dur-time
  3. "P" + dur-week

Just using the number 1 in the example for simplicity here.

Format 1 can be expanded to combine any any of the params except week (in the expected order). E.g.

smyrman commented 3 years ago

But maybe I am missing something when I look at the direct week support branch?

rickb777 commented 3 years ago

I hope I can answer a few of the questions raised.

Firstly, RFC3339 only mentions durations (a.k.a periods, loosely) as a side-issue. The Wikipedia description is more detailed and much more useful. I don't have access to the ISO standard document.

The reason they start with P is due to the name 'period', which may be historical. However, I've used periods like this in other languages (mainly Java) and the term used tends to be 'period' rather than 'duration' (e.g. in JodaTime ), so I went with that too.

The format is not a union of seven fields of which only one is in use. Rather it is the concatenation of seven fields. It's legitimate to have values such as P3Y6M4DT12H30M5.2S.

The last non-zero field may have a decimal fraction. I don't know whether a limit on the number of decimal places is defined the ISO standard; I made the design decision that a single decimal place can be easily supported. This is the reason for the x10 internal representation, which gives fixed-point values down to 0.1 without needing to use floats. The advantage of this is that two Periods can be correctly compared using ==.

I have also experimented with different internal representations, for example having extra fields to hold the fraction, whilst keeping the main fields just integers. This might be more complex, though i might revisit it to see if there's a better fit.

The original (and still main branch) implementation did not have weeks. The reason was that time.Time also works without weeks, e.g. time.AddDate() and time.Clock() exist but there is no time.Week() method. It was a mistake to omit weeks from Period though.

rickb777 commented 3 years ago

I merged PR#14 yesterday but I've found that there are errors and test failures. I'll roll back parts of it fixing the tests and getting rid of the floats (I've taken quite a lot of care to avoid floating point arithmetic to avoid conversion errors and because the period fields are almost always integers).

rickb777 commented 2 years ago

Thank you everyone for your contribution.

Resolving this was proving more tricky than I expected, mostly due to backward-compatibility issues. I have decided to re-implement period rather than continue to patch it. The new period package will be https://github.com/rickb777/period. This will incorporate your remarks indirectly.

rickb777 commented 1 year ago

Closing this issue now that the new period module is available.