Open vtjnash opened 5 years ago
I think this discussion on discourse is related.
@vtjnash although we have a Dates.Microsecond
, the DateTime
constructor doesn't use that as per the present implementation. We can obviously add the microsecond argument in the constructor, point is do we have a use case for the implementation?
with Int64
values and a proleptic Gregorian calendar
Using units of milliseconds covers 292_277_024 years.
Using units of microseconds covers 292_277 years.
Using units of nanoseconds covers 292 years.
@vtjnash although we have a
Dates.Microsecond
, theDateTime
constructor doesn't use that as per the present implementation. We can obviously add the microsecond argument in the constructor, point is do we have a use case for the implementation?
An upvote for this issue - I have a JSON API which I'd like Julia to talk to and parse into structs - I'm new to the language - and settled on JSON3, HTTP as the apparent libraries of choice.
All works startlingly well except with DateTime, which won't accept microseconds. As I have no control over the API, and all the parsing is handled in the libraries, I actually don't know what the solution to this problem is except to leave the dates as String in the struct. My google-fu has failed me on other solutions ... but this No DateTime solution is distasteful, and I fear the extra pain down the road - to the extent that I will consider reverting to scala for this use case.
For me, this use case for the constructor is for someone like me, trying Julia. I'm sure a reasonably competent person expert has a workaround to this "microsecond API JSON problem" - a stupid one I grant you as I really don't care about the 4-6th digits.
However, I can't find a workaround, and it is scary to me as a newcomer, that managing a fairly simple third party API appears to be a problem beyond my skills - this is normally a trivial thing in modern languages.
If I understand correctly, the addition of the extra constructor makes that whole scary problem never appear...
I'll also post an issue in JSON3, to see if there is a known workaround.
Restated - a microsecond constructors eases DateTime interop with other languages.
DateTime is limited to millisecond precision by design, but Time has nanosecond precision. It would be nice to be able use DateFormats with multiple s
codes to specify the desired sub-second resolution for Time objects, but no matter how many s
codes are given it seems to always behave as if three were given. This is a problem for both formatting and parsing:
julia> t=Time(1,2,3,456,789)
01:02:03.456789
julia> Dates.format(t, dateformat"HH:MM:SS.ssssss")
"01:02:03.456000"
julia> Time("01:02:03.456789", dateformat"HH:MM:SS.ssssss")
ERROR: InexactError: convert(Dates.Decimal3, 456789)
I also have a use case for this. We use nanosecond-precision timestamps for research into networked AV media. The data structures we use for debugging output microsecond precision times. It would be useful to parse these for analysis & visualisation in Julia.
I just spent quite a bit of time* trying to find out why my log timestamps are being rounded to the nearest ms. It turns out that the answer is Dates.format
. As @david-macmahon mentions above, it silently accepts "ssssss" but rounds the Time
value to ms before formatting it!
So, I whip up a microsecond formatting function, how hard can it be?
julia> formatus(t::Time) = string(Dates.format(t, "HH:MM:SS"), ".", lpad(microsecond(t), 6, "0"))
julia> formatus(Time(Nanosecond(1_000_000_000)))
"00:00:01.000000"
julia> formatus(Time(Nanosecond(1_000_123_000)))
"00:00:01.000123"
... so far so good, but then:
julia> formatus(Time(Nanosecond(1_123_123_000)))
"00:00:01.000123"
The milliseconds part disappeared!
The way Dates.Time
handles sub-second values turns out to be a bit surprising.
You can put ns in, and get ns back out again...
julia> Time(Nanosecond(5))
00:00:00.000000005
julia> nanosecond(Time(Nanosecond(5)))
5
Except that if you put in more than 999 ns you don't get all of them back...
julia> Time(Nanosecond(5005))
00:00:00.000005005
julia> nanosecond(Time(Nanosecond(5005)))
5
With μs you're not allowed to use values over 999.
julia> Time(Microsecond(5))
00:00:00.000005
julia> Time(Microsecond(5005))
ERROR: ArgumentError: Microsecond: 5005 out of range (0:999)
Is Dates.Time
designed this way on purpose?
nanosecond(Time(Nanosecond(5005))) == 5
seems like a bug, but this test case makes it look like that is intended:
dt = Dates.DateTime(2000, 1, 1, 23, 59, 59, 50)
t = Dates.Time(dt)
@test Dates.hour(t) == 23
@test Dates.minute(t) == 59
@test Dates.second(t) == 59
@test Dates.millisecond(t) == 50
@test Dates.microsecond(t) == 0
@test Dates.nanosecond(t) == 0
Is there a use-case where this is desirable? I get that DateTime
only has ms resolution, but 50ms should be ===
50_000 μs irrespective of the underlying resolution.
It would seem less surprising to have:
@test Dates.microsecond(t) == 50_000
@test Dates.nanosecond(t) == 50_000_000
*(My log messages originate on an 8-bit micro, measuring a physical system with ~ μs resolution. They are passed to another 8-bit micro over SPI, then through a serial port to a raspberry Pi running Julia, then through WiFi to Julia on a mac, then through an RPC link to a log monitoring GUI, I checked everywhere else before suspecting that Dates.format might be broken)
Dates
is carefully written (it is not broken). For more specifics on periods, please see the docs for Dates. That said, it is well known that providing direct support for more complete subsecond handling would be good. Dates (including Times and DateTimes) operate with 64 bit underlying values. That precludes some of the widest, broadest lossless interconversions. And yes, in some ways Nanosecond
s are special.
The documentation for microsecond
and nanosecond
says just:
The microsecond of a
Time
as anInt64
. The nanosecond of aTime
as anInt64
.
There is no mention that the return values are modulo 1000 and modulo 1000000.
Nor is there any discussion in the documentation about why it might be useful for microsecond(t)
to return microseconds modulo 1000.
Time
type is not mentioned at all in the main body of the documentation.
It only appears in the "API Reference" section, where all it says is:
Time
wraps aNanosecond
and represents a specific moment in a 24-hour day.
To me, given that documentation, this seems broken:
nanosecond(Time(Nanosecond(5005))) == 5
@notinaboat I have previously had the same problem. The workaround in your case is:
julia> using Printf: @sprintf
julia> formatus(t::Time) = @sprintf("%s.%03d%03d", Dates.format(t, "HH:MM:SS"), millisecond(t), microsecond(t))
formatus (generic function with 1 method)
julia> formatus(Time(Nanosecond(1_234_567_890)))
"00:00:01.234567"
This should probably be in a new issue, but…
To solve the documentation issue of what millisecond
et al. return without breaking the current API, I suggest the following wording for all the millisecond
, microsecond
and nanosecond
docstrings, using the example of millisecond
:
millisecond(t::Time) -> Int64
The millisecond part of a `Time` as an `Int64`, lying in the range 0 to 999 ms.
# Example
```jldoctest
julia> t = Time(12, 34, 56, 123, 456, 789)
12:34:56.123456789
julia> millisecond(t)
123
```[ignore this—extra stuff to allow for inclusion of the backticks]
See [`Time`](@ref) for more on the parts making up `Time`.
For Time
, I suggest removing the mention of it being backed by Nanosecond
s and instead say:
Time
`Time` represents a specific moment in a 24-hour day. A `Time` type consists
of parts representing hours (`h`, up to 23), minutes (`mi`, up to 59)`,
seconds (`s`, up to 59), milliseconds (`ms`, up to 999), microseconds (`us`, up to 999)
and nanoseconds (`ns`, up to 999).
Or much of the above could instead go into the second docstring shown, which currently is:
Time(h, [mi, s, ms, us, ns]) -> Time
Construct a `Time` by parts. Arguments must be convertible to `Int64`.
Happy to make a PR if this sounds sensible.
None of the above actually addresses the issue at hand (that DateFormat
doesn't cope with the resolution supported by Time
s).
The implementation issue which is confusing is that Time
s are just wrappers around a Nanosecond
, and are most basically constructed as you did with e.g. Time(Nanosecond(1_000_000_000))
. But this means there is no check for the value being over 999 as for Millisecond
and Microsecond
. (Specifically, Dates.validargs(Time, ...)
is never called.)
Perhaps Nanosecond <: TimePeriod
should be distinct from a NanosecondInstant <: Instant
used for Time
?
[Edited to refer correctly to TimePeriod
rather than Period
which is for dates.]
@anowacki helpful workaround.
regarding _this issue: DateFormat doesn't handle the full resolution of Times --
The Period used to wrap Time
onto an Int64 value is Nanosecond
while
the Period used to wrap DateTime
onto an Int64 value is Millisecond
;
so some fine grained resolution becomes lossy when mixing the two.
just for fun
nbits(x) = iszero(x) ? 0 : ceil(Int, log2(1+abs(x)));
nbits(x::Dates.AbstractTime) = nbits(value(x));
value(x::Time) = x.instant.value;
value(x::DateTime) = x.instant.periods.value;
nanos_per_millisec= 1_000_000;
nbits(typemax(DateTime)), nbits(typemax(Time)), nbits(nanos_per_millisec)
# 62, 47, 20
Nanosecond does not limit values to 999 because the finest resolved unit of time (the smallest Period) must be available as an Int64 valued destination to keep internal Time
related functions very fast.
Nanosecond does not limit values to 999 because the finest resolved unit of time (the smallest Period) must be available as an Int64 valued destination to keep internal Time related functions very fast.
Indeed—the issue is not with the constructors of any of the TimePeriod
s, all of which accept an arbitrary Int64
quite rightly.
Instead, the issue is because of the Time
constructors, two of which are important here:
Time(::Nanosecond)
: the 'internal' constructor which creates a Time
based on some number of nanoseconds since midnight; andTime(::TimePeriod, ::TimePeriod...)
: the public-facing constructor which creates a Time
from any combination of Second
s, Minute
s, etc.When you do Time(Millisecond(999))
and get a Time(0, 0, 0, 999)
, you are calling the second of the two above constructors (the API one). This in turn calls Dates.validargs
to check that all of the individual TimePeriod
s supplied are in the correct range. (You could well argue that this is unnecessary—why not allow a Time
adjusted from midnight by 1000 ms rather than only 999 ms? After all, you can do Time(Millisecond(999), Millisecond(1))
!) Arguably multiple arguments of the same type should not be allowed…
However, when you do Time(Nanosecond(1000))
, the first (internal) method is called, and Dates.validargs
elided. This seems unintentional to me—presumably the second method above was intended to catch any combination of TimePeriod
s.
Perhaps a resolution to all of this is something like:
struct TimeInstant{P<:TimePeriod} <: Instant
periods::P
end
struct Time <: TimeType
instant::Nanosecond
Time(instant::TimeInstant{Nanosecond}) = new(mod(instant.periods, 86400000000000))
end
and then call the inner constructor by wrapping Nanosecond
in TimeInstant
?
@anowacki thanks for posting the sprintf
example.
@JeffreySarnoff I see that Date
and Time
are stored with different resolution, but none of my issues or concerns have anything to do with Date
. My use case involves just Time
. Time
has ns resolution, which is enough (μs would also be fine in my case). The issue is with the interface.
Its seems there are two fundamental problems here.
First, implementation detail is leaking out and polluting the generic interface as @anowacki describes above.
Second, the current interface implies the unusual idea that ms, μs and ns should be treated as seperate "columns" (i.e. that each of these values is limited to three decimal digits).
It is natural to think of hours, minutes and seconds as seperate "columns", but every other time interface that I'm familiar with treats fractional seconds simply as fractional seconds. ISO 8601 allows seconds.xxxx...
with as much precision as you like. Unix has struct timeval
with a tv_usec
field. Go, rust and swift, all return nanoseconds when you ask for nanoseconds. Rust allows you to specify wether you want the whole time value in ns or just the sub-second fraction. This seems nice.
The decision to return nanoseconds modulo 1000 by default seems very unusual. I can't think of any application where it is more convenient to have fractional seconds divided into seperate 3-digit columns by default.
As it stands if I want to get the fractional seconds part of a Julia Time
I have to type 1000000 * millisecond(t) + 1000 * microsecond(t) + nanosecond(t)
.
e.g.
go has Microseconds
and Nanoseconds
accessors that do the expected thing:
t := time.Duration(10_123_456_789)
fmt.Println(t.Microseconds())
10123456
fmt.Println(t.Nanoseconds())
10123456789
Rust lets you ask for as_micros
or subsec_micros
:
let t = Duration::from_nanos(10_123_456_789);
println!("{:?}", t);
10.123456789s
println!("{:?}", t.as_micros());
10123456
println!("{:?}", t.as_nanos());
10123456789
println!("{:?}", t.subsec_micros());
123456
println!("{:?}", t.subsec_nanos());
123456789
Swift has only one sub-second accessor: nanosecond
.
var t = DateComponents.init(nanosecond:10_123_456_789)
print("\(t.nanosecond!)")
10123456789
@notinaboat Thank you for the clarification and specificity of your notes. Although I have not thought about them for a year or two, I am aware of these issues and the preference for uniformity.
It is possible to experiment with a modified Dates as an external package [say, AltDates.jl] that begins as a duplicate of Dates from https://github.com/JuliaLang/julia/tree/master/stdlib/Dates. For that, a deep dive into Dates is required -- many of the unobvious decisions are guided by the need for unflappable performance, others by designing in calculations that are free from error, and there are aspects of the design that facilitate higher level operations.
Experimenting with the least change that provides the most satisfaction is recommended, and benchmarking the use of Dates and AltDates with their tests and with e.g. DataFrames and TimeSeries matters.
@notinaboat I agree that the accessor function name is somewhat misleading. Maybe we could add functions to obtain the value of a Period
as whatever unit is desired, but I don't know how to avoid ambiguities with the current names.
What you want is done with Dates.value. Note that the unit of the integer you obtain depends on the type of the Period
. For a Time
type, you get back the nanoseconds:
julia> Time(13,11,35,621,355,112) |> nanosecond # get out only the nanosecond part
112
julia> Time(13,11,35,621,355,112) |> Dates.value # get the raw integer, here in nanoseconds
47495621355112
That's interesting @jebej.
Dates.value is documented as Dates.value(x::Period) -> Int64
,
but Time <: Period
is false
.
In my own code, I've decided to use Float64 seconds everywhere instead of Time
and only convert to Time
just before Dates.format
. The Dates interface as it stands clearly has both internal inconsistencies and surprising inconsistencies with other similar interfaces. That makes me think it is likely to change in Julia 2.0 (hopefully?), so for now I'm going to avoid relying on its quirks as much as possible.
@JeffreySarnoff, with regard to alternative time packages, I've found one that looks serious https://github.com/FugroRoames/Chrono.jl. However I don't have time available to work on time packages.
I am spending a little time posting here in the hope that whoever next works on Dates
has some documentation of problems experienced by a real user.
FWIW, a few suggested changes focused on the principle of least surprise...
Document value(::Time)
.
Deprecate nanosecond
, and microsecond
, are they even used anywhere? (could keep millisecond
, it's behaviour is unsurprising).
Depreciate the Time(h, [mi, s, ms, us, ns])
constructor.
Time
is defined as "Time wraps a Nanosecond", but Time(7)
results in 7 hours, this is surprising.
Depreciate the Time(period::TimePeriod...)
constructor.
It is inconsistent with the default Time(::Nanosecond)
constructor.
Add a kw args constructor Time(;hour=0, minute=0, second=0, nanosecond=0)
hour=7
, or hour=Hour(7)
etc.millisecond=
and microsecond=
options, then accept values [0,999] and [0,999999] respectively and throw an ArgumentError if more than one sub-second argument is specified).Make Time(Nanosecond)
and Nanosecond(Time)
consistent. e.g.
# Current behaviour...
julia> Nanosecond(Time(Nanosecond(1000)))
0 nanoseconds
julia> Dates.Nanosecond(t::Time) = t.instant
julia> Nanosecond(Time(Nanosecond(1000))) 1000 nanoseconds
julia> Nanosecond(Time(Nanosecond(10_123_456_789))) 10123456789 nanoseconds
- After deprecations have been in place for long enough, add:
- `Time(x::TimePeriod) = Time(Nanosecond(tons(x)))`
- `nanosecond(t::Time) = value(t) % Int(1e9)`
- `Time(ns::Integer) = Time(Nanosecond(ns))`
- Add `Time(seconds::AbstractFloat) = Time(Nanosecond(round(Int64, 1e9 * seconds)))`
I was not suggesting that you rework Dates -- my notes are for others who will find this at some later date. For all things carefully timed (overkill for your use) https://github.com/JuliaAstro/AstroTime.jl [nonetheless if you need to check results, this is reliable]. It is well understood by the community of people who write time libraries that using Floats is not safe and may introduce subtle, and difficult to track down (or to recognize) errors. That said, nobody understands your uses better than you.
Here is a thread from the early days of Julia https://julia-dev.narkive.com/c21H1xHJ/the-path-to-a-time-type and another discussing higher resolution for times https://discourse.julialang.org/t/higher-resolution-datetime-timestamp/3069/28
While trying to parse our AV logs (https://ci.appveyor.com/api/projects/JuliaLang/julia/history?recordsNumber=1&startBuildId=20355614), I realized that we apparently can't handle times more precise than milliseconds (even though we have Dates.Microsecond):