Open helgee opened 4 years ago
seconded
In the current implementation this issue occurs by design
@inline function tryparsenext(d::DatePart{'s'}, str, i, len)
val = tryparsenext_base10(str, i, len, min_width(d), max_width(d))
val === nothing && return nothing
ms0, ii = val
len = ii - i
# Here the parser bails when there are more than three decimal places
if len > 3
ms, r = divrem(ms0, Int64(10) ^ (len - 3))
r == 0 || throw(InexactError(:convert, Decimal3, ms0))
else
ms = ms0 * Int64(10) ^ (3 - len)
end
return ms, ii
end
IIUC the problem is that tryparsenext
cannot return multiple date parts. Is that correct?
If so, I see two possible ways to fix this:
μ
and n
. This would fix the problem in Base but not for packages that want to support even higher precision, e.g. AstroTime.jl.Any other ideas?
I have done some experimenting. The most straightforward solution is to use Int128s where we now use Int64s. The use of Int128s with arithmetic (including e.g. fldmod and muladd) seems to slow down the processing by 1.4x (*, muladd with fma) .. 2.5x (fldmod) .. 3.3x (divrem, rem, fld). I did not check generating the DateTime values.
Using a pair of Int64s does not gain much in throughput vs using Int128 and it adds a good deal of additional source code. Nonetheless, keeping DateTime as it is now for the more significant part and using another Int64 for sub-millisecond measure, we would support attosecond resolution.
And it would allow us the choice of either making DateTime and Time work with higher resolution units or keeping DateTime and Time as they are now while introducing, say, HiresDateTime and HiresTime. I prefer the former.
I have implemented a workaround for AstroTime.jl. See here: https://github.com/JuliaAstro/AstroTime.jl/pull/58/commits/7cf0ba45962e8c9921356d8149198fcb7e27eebe
It defines the following parser:
@inline function Dates.tryparsenext(d::Dates.DatePart{'f'}, str, i, len, locale)
next = Dates.tryparsenext_base10(str, i, len)
next === nothing && return nothing
val, idx = next
val /= 10^(idx - i)
return val, idx
end
Would you post a few examples of it doing what it does? (assume using Dates, AstroTime
has been done). Thank you.
Sure!
AstroTime with this patch
julia> using AstroTime; import Dates
julia> str = "2021-03-27T17:25:13.123456789"
"2021-03-27T17:25:13.123456789"
julia> df = Dates.dateformat"yyyy-mm-ddTHH:MM:SS.fffffffff"
dateformat"yyyy-mm-ddTHH:MM:SS.fffffffff"
julia> TAIEpoch(str, df)
2021-03-27T17:25:13.123 TAI
vs. without
julia> df1 = Dates.dateformat"yyyy-mm-ddTHH:MM:SS.sssssssss"
dateformat"yyyy-mm-ddTHH:MM:SS.sssssssss"
julia> TAIEpoch(str, df1)
ERROR: InexactError: convert(Dates.Decimal3, 123456789)
Stacktrace:
[1] tryparsenext
@ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Dates/src/io.jl:153 [inlined]
[2] tryparsenext
@ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Dates/src/io.jl:41 [inlined]
[3] macro expansion
@ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Dates/src/parse.jl:64 [inlined]
[4] tryparsenext_core(str::String, pos::Int64, len::Int64, df::Dates.DateFormat{Symbol("yyyy-mm-ddTHH:MM:SS.sssssssss"), Tuple{Dates.DatePart{'y'}, Dates.Delim{Char, 1}, Dates.DatePart{'m'}, Dates.Delim{Char, 1}, Dates.DatePart{'d'}, Dates.Delim{Char, 1}, Dates.DatePart{'H'}, Dates.Delim{Char, 1}, Dates.DatePart{'M'}, Dates.Delim{Char, 1}, Dates.DatePart{'S'}, Dates.Delim{Char, 1}, Dates.DatePart{'s'}}}, raise::Bool)
@ Dates /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Dates/src/parse.jl:38
[5] macro expansion
@ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Dates/src/parse.jl:150 [inlined]
[6] tryparsenext_internal
@ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Dates/src/parse.jl:125 [inlined]
[7] parse
@ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Dates/src/parse.jl:282 [inlined]
[8] (TAIEpoch{T} where T)(str::String, format::Dates.DateFormat{Symbol("yyyy-mm-ddTHH:MM:SS.sssssssss"), Tuple{Dates.DatePart{'y'}, Dates.Delim{Char, 1}, Dates.DatePart{'m'}, Dates.Delim{Char, 1}, Dates.DatePart{'d'}, Dates.Delim{Char, 1}, Dates.DatePart{'H'}, Dates.Delim{Char, 1}, Dates.DatePart{'M'}, Dates.Delim{Char, 1}, Dates.DatePart{'S'}, Dates.Delim{Char, 1}, Dates.DatePart{'s'}}})
@ AstroTime.Epochs ~/projects/julia/AstroTime/src/Epochs/dates.jl:68
[9] top-level scope
@ REPL[15]:1
step 1 nicely done do you want submillisecond resolution for TAI and other timebases?
do you want submillisecond resolution for TAI and other timebases?
Yes. Very important for deep space navigation: https://github.com/JuliaAstro/AstroTime.jl/issues/28#issue-326825392 And I do hope that somebody will use the library for that purpose eventually 😅
Epochs/DateTimes in AstroTime.jl are represented like this:
struct Epoch{S<:TimeScale, T} <: Dates.AbstractDateTime
scale::S
second::Int64
fraction::T
function Epoch{S}(second::Int64, fraction::T) where {S<:TimeScale, T}
return new{S, T}(S(), second, fraction)
end
end
ususally with T == Float64
. Thus, I want to be able to parse all decimal places of the current second into fraction
.
I hit this too, my use case is when I wrap some AWS services, they provide microsecond precision in their date time,
which I believe comes from Python because the default unit of Python datetime
is microsecond so does IBM Quantum REST service, and working
with them currently is not very pleasant since I need an extra step to remove the extra precision.
After some discussion with @ericphanson on slack, we think a fix to this could be parameterize the unit, and parse the string to highest available precision, e.g
"2021-01-28T00:21:11.849352123"
will be parsed to nanosecond precision
"2021-01-28T00:21:11.849352"
will be parse to microsecond precision,
"2021-01-28T00:21:11.849"
will be parsed to milisecond precision, in this way we could fix the problem without breaking existing code. But I think will need someone to take a second look on this, since I'm not faimilar with the implementation of Dates.
also, cc: @omus
I think parameterizing DateTime
with a limited set of precision types is reasonable. Adding a type parameter will definitely be a breaking change though.
To be clear, since Time
already supports nanoseconds, I think there's no API issue or worry of breaking changes for supporting parsing times with nanosecond precision. I think the issue around that is for DateTime
s (which only support microseconds), but that should probably be a separate issue, in my opinion.
Yes, that is a major issue as it is not just about parsing: one ought represent the result of parsing to precision parsed.
What is the status of this issue?
This issue has not advanced. If suitable, consider NanoDates.jl. That has worked for others with similar needs.
I originally added this as a comment to #29339 but it deserves its own issue.
The parser in
Dates
is limited to parsing milliseconds even thoughTime
and data types in other packages, e.g. AstroTime.jl, support higher precision.