Closed visr closed 2 years ago
I am not near a computer this week to check, but I am wondering what python's cftime does in this case. I thought that the proleptic Gregorian calendar would apply the regular rules of our current calendar befor 1582. It is not clear to me why it should be have irregularities of the year 300.
Indeed it's not yet clear to me what's wrong with the year 300 specifically. @hboisgon encountered this issue, and this was the only date in a 10_000 year series that had this issue.
I tried python's cftime, there the issue is not present. It refuses to create DateTimeProlepticGregorian(300, 2, 29)
, and DateTimeProlepticGregorian(300, 3, 1)
is indeed March 1.
The issue seems to come from the datenum_ac
function, which returns the same value for Feb 29 and March 1 for year 300 (which gets scaled to 1900), and only happens for gregorian=true
, in this code. But I cannot really pin it down.
Here are some more years with this problem:
filter(y -> Dates.month(DateTimeProlepticGregorian(y, 3, 1)) !== 3,-2500:2500)
output:
7-element Vector{Int64}:
-2101
-1701
-1301
-901
-501
-101
300
I went back the to original publication "Meeus, Jean (1998) Astronomical Algorithms (2nd Edition). Willmann-Bell,
Virginia. p. 63" and it claims that the argument (Z - 1867_216.25)/36524.25
is always positive:
However, for DateTimeProlepticGregorian(300, 3, 1)
it becomes exactly -1.
In the past python's CFTime used Meeus algorithm. But they use now a much simpler and clearer alorithm.
As a test, it ported to julia. Unfortunately, it is also much slower (but it solves the issue mentioned here):
julia> @btime CFTime.datetuple_prolepticgregorian(60000) # based on python's CFTime
6.017 μs (0 allocations: 0 bytes)
(2023, 2, 25)
julia> @btime CFTime.datetuple_prolepticgregorian_old(60000) # Meeus algorithm
0.028 ns (0 allocations: 0 bytes)
(2023, 2, 25)
We might use different algorithms for different date ranges.
Thanks for diving into this! I just tried the same benchmark, on the Julia 1.8 beta release, and the current CFTime master, with 1 small change.
@btime CFTime._Reference.datetuple_prolepticgregorian($(60000))
# 4.557 μs (0 allocations: 0 bytes)
# (2023, 2, 25)
@btime CFTime._Meeus.datetuple_prolepticgregorian($(60000))
# 12.312 ns (0 allocations: 0 bytes)
# (2023, 2, 25)
Still the Meeus algorithm is much faster, but instead of 215_000x faster, it is 370 x faster for me. The Meeus algorithm is slower in my benchmark due to interpolating 60000 to avoid compiling the constant in. And profiling the Reference algorithm I saw that passing this closure around caused dynamic dispatch, leading to allocations. So I removed that argument and called the right method directly in month_lengths
. Though this issue probably didn't exist during your benchmarks since you have no allocations, and your timing was similar.
Do you know applications where 5 μs per date would be an issue? If not, to keep things simple I'd probably just stick with the Reference algorithm.
Thanks a lot for fixing the benchmark runtime issue!
Yes, having two algorithms is not ideal. But the Meeus algorithm should be fixed now with this change:
I tested for all dates between 1000 BC to 4000 AC, and I got now the sames dates for the reference algorithm (updated to avoid allocations) and the Meeus algorithm:
include("/path/to/test/test_time.jl");
Z = CFTime.datenum_prolepticgregorian(-1000,1,1):CFTime.datenum_prolepticgregorian(4000,1,1) # daily
MYMD = CFTime.datetuple_prolepticgregorian.(Z); # Meeus algorithm
# 0.152009 seconds (137.15 k allocations: 49.326 MiB, 8.65% gc time, 57.40% compilation time)
RYMD = Reference.datetuple_prolepticgregorian.(Z); # reference algorithm in test/
# 103.056894 seconds (188.46 k allocations: 51.930 MiB, 0.01% gc time, 0.11% compilation time)
@test MYMD == RYMD # no error
On very long time series (or very high-resolution time series), the speed of the reference algorithm could become an issue, I guess (at least I wished the reference algorithm were faster during testing 😄) . Also the runtime of the reference algorithm increase for dates future away from the start time.
In master, I updated the Meeus algorithm so that it relies entirely on integer arithmetic which makes its accuracy more predicable (overflows could still be a problem). The core-algorithms also now works with Int128
and BigInt
as input.
The reference algorithm in the now used only for testing (so that there is no extract compile time for loading for typical users).
Great that you managed to correct the issue inside the Meeus algorithm! That is indeed the best solution. I'll close the issue since it is now fixed. If you could tag a new release that would be much appreciated.
Describe the bug
The year 300, which is not a leap year, has some odd behavior in the proleptic Gregorian calendar. Instead of going from February 28 to March 1, it goes from February 28 to February 29 to March 2, so March 1 is missing and replaced by a non existing leap day.
To Reproduce
Expected behavior
I expect
DateTimeProlepticGregorian(300, 2, 29)
to error, or if desired create March 1.Environment
CFTime v0.1.1