Open CharlesBlais opened 4 years ago
On the CDF page you find the basic description of what CDF is able to do. One of its advantages is that it can handle leap seconds. For this it has introduced a datetime type called CDF_TIME_TT2000. Internally, timestamps are stored as nanoseconds since 2000 January 1, 12h Terrestrial Time (TT). TT can be converted to UTC by subtracting 32.184 s and deltaAT (TT = TAI + 32.184 s, TAI = UTC + deltaAT).
deltaAT is the sum of the leap seconds since 1960. To calculate this timestamp, the software needs to be aware of all the leap seconds up to the timestamp you want to create. For example, to create the date 2019/01/01 the software needs to be aware of all leap seconds before 2019/01/01. To check how the software does that, I looked more closely at two APIs: the CDF C library and the Python library CDFLib.
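The conversion can be sketched in a few lines of Python. This is only an illustration under the standard definitions TT = TAI + 32.184 s and TAI = UTC + deltaAT; the function names are hypothetical and not part of any CDF API, deltaAT has to be supplied by the caller, and a plain datetime cannot represent the leap second instant 23:59:60 itself.

```python
from datetime import datetime, timedelta

# TT2000 counts nanoseconds from J2000: 2000-01-01T12:00:00 on the TT scale.
J2000_TT = datetime(2000, 1, 1, 12, 0, 0)


def tt_offset(delta_at):
    """TT - UTC as a timedelta: deltaAT (TAI - UTC) plus the fixed 32.184 s."""
    return timedelta(seconds=delta_at, milliseconds=32184)


def utc_to_tt2000(utc, delta_at):
    """Hypothetical helper: naive UTC datetime -> TT2000 nanoseconds."""
    since_epoch = (utc + tt_offset(delta_at)) - J2000_TT
    whole_seconds = since_epoch.days * 86400 + since_epoch.seconds
    return whole_seconds * 10**9 + since_epoch.microseconds * 1000


def tt2000_to_utc(tt2000_ns, delta_at):
    """Hypothetical helper: TT2000 nanoseconds -> naive UTC datetime."""
    tt = J2000_TT + timedelta(microseconds=tt2000_ns // 1000)
    return tt - tt_offset(delta_at)


# TT2000 = 0 is the J2000 epoch; deltaAT was 32 s at that time.
print(tt2000_to_utc(0, 32).isoformat())  # 2000-01-01T11:58:55.816000
```

The important point is that both directions need the deltaAT that was valid on the date in question, which is why the software has to know the full leap second history.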
Both work with a table that you can find here https://cdf.gsfc.nasa.gov/html/CDFLeapSeconds.txt. It is basically a text file containing all leap seconds up to now. Naturally this file changes every time a new leap second is introduced, and that has to be handled with care: creating files with an out-of-date CDFLeapSeconds.txt causes timestamps to be shifted by one or more seconds.
So far I have tested the C library versions V3.6.1 and V3.7.1. I also tested the Python library CDFLib.
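To see which leap seconds a given copy of the table knows about, the file can be parsed directly. The sketch below assumes the layout I believe the file uses (comment lines starting with `;`, data lines of the form year, month, day, leap seconds, followed by two drift values); the helper names and the three sample rows are only for illustration, and the pre-1972 drift terms are ignored.

```python
from datetime import date


def parse_leap_table(text):
    """Return sorted (start_date, delta_at_seconds) tuples from the table text."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        fields = line.split()
        year, month, day = (int(field) for field in fields[:3])
        rows.append((date(year, month, day), float(fields[3])))
    return sorted(rows)


def delta_at_for(rows, day):
    """deltaAT in effect on `day`, ignoring the drift columns."""
    applicable = [delta for start, delta in rows if start <= day]
    if not applicable:
        raise ValueError(f"{day} is earlier than the first table entry")
    return applicable[-1]


# Assumed excerpt; a real table starts in 1960 and has many more rows.
SAMPLE = """\
; Year Month Day LeapSeconds Drift1 Drift2
 2012   7    1   35.0        0.0    0.0
 2015   7    1   36.0        0.0    0.0
 2017   1    1   37.0        0.0    0.0
"""

rows = parse_leap_table(SAMPLE)
print("last leap second in table:", rows[-1][0])
print("deltaAT on 2019-01-01:", delta_at_for(rows, date(2019, 1, 1)))
```

A table that stops at 2015-07-01 would return 36.0 for a 2019 date, which is exactly the one-second error discussed here.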
In the CDF package the CDFLeapSeconds.txt table is hardcoded.
If you want to externalise it (the preferred way to keep control of the file), you can however set an environment variable:
export CDF_LEAPSECONDSTABLE='./CDF/cdf36_1-dist/CDFLeapSeconds.txt'
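Because a mistyped path is easy to miss, a small sanity check before running anything can help. This is only a sketch: the function name is hypothetical, and it makes no claim about what the CDF library itself does when the variable points to a missing file.

```python
import os
from pathlib import Path


def external_leap_table(environ=None):
    """Return the path set in CDF_LEAPSECONDSTABLE, or None if it is unset.

    Raises FileNotFoundError when the variable is set but the file is missing.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("CDF_LEAPSECONDSTABLE", "").strip()
    if not value:
        return None
    path = Path(value).expanduser()
    if not path.is_file():
        raise FileNotFoundError(
            f"CDF_LEAPSECONDSTABLE points to a missing file: {path}"
        )
    return path


print("external leap second table:", external_leap_table())
```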
When you download a CDF file from the Intermagnet ftp and read it with CDF 3.6.1, it works without any problems, hinting that the file was created with a CDFLeapSeconds table that ends in 2015. This also means that the 2017 leap second wasn't taken into account during file creation, which causes problems with all files containing timestamps after 2017/1/1. Using CDF 3.6.1 and pointing the environment variable CDF_LEAPSECONDSTABLE to a more recent CDFLeapSeconds.txt results in
ERROR> TT2000_USED_OUTDATED_TABLE: A TT2000 data is either invalid
(made with an oudated leap second table) or trying to use an outdated leap second table.
(bdv_20190203_000000_pt1s_1.cdf)
After changing everything to version 3.7.1 (currently the most recent one), the error message disappears and you can read the file, but wrongly, because all timestamps are shifted by one second:
Record # 86398: 2018-06-23T23:59:56.000000000
Record # 86399: 2018-06-23T23:59:57.000000000
Record # 86400: 2018-06-23T23:59:58.000000000
instead of the correct reading
Record # 86398: 2018-06-23T23:59:57.000000000
Record # 86399: 2018-06-23T23:59:58.000000000
Record # 86400: 2018-06-23T23:59:59.000000000
The behaviour you see with 3.7.1 is the same if you use the lightweight Python library CDFLib 0.3.15 (which uses an externalised CDFLeapSeconds.txt): no warning, but wrong readings of what is essentially a wrongly created file.
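The one-second shift can be reproduced with plain arithmetic. The sketch below assumes the writer used a table ending at 2015-07-01 (deltaAT = 36 s) and the reader uses a table that includes the 2017-01-01 leap second (deltaAT = 37 s); the function names are hypothetical and the TT value is kept as a datetime rather than as nanoseconds for brevity.

```python
from datetime import datetime, timedelta


def utc_to_tt(utc, delta_at):
    """TT = UTC + deltaAT + 32.184 s."""
    return utc + timedelta(seconds=delta_at, milliseconds=32184)


def tt_to_utc(tt, delta_at):
    """UTC = TT - deltaAT - 32.184 s."""
    return tt - timedelta(seconds=delta_at, milliseconds=32184)


intended = datetime(2018, 6, 23, 23, 59, 57)

# Writer: outdated table, so a 2018 date still gets deltaAT = 36 s.
stored_tt = utc_to_tt(intended, 36)

# Reader: current table, so the same 2018 date gets deltaAT = 37 s.
read_back = tt_to_utc(stored_tt, 37)

print("intended :", intended.isoformat())   # 2018-06-23T23:59:57
print("read back:", read_back.isoformat())  # 2018-06-23T23:59:56
```

The stored value is internally consistent, so the reader has no way to notice the problem from the timestamps alone; it has to compare the table versions.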
So with these tests we can conclude that:
In the lightweight library there is a way to find out which version of CDFLeapSeconds.txt was used when the file was created. The method cdf_info() returns a dictionary with the basic CDF information. This information includes the date of the last leap second table update ('LeapSecondUpdated'):
import cdflib

file_name = "her_20190415_000000_pt1s_1.cdf"
cdf_file = cdflib.CDF('./files/' + file_name)
# first method: leap second table version stored in the file when it was created
print("LeapSecondsTable on file creation : " + str(cdf_file.cdf_info()['LeapSecondUpdated']))
# second method: prints the leap second table the API itself uses to interpret the file
print(cdflib.cdfepoch.getLeapSecondLastUpdated())
....
LeapSecondsTable on file creation : 20150701
Leap second last updated: 2017-1-1
These two methods can be used to raise warnings and detect errors with leap seconds.
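A check built on these two values could look roughly like the sketch below. It assumes the file's table version arrives as a YYYYMMDD integer (as in the output above) and that the leap second dates known to the reading software are passed in as plain dates; how those are obtained from the library is left to the caller, and all function names are hypothetical.

```python
from datetime import date


def stamp_to_date(stamp):
    """Turn a YYYYMMDD integer such as 20150701 into a date."""
    stamp = int(stamp)
    return date(stamp // 10000, stamp // 100 % 100, stamp % 100)


def missed_leap_seconds(file_table_stamp, known_leap_dates, last_record_day):
    """Leap seconds the writer did not know about but the data already covers.

    A non-empty result means timestamps on or after those dates are shifted.
    """
    file_table_day = stamp_to_date(file_table_stamp)
    return sorted(
        day for day in known_leap_dates
        if file_table_day < day <= last_record_day
    )


missed = missed_leap_seconds(
    20150701,                                  # table version stored in the file
    [date(2015, 7, 1), date(2017, 1, 1)],      # leap seconds known to the reader
    date(2019, 4, 15),                         # date of the last record in the file
)
if missed:
    print(f"warning: timestamps from {missed[0]} onwards are off by "
          f"{len(missed)} s or more")
```

A file whose data ends before the first missed leap second is unaffected, so the check only warns when the data actually spans it.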
Hi @stephanbracke and @SimonFlower, is the correction of the CDF leap seconds still a matter that needs attention?
I haven't looked at this for a while, but I think it's down to me to update the CDF code on our GIN to access the correct leap second table. Once I'd done this, we'd need to update all the CDF files in the archive at NRCan, so I'm thinking it may make sense not to do this work until the archive has moved to BGS. That would make considerably less work (not having to re-transfer the CDF data).
In software we could build in checks to verify if we have the latest one.
MagPy has included a check on whether the leap second table is up to date since version 0.9.3 (the Python 3 version, making use of CDFlib), following Stephan's suggestions.
Would everyone say this issue is resolved?
In MagPy it is integrated and solved, but for the moment I think files are still wrongly created on the ftp site (I checked for realtime data). But as Simon already stated in a previous post, he will do this when everything is moved to BGS.
I have not fixed this issue in the Edinburgh GIN software (my apologies), which is what creates a large amount of data on the NRCan website. So can we keep the issue open, please?
I have updated the Edinburgh GIN with a new CDF library that I think fixes the problem. Since all CDF files distributed to users through the web site are generated "on the fly" in response to user requests and are not stored or cached, I think this resolves the problem for the Intermagnet web site.
The Edinburgh GIN's ftp server also generates CDF files "on the fly". I've checked the software that creates these CDF files (a different piece of software to the web site) and again the problem is fixed. There is a problem with the ftp server downloading large files, which is a separate issue.
So I think this issue could be closed now.
Stephan Bracke identified a problem with leap seconds in the format. After discussion, it was identified that the problem lies with the leap second table text file and requires updating the NASA CDF library by recompiling it. Please contribute more details on the problem and how to resolve it.