ERDDAP / erddap

ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source (Apache and Apache-like) Java Servlet from NOAA NMFS SWFSC Environmental Research Division (ERD).
Creative Commons Zero v1.0 Universal

Nanoseconds in relative time units causes error #216

Open benjwadams opened 21 hours ago

benjwadams commented 21 hours ago

Describe the bug

Converting relative time units expressed in nanoseconds raises an exception in ERDDAP's internal time-conversion routines.

To Reproduce

Steps to reproduce the behavior:

1. Add a dataset with the time units attribute set to something with nanoseconds, e.g. nanoseconds since 1970-01-01T00:00:00+00:00.
2. Add the dataset to datasets.xml to be inventoried by ERDDAP.
3. The dataset does not appear and a Java calendar exception is raised in the ERDDAP log file.

Excerpt from email correspondence where this issue was encountered:

Apparently ERDDAP doesn’t like something with a time related quantity:

```
While trying to load datasetID=dfo-hal1002-20020724T1839 (after 23 ms)
java.lang.RuntimeException: datasets.xml error on or before line #731958: ERROR in Calendar2.factorToGetSeconds: units="nanoseconds" is invalid.
 at gov.noaa.pfel.erddap.dataset.EDD.fromXml(EDD.java:486)
 at gov.noaa.pfel.erddap.LoadDatasets.run(LoadDatasets.java:364)
Caused by: java.lang.RuntimeException: ERROR in Calendar2.factorToGetSeconds: units="nanoseconds" is invalid.
 at com.cohort.util.Test.error(Test.java:43)
 at com.cohort.util.Calendar2.factorToGetSeconds(Calendar2.java:2725)
 at com.cohort.util.Calendar2.getTimeBaseAndFactor(Calendar2.java:2550)
 at com.cohort.util.Calendar2.getTimeBaseAndFactor(Calendar2.java:2518)
 at gov.noaa.pfel.erddap.variable.EDVTimeStamp.<init>(EDVTimeStamp.java:186)
 at gov.noaa.pfel.erddap.variable.EDVTime.<init>(EDVTime.java:27)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromFiles.<init>(EDDTableFromFiles.java:1778)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromNcFiles.<init>(EDDTableFromNcFiles.java:131)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromFiles.fromXml(EDDTableFromFiles.java:501)
 at gov.noaa.pfel.erddap.dataset.EDD.fromXml(EDD.java:472)
```

After looking at ncdump output, it appears that time variable units are expressed as “nanoseconds since 1970-01-01 00:00:00” or something very similar. I believe this is valid UDUnits and should be fine with CF as well, but ERDDAP doesn’t seem to like converting that unit with the module they have there. If the previous sentence is correct, I’d argue that this is a bug/deficiency of ERDDAP. If you’d like to get the files read in the meantime, I’d recommend converting over to at least seconds.

I have yet to encounter or test any datasets with even smaller time intervals such as picoseconds, but I imagine this issue may also exist with them.

Expected behavior

Relative time units expressed in nanoseconds ("nanoseconds since 1970-01-01T00:00:00+00:00") should convert properly, since this should be a valid UDUnits relative time.


ChrisJohnNOAA commented 20 hours ago

Milliseconds are the finest-grained time unit currently supported in ERDDAP. https://github.com/ERDDAP/erddap/blob/main/WEB-INF/classes/com/cohort/util/Calendar2.java#L5741-L5770

benjwadams commented 2 hours ago

This is surprising behavior because UDUNITS2 allows prefixes on units all the way down to yocto (10^-24): https://github.com/Unidata/UDUNITS-2/blob/c83da987387db1174cd2266b73dd5dd556f4476b/lib/udunits2-prefixes.xml#L74

While it's unlikely that the sort of data ERDDAP is usually serving would have this fine a time resolution, it is nonetheless, to the best of my knowledge, valid in UDUNITS2 to express something like yoctoseconds since 2024-01-01+00:00.

An acceptable enough fix would be to parse that unit and switch/case a power-of-ten multiplier to convert the relative time prefix back to milli(time interval) in the event that the Java calendar library threw an exception.
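A minimal sketch of that power-of-ten fallback, for illustration only. The class `SubMilliUnits` and its methods are hypothetical names, not part of ERDDAP's `Calendar2`; the sketch assumes the unit name is an SI-prefixed spelling of "second" (singular or plural) and that the count is held in a `double`.

```java
import java.util.Map;

/** Hypothetical helper (not ERDDAP code): scales SI-prefixed second counts to milliseconds. */
final class SubMilliUnits {
    // SI prefix -> how many of that unit fit in one millisecond.
    // Each divisor is an exactly representable double.
    private static final Map<String, Double> PER_MILLI = Map.of(
        "milli", 1.0, "micro", 1e3, "nano", 1e6, "pico", 1e9,
        "femto", 1e12, "atto", 1e15, "zepto", 1e18, "yocto", 1e21);

    /** Converts a count of the named unit (e.g. "nanoseconds") to milliseconds. */
    static double toMillis(String unit, double count) {
        String u = unit.trim().toLowerCase();
        if (u.endsWith("s")) {
            u = u.substring(0, u.length() - 1); // "nanoseconds" -> "nanosecond"
        }
        if (!u.endsWith("second")) {
            throw new IllegalArgumentException("not a second-based unit: " + unit);
        }
        String prefix = u.substring(0, u.length() - "second".length());
        if (prefix.isEmpty()) {
            return count * 1000.0; // plain seconds
        }
        Double divisor = PER_MILLI.get(prefix);
        if (divisor == null) {
            throw new IllegalArgumentException("unsupported prefix: " + prefix);
        }
        return count / divisor;
    }

    public static void main(String[] args) {
        // 1,500,000 ns expressed in milliseconds
        System.out.println(toMillis("nanoseconds", 1_500_000));
    }
}
```

Note that a `double` carries only 53 bits of mantissa, so a present-day epoch count in nanoseconds (on the order of 10^18) cannot be held exactly; a real implementation would need to decide whether that rounding is acceptable.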

ChrisJohnNOAA commented 1 hour ago

The ERDDAP documentation for handling time units is here: https://erddap.github.io/setupDatasetsXml.html#timeUnits

ERDDAP doesn't strictly follow UDUNITS2, and it doesn't use UDUNITS2 for time conversions. Adding additional time units (even just for time conversion) is possible, but it is more work than a power-of-ten conversion in one spot. Additionally, several places still use the java.util.Date library, which only has millisecond precision. To properly support nanoseconds we'd need to update that code to a different library or verify that it doesn't need the additional precision. Beyond nanosecond precision, the standard Java time libraries offer no support. Supporting only a power-of-ten conversion might be possible, but I wonder how beneficial that is, especially if users don't realize they are losing the precision they think they have.
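To make the precision concern concrete, here is a small sketch using only standard JDK classes (`java.time.Instant` and `java.util.Date`). The class `PrecisionDemo` and its method names are hypothetical and exist only for this illustration; it shows how much of a nanosecond timestamp is dropped once the value passes through `java.util.Date`.

```java
import java.time.Instant;
import java.util.Date;

/** Hypothetical demo (not ERDDAP code): precision lost when a timestamp passes through Date. */
final class PrecisionDemo {
    /** Nanoseconds since the Unix epoch -> Instant, keeping nanosecond precision. */
    static Instant fromEpochNanos(long nanos) {
        return Instant.ofEpochSecond(
            Math.floorDiv(nanos, 1_000_000_000L),
            Math.floorMod(nanos, 1_000_000_000L));
    }

    /** Number of nanoseconds discarded by a round trip through java.util.Date. */
    static int nanosLostViaDate(Instant t) {
        Date d = Date.from(t); // Date stores whole epoch milliseconds only
        return t.getNano() - d.toInstant().getNano();
    }

    public static void main(String[] args) {
        Instant t = fromEpochNanos(1_123_456_789L);
        System.out.println(t + " loses " + nanosLostViaDate(t) + " ns via Date");
    }
}
```

Any code path that still relies on `java.util.Date` would silently truncate in this way, which is the scenario where users might not realize the precision is gone.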