ArgoCanada / argoFloats

Tools for analyzing collections of oceanographic Argo floats
https://argocanada.github.io/argoFloats/index.html

Should we interpret the `MTIME` field? #590

Closed richardsc closed 1 year ago

richardsc commented 1 year ago

I noticed when looking at a recently deployed float (ID=6990512) that it contains a field called MTIME, which based on the nc file corresponds to:

        double MTIME[N_LEVELS,N_PROF]
            long_name: Fractional day of the individual measurement relative to JULD of the station
            _FillValue: 999999
            units: days
            valid_min: -3
            valid_max: 3
            C_format: %.6f
            FORTRAN_format: F.6
            resolution: 9.99999997475243e-07

I presume that this field is also described in the relevant Argo manual (but I haven't looked). It certainly could be converted to a POSIX time (using the "JULD" of the station).

A reprex to get data from this float to examine is:

library(argoFloats)
i <- getIndex()
si <- subset(i, ID='6990512')
p <- getProfiles(si)
d <- readProfiles(p)

Note that because the float starts at the bottom the profile of MTIME goes negative:

image
richardsc commented 1 year ago

I think that the following should be all that is required to convert MTIME to POSIX, providing that startTime in the object corresponds to the "JULD" field mentioned in the attributes:

time <- ctd[['startTime']] + 86400 * ctd[['MTIME']]
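
As a minimal base-R sketch of that conversion (not oce code): the JULD value and the MTIME values below are made up for illustration, the 1950-01-01 reference date is the one used for JULD in Argo files, and 999999 is the `_FillValue` from the attributes listed above.

```r
# Reference date for JULD in Argo files.
t0 <- as.POSIXct("1950-01-01 00:00:00", tz = "UTC")

# Hypothetical station JULD, in days since t0 (noon on 2023-03-09).
juld <- 26730.5

# Hypothetical MTIME values, in days relative to JULD; 999999 is the fill value.
mtime <- c(-0.0625, -0.03125, 999999, 0)
mtime[mtime == 999999] <- NA # treat fill values as missing

startTime <- t0 + juld * 86400    # one time for the whole station
time <- startTime + 86400 * mtime # one time per measurement (NA where missing)
print(time)
```

Negative MTIME values simply yield times before the station time, which is what a profile that starts at the bottom would produce.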
dankelley commented 1 year ago

I suspect you're right, but it's odd how they say "Julian Day". There are loads of definitions for Julian Day, so it's tricky ... but the code https://github.com/dankelley/oce/blob/07bf03de57c68b51601c5bba1ca64be105859ce3/R/argo.R#L1417 indicates that the netcdf circumvents the problem of stating which Julian Day is being used!

I looked at the first file in your sequence and see as below

> a<-d[[1]]
> a[["time"]]
[1] "2023-03-09 22:47:40 UTC"
> a[["juld"]]
[1] 26730.95

That confuses me because

> 22+47/60+40/3600
[1] 22.79444

does not end in 0.95 as I would expect. Maybe I'm just confusing myself, but looking at it the other way, I get that the time should be 22:48:00, which is 20 seconds off from the stated value:

> 0.95*24
[1] 22.8
> 0.95*24 - 22
[1] 0.8
> (0.95*24 - 22)*60
[1] 48

So, 48:00 versus the 47:40 in the other field. Maybe 20 leap seconds explain this. At https://en.wikipedia.org/wiki/Leap_second they list leap seconds. There are more than 20 listed. But maybe there are only 20 seconds since Argo started its definition at some particular date.

This is all a bit rabbit-holey.

richardsc commented 1 year ago

The nc attributes say the Argo reference for JULD is 1950-01-01 00:00:00.

But I guess the question is how do we determine the startTime field from the nc file? (I could look at the code but am on the bus right now)

dankelley commented 1 year ago

Our comments are interlaced. In the comment just before yours, I showed how we compute the start time. (It is called data$time though, not metadata$startTime.)

richardsc commented 1 year ago
    res@data$time <- t0 + res@metadata$juld * 86400

Oh, I see that time is constructed from JULD. So I'm confused as to why they would be different …

dankelley commented 1 year ago

I'm planning to fiddle with this tomorrow. If you're at Dal on Wednesday we could talk about it.

dankelley commented 1 year ago

Mystery solved:

> options(digits=15)
> a@metadata$juld
[1] 26730.9497685185
> a@metadata$juld - 26730
[1] 0.949768518519704
> (a@metadata$juld - 26730) * 24
[1] 22.7944444444729
> ((a@metadata$juld - 26730) * 24 - 22)*60
[1] 47.6666666683741
> (((a@metadata$juld - 26730) * 24 - 22)*60 - 47)*60
[1] 40.0000001024455

Those steps say that the time of day is 22:47:40, which is what we got. So, no contradiction. I was just being stupid, forgetting that I have R set up to print only a certain number of digits.
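
The same check can be sketched in base R, without oce. This assumes the 1950-01-01 reference date for JULD and takes the JULD value from the 15-digit printout above; rounding to whole seconds is my choice here, made because the value was retyped from that printout.

```r
juld <- 26730.9497685185 # from the 15-digit printout above
t0 <- as.POSIXct("1950-01-01 00:00:00", tz = "UTC")

# What R shows at its default of 7 significant digits.
shown <- format(juld, digits = 7)

# Seconds into the day implied by the displayed value and by the full value.
secShown <- round((as.numeric(shown) %% 1) * 86400)
secFull <- round((juld %% 1) * 86400)

day <- t0 + floor(juld) * 86400
print(shown)                                          # "26730.95"
print(format(day + secShown, "%H:%M:%S", tz = "UTC")) # "22:48:00"
print(format(day + secFull, "%H:%M:%S", tz = "UTC"))  # "22:47:40"
print(secShown - secFull)                             # 20
```

So the 20 seconds come entirely from the rounding of the displayed number, not from leap seconds.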

@richardsc if you're going to be at Dal Wed or Fri (CDOGS day) we could chat more about whether to make oce

  1. return [["startTime"]] as the start time
  2. return [["time"]] as a vector of times

or perhaps just store time as a column. We would want to be consistent with CTD handling, I think. It's easier to hash such things out in person, as compared to comments on GH, I think/find.

dankelley commented 1 year ago

One thing to keep in mind -- there may be user code (possibly quite a lot of user code, since argo has been in oce for a long time) that assumes that [["time"]] will produce just one value.

I am going to write some code to see how many of the .nc files on my computer have MTIME defined. That would be useful.

I am leaning towards just storing @data$mtime as it is (fractional days), if the file contains MTIME. The conversion to get time-of-each-datum is easy and could be documented. That way, any existing code that assumes time is a single value will still work, but users who want more can get it (if it's in the file).

richardsc commented 1 year ago

I think I agree with you re: "time" vs "mtime". For the Argo folks "mtime" will be a known thing, which as you say is easy to handle.

One thought though, as an issue for oce, is that perhaps as.ctd() on an argo object should know how to interpret time (metadata) and mtime to be converted to the standard fields (for a ctd object) of startTime (metadata) and time (data).

dankelley commented 1 year ago

Oh, hang on. We already read MTIME, so the user can do as below. I don't think oce needs changing.

PS to @richardsc do you know why there are so many NA values? They outnumber true values 10:1 for this file.

library(oce)
f <- "R6990512_001.nc"
a <- read.argo(f)
sort(names(a@data))
time <- a[["time"]] + a[["MTIME"]]*86400
# NA values outnumber finite values 10:1 ... WHY?
ok <- is.finite(time)
print(table(ok))
png("01.png")
oce.plot.ts(time[ok], a[["pressure"]][ok], type="o", pch=20, cex=0.7)
dev.off() # close the graphics device so that 01.png is written

01

dankelley commented 1 year ago

I like your idea. (As often seems to be the case, our comments are interlacing.)

richardsc commented 1 year ago

Yes, we do read it -- that's why the issue title was "Should we interpret it"? 😄

I think the NAs are there because the float doesn't necessarily store the time corresponding to every sample (maybe to save space?). It's actually something I plan to look into (this is quite new firmware on a float type that DFO uses).

richardsc commented 1 year ago

On the interpretation note -- should it be at least lowercased? Or just left alone?

dankelley commented 1 year ago

I also thought about whether to lower-case it. We do that for "standard" things that are known in other objects (e.g. DOXY and similar things all get named oxygen). I want to look in the SBE docs to see if they talk about MTIME there. It rings a bell, but, for the life of me, I am drawing a blank on what the M might stand for.

dankelley commented 1 year ago

Hm. I don't see 'MTIME' (upper- or lower-case) in the SBE docs I have available. And that word (either case) does not appear in oce/R/*R, either. It seems to have a similar meaning to SBE timeJ (in attached, see p 170).

manual-Seassoft_DataProcessing_7.26.8.pdf

dankelley commented 1 year ago

I guess we could call it "mtime". Frankly, I'm divided on this. If the original name is retained (which it ought to be) then the user could do either of

a[["MTIME"]]
a[["mtime"]]

and get the same result.
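
To make the idea concrete, here is a minimal sketch of a case-insensitive lookup. The helper `getField()` and the sample list are hypothetical, for illustration only; this is not how the oce `[[` accessor is implemented.

```r
# Hypothetical helper: return the element of 'data' whose name matches
# 'name' when case is ignored, or NULL if there is no such element.
getField <- function(data, name) {
    i <- match(tolower(name), tolower(names(data)))
    if (is.na(i)) NULL else data[[i]]
}

# Made-up data, with the original upper-case name retained.
data <- list(pressure = c(10, 20, 30), MTIME = c(-0.02, NA, -0.01))
print(identical(getField(data, "MTIME"), getField(data, "mtime")))
```

One drawback of this approach is ambiguity if a file ever holds two names that differ only in case, in which case `match()` returns the first one.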

I am still blanking on what to call that thing. Maybe I need to look in argo docs now. I imagine "M" means something.

richardsc commented 1 year ago

I think the "M" is for "measurement".

See this from the Argo QC manual, that explains the sporadic time returns:

image

dankelley commented 1 year ago

I have about 2000 .nc files in my archive, and I read them all in to see whether they have MTIME defined. I found that 67 files had this, and 1863 did not. Below is a listing of the files that had it. The last 3 of these files are the ones from the initial comment in this issue.

This makes me think that MTIME is not "standard" enough to get a lower-case name. I could be convinced otherwise, though, if @richardsc has the opposite opinion. (No rush -- I think you'll be in meetings all day and I am having a sign-students-in-to-my-class day.)

```
                                         file hasMTIME
1476  /Users/kelley/data/argo/R2903766_026.nc     TRUE
1494  /Users/kelley/data/argo/R3901654_161.nc     TRUE
1495  /Users/kelley/data/argo/R3901654_162.nc     TRUE
1496  /Users/kelley/data/argo/R3901686_094.nc     TRUE
1497  /Users/kelley/data/argo/R3901686_095.nc     TRUE
1498  /Users/kelley/data/argo/R3901859_220.nc     TRUE
1499  /Users/kelley/data/argo/R3901859_221.nc     TRUE
1500  /Users/kelley/data/argo/R3901860_220.nc     TRUE
1501  /Users/kelley/data/argo/R3901860_221.nc     TRUE
1502  /Users/kelley/data/argo/R3901861_220.nc     TRUE
1503  /Users/kelley/data/argo/R3901861_221.nc     TRUE
1505  /Users/kelley/data/argo/R3902456_026.nc     TRUE
1506  /Users/kelley/data/argo/R3902457_026.nc     TRUE
1695  /Users/kelley/data/argo/R4902441_144.nc     TRUE
1696  /Users/kelley/data/argo/R4902441_145.nc     TRUE
1715  /Users/kelley/data/argo/R4902467_124.nc     TRUE
1717  /Users/kelley/data/argo/R4902470_124.nc     TRUE
1719  /Users/kelley/data/argo/R4902498_102.nc     TRUE
1720  /Users/kelley/data/argo/R4902501_070.nc     TRUE
1752  /Users/kelley/data/argo/R4902502_070.nc     TRUE
1754  /Users/kelley/data/argo/R4902503_070.nc     TRUE
1755  /Users/kelley/data/argo/R4902515_035.nc     TRUE
1756  /Users/kelley/data/argo/R4902515_036.nc     TRUE
1761  /Users/kelley/data/argo/R4902519_035.nc     TRUE
1773  /Users/kelley/data/argo/R4902523_070.nc     TRUE
1775  /Users/kelley/data/argo/R4902524_070.nc     TRUE
1777  /Users/kelley/data/argo/R4902534_113.nc     TRUE
1779  /Users/kelley/data/argo/R4902556_012.nc     TRUE
1780  /Users/kelley/data/argo/R4902573_027.nc     TRUE
1781  /Users/kelley/data/argo/R4902573_028.nc     TRUE
1782  /Users/kelley/data/argo/R4902575_015.nc     TRUE
1798  /Users/kelley/data/argo/R4902576_017.nc     TRUE
1800  /Users/kelley/data/argo/R4902577_017.nc     TRUE
1802  /Users/kelley/data/argo/R4902578_012.nc     TRUE
1803  /Users/kelley/data/argo/R4902579_012.nc     TRUE
1804  /Users/kelley/data/argo/R4902579_013.nc     TRUE
1806  /Users/kelley/data/argo/R4902590_012.nc     TRUE
1900  /Users/kelley/data/argo/R6901182_370.nc     TRUE
1901  /Users/kelley/data/argo/R6901182_371.nc     TRUE
1902  /Users/kelley/data/argo/R6901182_372.nc     TRUE
1903  /Users/kelley/data/argo/R6901182_373.nc     TRUE
1905  /Users/kelley/data/argo/R6902766_200.nc     TRUE
1906  /Users/kelley/data/argo/R6902771_205.nc     TRUE
1907  /Users/kelley/data/argo/R6902771_206.nc     TRUE
1908  /Users/kelley/data/argo/R6902843_129.nc     TRUE
1909  /Users/kelley/data/argo/R6902843_130.nc     TRUE
1910  /Users/kelley/data/argo/R6902914_131.nc     TRUE
1911  /Users/kelley/data/argo/R6902914_132.nc     TRUE
1912  /Users/kelley/data/argo/R6902915_131.nc     TRUE
1913  /Users/kelley/data/argo/R6902915_132.nc     TRUE
1914  /Users/kelley/data/argo/R6902916_131.nc     TRUE
1915  /Users/kelley/data/argo/R6902916_132.nc     TRUE
1916  /Users/kelley/data/argo/R6902958_161.nc     TRUE
1917  /Users/kelley/data/argo/R6902958_162.nc     TRUE
1918  /Users/kelley/data/argo/R6903111_007.nc     TRUE
1919  /Users/kelley/data/argo/R6903111_008.nc     TRUE
1920  /Users/kelley/data/argo/R6903112_007.nc     TRUE
1921  /Users/kelley/data/argo/R6903112_008.nc     TRUE
1922  /Users/kelley/data/argo/R6903135_026.nc     TRUE
1923  /Users/kelley/data/argo/R6903136_026.nc     TRUE
1924  /Users/kelley/data/argo/R6903137_026.nc     TRUE
1928  /Users/kelley/data/argo/R6990512_001.nc     TRUE
1929 /Users/kelley/data/argo/R6990512_001D.nc     TRUE
1930  /Users/kelley/data/argo/R6990512_002.nc     TRUE
```

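
For anyone wanting to repeat this kind of scan, here is a rough sketch. It is not the code that produced the listing above: the function names `listNetcdfVars()` and `scanForVariable()` are made up for illustration, the directory in the usage comment is a placeholder, and the ncdf4 package is assumed to be installed for real files.

```r
# List the names of the data variables in one NetCDF file (uses ncdf4).
listNetcdfVars <- function(file) {
    nc <- ncdf4::nc_open(file)
    on.exit(ncdf4::nc_close(nc)) # always close the file, even on error
    names(nc$var)
}

# For each file, report whether it defines the named variable. The
# 'listVars' argument can be swapped for another function, e.g. for testing.
scanForVariable <- function(files, varname = "MTIME", listVars = listNetcdfVars) {
    has <- vapply(files, function(f) varname %in% listVars(f),
        logical(1), USE.NAMES = FALSE)
    data.frame(file = files, hasMTIME = has)
}

# Usage, with a placeholder directory:
# files <- list.files("~/data/argo", pattern = "\\.nc$", full.names = TRUE)
# res <- scanForVariable(files)
# table(res$hasMTIME)
# res[res$hasMTIME, ]
```

Passing the variable lister as an argument keeps the scanning logic separate from the file reading, so the former can be checked without any NetCDF files at hand.
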
dankelley commented 1 year ago

I have found something in section 2.6.4 of Ref 1:

Screenshot 2023-03-21 at 8 02 37 AM

References

  1. Argo Data Management Team. “Argo User’s Manual V3.4,” January 20, 2021. https://archimer.ifremer.fr/doc/00187/29825/86414.pdf.
dankelley commented 1 year ago

I'm closing this because it relates to oce, not directly to argoFloats. See https://github.com/dankelley/oce/issues/2122 for a discussion of another matter relating to MTIME.

I prefer discussing on oce when it's an oce issue, because I get auto-notified about all new oce issues. Plus, I check oce quite a lot, and don't have a habit of looking for issues here on argoFloats.