Closed EOGrady21 closed 6 years ago
Can you email me an ODF file that causes a problem? I can look this afternoon.
Thanks for the file. Q: do you get the full msg as below? (Filename deleted for privacy.) I want to know whether this second problem is something to do with my setup.
> library(oce)
Loading required package: gsw
Loading required package: testthat
> d <- read.oce("REDACTED.ODF")
Warning messages:
1: In ODFNames2oceNames(ODFnames, ODFunits, PARAMETER_HEADER = NULL, :
unable to interpret ODFunits[7]='GMT', for item named 'time'
2: In read.odf(file, debug = debug) : using first of 2 unique NULL_VALUEs
I could be wrong, but I think the second item (about the 2 unique NULLs) is what prompted @Echisholm21 to open the issue.
Thanks. In "develop" commit 664fe47e2d6974906e86a851ac1ff876b6453733 the first warning won't occur anymore. (I was only permitting one time "unit" because that was the only unit I had noticed in the BIO-based document named BIO_ODF_Parameter_Code_List.pdf on ODF format.)
The second part, about two NULL values, will require some more fiddling with the code. It stems from the oce limitation of only one "missing value" per data file. That is violated in both ODF and Netcdf file formats, so it's something I ought to fix. I may have a look at that sometime before the weekend.
PS to @Echisholm21 in case it's of interest, a great way to find a problem in oce warnings is to do use options(warn=3)
before running the command that gives the warning. That will change the warning into an error. Then, you can do traceback()
to see where the problem lies in the R code. (Possibly you have to build the package locally to get line numbers, but that would be the case if you were trying to fix a problem in a rush.)
Thanks Dan! That's a good tip for poking around in the future. The null values warning was occurring on my end too and @richardsc is right it was the more concerning one to me.
I'm glad the GMT was an easy fix, let me know what happens with the null values, I'd be interested to see what's going on there.
I feel that this is working now in "develop". Below is what I get for your test file (see near the bottom). This is not actually a deep test, because your data file has no bad 'time' values, etc. I think the only way to really test this is to mess with a file in the editor, making up some individualized NULL values for individual data types, and then altering some of the data to be of those values. I think that's worth doing, actually, and I could do it in private build tests quite easily. Heck, I could even do it by direct manipulation of the file oce/inst/extdata/CTD_BCD2014666_008_1_DN.ODF
in the source. I may just do that, tomorrow sometime, and therefore I will leave this issue open for a few days.
> library(oce)
> d<-read.oce("REDACTED.ODF",debug=3)
read.oce("REDACTED.ODF", ...) inferred type="madcp/odf"
read.odf("REDACTED.ODF", ...) {
ODFNames2oceNames() {
STAGE 1 names: EWCT_01 NSCT_01 VCSP_01 ERRV_01 BEAM_01 UNKN_01 SYTM_01
STAGE 2 names: u v w error a unknown time
STAGE 3 names: u v w error a unknown time
STAGE 4 names: u v w error a unknown time
ODFnames[1]='EWCT_01', names[1]='u', ODFunits[1]='M/s'
ODFnames[2]='NSCT_01', names[2]='v', ODFunits[2]='M/s'
ODFnames[3]='VCSP_01', names[3]='w', ODFunits[3]='M/s'
ODFnames[4]='ERRV_01', names[4]='error', ODFunits[4]='M/s'
ODFnames[5]='BEAM_01', names[5]='a', ODFunits[5]='counts'
ODFnames[6]='UNKN_01', names[6]='unknown', ODFunits[6]='counts'
ODFnames[7]='SYTM_01', names[7]='time', ODFunits[7]='GMT'
} # ODFNames2oceNames()
oce names: u v w error a unknown time
NAvalueList= c(u = -1e+06, v = -1e+06, w = -1e+06, error = -1e+06, a = -1e+06, unknown = -1e+06, time = -99)
set 3 values in 'u' to NA, because they matched the NULL_VALUE (-1e+06)
set 3 values in 'v' to NA, because they matched the NULL_VALUE (-1e+06)
set 3 values in 'w' to NA, because they matched the NULL_VALUE (-1e+06)
set 2 values in 'error' to NA, because they matched the NULL_VALUE (-1e+06)
set 0 values in 'a' to NA, because they matched the NULL_VALUE (-1e+06)
set 0 values in 'unknown' to NA, because they matched the NULL_VALUE (-1e+06)
set 0 values in 'time' to NA, because they matched the NULL_VALUE (-99)
} # read.odf()
PS. (note to self) -- I've not yet done this on odf2oce
, but rather only in read.odf()
.
Good idea on changing some NULL values in the built-in ODF file to do tests -- I was thinking the same thing.
It is weird that there is a different NULL value for just one of the fields, but since they are setup like that I think it makes sense to have the code be robust in case we encounter it in other files.
FTR, for the time unit warning, I've never seen the unit
field used for timezone before. I suppose it makes sense (because there likely isn't another TZ parameter), but strictly speaking GMT is not a "unit" ...
@Echisholm21 I think things are working now in the "develop" branch commit d682d2ca410fea52e83fef552ac140127696a88a. This is quite thoroughly tested in tests/testthat/test_odf.R
, which deals with a modified ODF file. (I won't go into a lot of detail here because the code speaks for itself.) Also, I compressed that file and the original version. Please have a look, and make a decision on closing the issue.
@richardsc I'm calling it a "unit" just because the ODF file names it as such. I don't think this will matter in any situation, but I could of course remove the unit if desired. Just let me know, by email or a new issue ... obviously it would just take a few seconds to do (and then 20 minutes for the automatic build/test sequence I like to undertake before committing to GH).
I agree with you calling it a "unit" based on what's in the file. My point was more that I don't think BIO should be specifying the unit for a "time" field as the timezone. Not related to oce
-- so build, test, and push away!
Looks great! Thanks so much!
Short summary of problem
When reading in archived ODFs from BIO database I am consistently getting a warning about Null values and sometimes accompanied by warnings about GMT.
Details (optional)
The warning I get looks like :
@richardsc This is from the odf2adp script
What you expected to happen
Not sure if this is an 'issue' exactly but I am curious about the null value warnings. I understand the GMT warning is likely just due to the function looking for 'UTC' and not understanding GMT but that is not really a concern since they are (from my understanding) the same.
How urgent is this?
This is not urgent at all.
Output from sessionInfo()
R version 3.4.4 (2018-03-15) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] abind_1.4-5 ncdf4_1.16 oce_0.9-24 gsw_1.0-5 testthat_2.0.0
loaded via a namespace (and not attached): [1] compiler_3.4.4 magrittr_1.5 R6_2.2.2 tools_3.4.4 yaml_2.1.19 Rcpp_0.12.16 rlang_0.2.0