Closed richardsc closed 1 year ago
NB: this is from a "glider" AD2CP instrument, which the code seems to recognize, e.g. if you read the Data_avgd.ad2cp file:
d <- read.adp.ad2cp('Data_avgd.ad2cp', dataType='average')
str(d)
Formal class 'adp' [package "oce"] with 3 slots
..@ metadata :List of 24
.. ..$ units :List of 2
.. .. ..$ v :List of 2
.. .. .. ..$ unit : expression(m/s)
.. .. .. ..$ scale: chr ""
.. .. ..$ distance:List of 2
.. .. .. ..$ unit : expression(m)
.. .. .. ..$ scale: chr ""
.. ..$ flags : list()
.. ..$ oceCoordinate : chr "beam"
.. ..$ orientation : chr [1:701] "zup" "zdown" "zdown" "zdown" ...
.. ..$ blankingDistance : num 0.2
.. ..$ cellSize : num 2
.. ..$ configuration : logi [1:16] TRUE TRUE TRUE TRUE FALSE TRUE ...
.. ..$ datasetDescription : int [1:699] 17185 17185 17185 17185 17185 17185 17185 17185 17185 17185 ...
.. ..$ distance : num [1:15] 2.2 4.2 6.2 8.2 10.2 12.2 14.2 16.2 18.2 20.2 ...
.. ..$ numberOfBeams : int 4
.. ..$ numberOfCells : num 15
.. ..$ originalCoordinate : chr "beam"
.. ..$ filename : chr "/Users/richardsc/Downloads/NortekData/102878/Data_avgd.ad2cp"
.. ..$ powerLevel : int [1:701] 61 -100 -100 -100 -100 -100 -100 -100 -100 -100 ...
.. ..$ status : raw [1:32, 1:701] 01 01 01 00 ...
.. ..$ activeConfiguration: int [1:701] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ manufacturer : chr "nortek"
.. ..$ fileType : chr "AD2CP"
.. ..$ serialNumber : int 102878
.. ..$ header : chr [1:41] "GETCLOCKSTR,TIME=\"2022-01-18 05:09:04\"" "ID,STR=\"Glider\",SN=102878" "GETHW,FW=3055,FPGA=185,DIGITAL=\"I-3\",INTERFACE=\"H-4\",ANALOG=\"G-1\",SENSOR=\"I-0\",BOOT=23,FWMINOR=10" "BOARDSENSGET,AV=23,NB=4,HF=1000,TTR=2.0000,TTRB5=0.0000,TTRB5AUX=0.0000,AUXRS=0" ...
.. ..$ type : chr "Glider"
.. ..$ declination : num 0
.. ..$ frequency : num 1000
.. ..$ dataType : num 22
..@ data :List of 24
.. ..$ nominalCorrelation : int [1:699] 78 78 78 78 78 78 78 78 78 78 ...
.. ..$ ensemble : int [1:699] 2 2 2 2 2 2 2 2 2 2 ...
.. ..$ time : POSIXct[1:699], format: "2022-01-18 05:09:13" "2022-01-18 05:09:28" ...
.. ..$ soundSpeed : num [1:699] 1526 1526 1526 1526 1526 ...
.. ..$ temperature : num [1:699] 21.8 21.8 21.7 21.8 21.8 ...
.. ..$ pressure : num [1:699] 0.806 0.795 0.843 0.803 0.816 0.817 0.856 0.83 0.809 0.803 ...
.. ..$ heading : num [1:699] 267 267 281 280 280 ...
.. ..$ pitch : num [1:699] 0.55 0.55 0.46 0.46 0.45 0.47 0.46 0.46 0.45 0.45 ...
.. ..$ roll : num [1:699] 4.62 4.61 -7.98 -8.01 -8.01 -8.02 -8 -8 -8.01 -8 ...
.. ..$ magnetometer :List of 3
.. .. ..$ x: int [1:699] -19 -19 39 37 36 37 37 37 37 37 ...
.. .. ..$ y: int [1:699] 313 312 155 154 153 154 154 154 154 154 ...
.. .. ..$ z: int [1:699] 444 444 466 468 468 468 468 468 467 468 ...
.. ..$ accelerometer :List of 3
.. .. ..$ x: num [1:699] 0.00977 0.00977 0.00824 0.00812 0.008 ...
.. .. ..$ y: num [1:699] -0.0807 -0.0806 0.139 0.1395 0.1395 ...
.. .. ..$ z: num [1:699] -0.996 -0.996 -0.991 -0.99 -0.99 ...
.. ..$ temperatureMagnetometer: num [1:699] 0.625 0.5 0.5 0.5 0.5 0.5 0.5 0.625 0.5 0.5 ...
.. ..$ temperatureRTC : num [1:699] 22.5 22.5 22 22.2 22.2 ...
.. ..$ transmitEnergy : int [1:699] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ powerLevel : int [1:699] -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 ...
.. ..$ v : num [1:699, 1:15, 1:4] -32.8 -32.8 -32.8 -32.8 -32.8 ...
.. ..$ a : raw [1:699, 1:15, 1:4] 3e 3f 3e 3e ...
.. ..$ q : raw [1:699, 1:15, 1:4] 04 04 06 06 ...
.. ..$ percentgood : int [1:2796] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ stdDevPitch : num [1:699] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ stdDevRoll : num [1:699] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ stdDevHeading : num [1:699] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ stdDevPressure : num [1:699] 0 0 0 0 0 0 0 0 0 0 ...
.. ..$ stdDev : num [1:699] 0 0 0 0 0 0 0 0 0 0 ...
..@ processingLog:List of 2
.. ..$ time : POSIXct[1:1], format: "2023-03-02 11:57:35"
.. ..$ value: chr "read.adp.ad2cp(file=\"/Users/richardsc/Downloads/NortekData/102878/Data_avgd.ad2cp\", from=1, to=701, by=1)"
It might help to run the read with debug=3. I'm doing something else right now, but I do have a memory of wondering about the rules on 'text' elements. I think in the file I was looking at, it was always at the start. Maybe this has a text element that is not at the start. Just guessing.
Yes, I tried that. The debug info is nice, but because it prints a summary of every ensemble in the file, the stuff at the top gets lost and it's only possible to see what's at the bottom, e.g.:
} # do_ldc_ad2cp_in_file()
length(d$index): 1567
using to=1567 based on file contents
focussing on 1567 data records
In read.adp.ad2cp() : setting plan=0, the most common value in this file; 0 occurs 1559 time[s]; 1 occurs 8 time[s]
this file has a header at id=1398484485514531539540541580613742855106011671168150915101567
this plan has 1559 data records, out of a total of 1567 in the file subset
N=1559
focussing on 1559 data records (after subsetting for plan=0)
below is table() of the 'plan' values in this subset of the file:
activeConfiguration
0
1559
commonData$configuration DIFFERS from the first value, in 1 instances for chunk key 0xa0 (0xa0=text)
Error in (function() { :
Variable "0xa0=text" configuration detected, so expect erroneous results. Please submit a bug report.
It might be good to change the debug level for printing chunk information so that at, say, debug=1 it doesn't output quite so much.
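One way to picture the suggestion is a level-gated message helper, where per-record summaries only appear at a high debug level. This is a minimal sketch in base R; `debugAt()` and `readRecords()` are hypothetical names made up for illustration and are not part of oce, and the choice of levels 1 and 3 is an assumption.

```r
# Hypothetical helper, not part of oce: print a message only when the
# requested debug level is at least `level`.
debugAt <- function(debug, level, ...) {
    if (debug >= level) {
        cat(..., "\n", sep = "")
    }
    invisible(debug >= level)
}

# Stand-in for a reader that loops over many records.
readRecords <- function(n, debug = 0) {
    debugAt(debug, 1, "reading ", n, " records")
    for (i in seq_len(n)) {
        # Per-record detail is reserved for the highest level.
        debugAt(debug, 3, "  record ", i, " summary ...")
    }
    debugAt(debug, 1, "finished reading")
    invisible(n)
}

readRecords(3, debug = 1) # top-level lines only
readRecords(3, debug = 3) # top-level lines plus one line per record
```

With this arrangement, debug=1 gives a short overview and the per-record detail is only printed on request.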
Good point on the value of debug. NOTE: if you alter any code, please do it in a separate branch. We don't want to risk getting things mixed up prior to a CRAN release. I'm doing argoFloats stuff right now and so can't really look into this ad2cp thing. The code is messy, I think.
No problem. I've created a (local only for now) branch called issue2048 (nice power of 2!). If I make any significant changes I'll push to get you to take a look.
Like Dan, I'll use this issue to take notes as I learn things.
The part of the code that triggers the error is this:
# The vectorization scheme used in this function assumes that configurations
# match within a given ID type. This seems like a reasonable assumption,
# and one backed up by the impression of a Nortek representative, but I do
# not see definititive statement of the requirement in any documentation
# I've studied. Since we *need* this to be true in order to read the data in
# vectorized way, we *insist* on it here, rather than trying to catch
# problems later. Use local() to avoid polluting namespace.
local(
    {
        for (id in as.raw(unique(d$id))) {
            config <- commonData$configuration[d$id==id, ]
            if (is.matrix(config)) {
                ok <- TRUE
                for (col in seq(2L, ncol(config)))
                    ok <- ok && all(config[, col] == config[1, col])
                if (!ok) {
                    oceDebug(debug, "commonData$configuration DIFFERS from the first value, in ",
                        sum(!ok), " instances for chunk key 0x", as.raw(id), " (", ad2cpCodeToName(id), ")\n")
                    stop("Variable \"", ad2cpCodeToName(id), "\" configuration detected, so expect ",
                        "erroneous results. Please submit a bug report.")
                }
            }
        }
    }
)
which is a bummer because it suggests that perhaps our assumption of all configurations within a dataType being the same isn't a good one.
But, the fact that the code is finding a "text" configuration within an "average" dataID seems weird to me ... I need to deep dive into the code (and the manual) more to figure this out.
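A side note on the quoted check: `ok` is a single TRUE/FALSE value there, so `sum(!ok)` is 1 whenever the message is printed, which would explain the "in 1 instances" in the debug output regardless of how many rows actually differ. A per-column count is more informative. The following is a minimal sketch using a made-up matrix as a stand-in for `commonData$configuration[d$id==id, ]`; the helper name `countInconsistencies()` is hypothetical, and counting "rows that differ from the first row" is my assumption about what a useful count would be.

```r
# Toy stand-in for commonData$configuration[d$id == id, ]: one row per
# record, one column per configuration bit. The values are made up.
config <- rbind(
    c(TRUE, TRUE,  FALSE, TRUE),
    c(TRUE, TRUE,  FALSE, TRUE),
    c(TRUE, FALSE, FALSE, TRUE),
    c(TRUE, FALSE, FALSE, FALSE))

# Hypothetical helper: for each column, count the rows whose value differs
# from the value in the first row.
countInconsistencies <- function(config) {
    first <- matrix(config[1, ], nrow = nrow(config), ncol = ncol(config),
        byrow = TRUE)
    colSums(config != first)
}

bad <- countInconsistencies(config)
for (col in which(bad > 0)) {
    cat("column ", col, " has inconsistencies in ", bad[col], " of the ",
        nrow(config), " rows\n", sep = "")
}
```

Reporting the column numbers makes it easier to look up which configuration bits vary between records of the same ID.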
Hm, a lot of the columns of the commonData$configuration matrix have data that differ from the values in the first row. I've instrumented the code (not pushed) and am including a snapshot. Notice that columns 1, 5, 10 and 11 are all flagged as having data that vary between some rows.
I do not know what to make of this. Then again, I wrote this code 10 months ago (maybe more; I'm just looking at git blame) and (a) the code is long and complex, plus (b) we were really guessing about quite a lot of things.
@richardsc My guess is that it may be quite hard to debug this without documentation from Nortek. I remember a lot of back-and-forth emails last year, sometimes receiving conflicting advice from Nortek folks. My impression was that the documentation was a work-in-progress. I wonder if they have something available now?
I still stand by our logic in switching to a read-one-data-type mode. The code before that was much more complicated and I think the user interface was much worse than the one we have now, because with what we had before, all related generic functions were broken. I'm glad @richardsc suggested the read-one-data-type pathway.
For fun, I converted the stop() to a warning() (not pushed to GH) to see if it would complete, or go haywire. It did complete. The results are as below. I've attached a gzipped form of the rda, in case @richardsc wants to try looking at the output. (I don't really know what to expect.)
> d
adp object, from file '/Users/kelley/Downloads/NortekData/102878/Data.ad2cp', has data as follows.
nominalCorrelation [1:1404]: 78, 78, ..., 78, 78
ensemble [1:1404]: 1, 2, ..., 1, 2
time [1:1404]: 2022-01-18 05:09:13.063, 2022-01-18 05:09:14.063, ..., 2023-03-01 20:03:58.063, 2023-03-01 20:03:59.063
soundSpeed [1:1404]: 1526.4, 1526.4, ..., 1510.1, 1510.1
temperature [1:1404]: 21.84, 21.84, ..., 16.11, 16.11
pressure [1:1404]: 0.807, 0.804, ..., 0.732, 0.769
heading [1:1404]: 267.22, 266.78, ..., 244.15, 244.16
pitch [1:1404]: 0.56, 0.54, ..., 0.41, 0.41
roll [1:1404]: 4.63, 4.62, ..., -10.69, -10.70
magnetometer, a list with contents:
x [1:1404]: -18, -20, ..., -71, -71
y [1:1404]: 312, 313, ..., 75, 75
z [1:1404]: 444, 444, ..., 365, 365
accelerometer, a list with contents:
x [1:1404]: 0.0098877, 0.0096436, ..., 0.0072021, 0.0072021
y [1:1404]: -0.080688, -0.080688, ..., 0.18567, 0.18567
z [1:1404]: -0.99603, -0.99652, ..., -0.98273, -0.98248
temperatureMagnetometer [1:1404]: 0.625, 0.625, ..., -2.125, -2.125
temperatureRTC [1:1404]: 22.5, 22.5, ..., 17.5, 17.5
transmitEnergy [1:1404]: 0, 0, ..., 14, 0
powerLevel [1:1404]: -100, -100, ..., -100, -100
v, a 1404x15x4 array with value 0.011 at [1,1,1] position
a, a 1404x15x4 array with value 3e at [1,1,1] position
q, a 1404x15x4 array with value 05 at [1,1,1] position
I started a new branch for experiments, named ad2cp_2023_03. At commit 5e1013304f72a08628a8b3d5f5a70fc6666f37a4 if I run
library(oce)
f <- "~/Downloads/NortekData/102878/Data.ad2cp"
d <- read.adp.ad2cp(f, dataType="average", debug=1)
I get as follows (I'm selecting just part of the debugging info). It is only the text ID that has inconsistencies.
Perhaps @richardsc and I can f2f brainstorm on this issue sometime this or next week.
Debugging output
Checking commonData$configuration consistency within columns {
checking id 0xa0=text
column 2 has inconsistencies in 3 of the 11 rows
column 5 has inconsistencies in 6 of the 11 rows
column 10 has inconsistencies in 6 of the 11 rows
column 11 has inconsistencies in 9 of the 11 rows
checking id 0x16=average
no inconsistencies in any column
checking id 0x17=bottomTrack
no inconsistencies in any column
summary: commonData$configuration inconsistencies for 1 ID type(s)
} finished checking commonData$configuration consistency
Warnings
Warning messages:
1: In (function() { :
id 0xa0=text column 2 has inconsistencies in 3 of the 11 rows
2: In (function() { :
id 0xa0=text column 5 has inconsistencies in 6 of the 11 rows
3: In (function() { :
id 0xa0=text column 10 has inconsistencies in 6 of the 11 rows
4: In (function() { :
id 0xa0=text column 11 has inconsistencies in 9 of the 11 rows
5: In (function() { :
id 0xa0=text has non-uniform commonData$configuration within 4 columns
6: In (function() { :
Found commonData$configuration inconsistencies for 1 ID type(s)
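When a reader reports problems through warning() rather than stop(), the messages can be gathered programmatically instead of being read off the console. The sketch below uses the standard base-R `withCallingHandlers()` idiom; `reader()` is a made-up stand-in (not the oce function), with messages modelled on the ones above.

```r
# Stand-in for a reader that issues several warnings but still returns a
# result. This is not read.adp.ad2cp(); it only mimics its messages.
reader <- function() {
    warning("id 0xa0=text column 2 has inconsistencies in 3 of the 11 rows")
    warning("id 0xa0=text column 5 has inconsistencies in 6 of the 11 rows")
    "result"
}

collected <- character(0)
result <- withCallingHandlers(
    reader(),
    warning = function(w) {
        # Store the message, then resume execution without printing it.
        collected <<- c(collected, conditionMessage(w))
        invokeRestart("muffleWarning")
    })

print(collected)
```

The call still returns its result, and `collected` holds the warning text for later inspection or for comparison between files.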
Thanks for looking into this! I got a little into the code, and considered doing the same as you did to change the stop() to a warning() just to see what happens, but other fires came up.
In the meantime I've been trying to get a copy of the Nortek software so that we have something to compare against what oce is doing. I also have some more test files with other odd behaviour, for which I've started looking at the manual again.
But yes, a f2f would be a good way to go through some of this!
For checking, I am putting below (click Details) the output from
library(oce)
f <- "~/Downloads/NortekData/102878/Data.ad2cp"
d <- read.adp.ad2cp(f, dataType="average", debug=1)
summary(d)
str(d, 3)
It might be good to go through a checklist in a meeting (note: I will edit this in-place as more ideas come up)
velocityFactor is 1e78, but that might be from a text record because later on we get sensible-seeming velocities.
I think this is fixed. I'm going to go ahead and close it (with notes as below, used in closing several issues in the past few minutes), with the understanding that C might want to reopen it next week.
Done in "develop" commits af580735d9c92978ef512ac4b29439dcb53b88f9 (just a PR) and f5eab49f88708a2308090246296489919636773c. The latter is a big change, reflecting offline development over 6 days, and with git diff --stat ending with
102 files changed, 3263 insertions(+), 4715 deletions(-)
This dataset produces an odd error when trying to read with read.adp.ad2cp():
I'll poke at this when I get a chance once I'm at work, but thought I'd put it here first in case @dankelley has any ideas.
NortekData.zip