dankelley / oce

R package for oceanographic processing
http://dankelley.github.io/oce/
GNU General Public License v3.0
144 stars 42 forks source link

read.adv.nortek() error "times in the file are not equi-spaced" #140

Closed edwardpmorris closed 13 years ago

edwardpmorris commented 13 years ago

Issue: fast times calculated by read.adv.Nortek do not account for "gaps" in data collection, when the vector instrument is deployed in "burst-mode". Times are assumed to be equi-spaced, an error is given when running the function and equi-spaced times are returned in oce.adv.Nortek$data$ts$time

edwardpmorris commented 13 years ago

A temporary solution is to calculate correct times using metadata and replace oce.adv.nortek$data$ts$time.

Below is a basic function to do this (which i have tested on 2 Nortek .VEC files).

Notes: Number of measurements per "burst" is not included within metadata. "Samples per burst" (n.per.burst) has to be specified, I extracted this from .HDR file (produced using Nortek vector software).

oce.adv.nortek$metadata$measurementStart already includes 1 second delay for start up of instrument, this 1 second needs to be added per burst i.e. every time the instrument starts up

the number of data: length(oce.adv.nortek$data$ts$time) is lower (tested on 3 files) than given in the Nortek vector .HDR file and winADV (http://www.usbr.gov/pmts/hydraulics_lab/twahl/winadv/) it appears that a small number of data points at the very end of the data are not included in the oce.adv.Nortek object. This does not appear to be a big problem; not sure why difference in data length though.

replacement time vector is shortened to fit length (oce.adv.nortek$data$ts$time) a very small number of data at end of file are excluded

1 second subtracted from oce.adv.nortek$metadata$measurementStart and then "time.delay" added per burst. Is a user defined time.delay useful, i.e., do time delays change between instruments?

Arguments: oce.adv.nortek = an oce adv Nortek object time.delay = 1 second, the time delay due to the insturment starting up after sleep-mode n.per.burst = number of measurments per "burst" (integer)

Value: vector of time series (POSIXct, UTC) which includes "gaps" in data collection, remember to assign to oce.adv.nortek$data$ts$time

FIX ME: issues with time zones? include number of measurements per "burst" (integer) in metadata? some sort of function/document to give an explanation of terms in metadata?

################################################################################

correct.adv.time <- function(oce.adv.nortek, time.delay = 1, n.per.burst) {

## select metadata require to calculate sample times
# time (seconds) between each measurment interval ("burst")
burst.interval <- oce.adv.nortek$metadata$measurementInterval

# total number of measurments (integer)
n.data <- length(oce.adv.nortek$data$ts$time)

# number of measurments per second (integer)
sampling.frequency <- oce.adv.nortek$metadata$burstLength 

# number of bursts (integer)
# number of samples per burst must be supplied (from .HDR) until
# this is included in metadata
n.bursts <- round(n.data / n.per.burst)

# time of first measurment (ISO, UTC)
# 1 second subtracted
# fractional time to 6 decimal points
options(digits.secs = 6)
start.time <- as.POSIXct(strptime(paste(as.character(oce.adv.nortek$metadata$measurementStart  -1),".000000",sep="")
                    ,format="%Y-%m-%d %H:%M:%OS"),tz = "UTC")

## test if "continuous" or "bursts" and calculate time index
# if "continuous"
if (is.na(burst.interval)){

  # set burst index to 1
  burst <- 1

  # calculate seconds from start
  time.sec <- as.numeric((1:n.data * (1 / sampling.frequency)) + time.delay)

  # calculate UTC time
  time.UTC <- (start.time) + time.sec

# if "bursts"
} else {

  # make burst index
  burst <- rep(1:n.bursts, each = n.per.burst, length.out = n.data)

  # calculate seconds from start
  time.sec <- ((burst-1) * burst.interval) + 
    ((rep(1:n.per.burst, times = n.bursts, length.out = n.data) *
    (1 / sampling.frequency))) + time.delay

  # calculate UTC time
  time.UTC <- (start.time) + time.sec

}

# return corrected time
# remember to assign to oce.adv.nortek$data$ts$time
return(time.UTC)

}

dankelley commented 13 years ago

I've started to look at this. One BIG CAUTION: read.adv.nortek() is interpreting the velocity scale for your sample data set as 1e-5 HOWEVER your .hdr file says 1e-4.

Below is some test code. You cannot reproduce it, because the changes I'm making to the .R files are only on my local machine. However, you can run the last line, and it will show you that the item user$velocityScale is wrong (before it dies).

library(oce)
source('~/src/R-kelley/oce/R/adp.nortek.R')
source('~/src/R-kelley/oce/R/adv.R')
d <- read.adv.nortek("~/OCNG CAFNUE 3/OCNG01.vec", from=1, to=80269, debug=3)

The above produces

...
 $ user    :List of 23
   ..$ size                         : int 256
   ..$ T1                           : int 2
   ..$ T2                           : int 16
   ..$ transmitPulseLength          : int 2
   ..$ blankingDistance             : num 0
   ..$ receiveLength                : int 7
   ..$ timeBetweenPings             : int 44
   ..$ timeBetweenBurstSequences    : int 512
   ..$ numberOfBeamSequencesPerBurst: int 1
   ..$ AvgInterval                  : int 32
   ..$ numberOfBeams                : int 3
   ..$ measurementInterval          : int 4800
   ..$ deploy.name                  : chr "OCNG"
   ..$ comments                     : chr "cafnue 3"
   ..$ mode                         : chr [1:2] "00000001" "00010000"
   ..$ velocityScale                : num 1e-05
 ...
dankelley commented 13 years ago

(an aside: note that I just also fixed the velocityScale problem)

dankelley commented 13 years ago

I see now that burst info is stored in the "Vector Velocity Data Header", as described at the top of p25 of System Integrator Guide. The values make sense: I get chunks with 3000 samples each.

dankelley commented 13 years ago

BUG the header tells me that your Vector orientation is "downward". But I am pretty sure you said it was sideways. In any case, I am in the middle of figuring out bursts, so I mention this here just so you can check on it, later. (In other words, I'm leaving it to you to file a bug report, but please wait until I get the bursts working, first.)

dankelley commented 13 years ago

I think I've fixed the bug. Can you check, with the most recent commit (5ccce77aae8c1b36c86f615ecabff59b4ee72557 or later) of the "development" stream.

edwardpmorris commented 13 years ago

Sorry for my naivety (I am very new to this), but in order to check do i need to download the development stream and compile / install, on Windows (Vista), using the Windows toolset via the command line (as the package contains C)? If so, I have been struggling to get this working all morning. In reality using R is my 1st adventure in any type of programming.

Simply copying and running the development stream R scripts adv.nortek.R and adv.R, and then running the function read.adv.nortek() returns an error " can´t find function decodeHeaderNortek"

dankelley commented 13 years ago

On 2011-06-06, at 7:24 AM, edwardpmorris wrote:

Sorry for my naivety (I am very new to this), but in order to check do i need to download the development stream and compile / install, on Windows (Vista), using the Windows toolset via the command line (as the package contains C)? If so, I have been struggling to get this working all morning. In reality using R is my 1st adventure in any type of programming.

Well, that's the preferred way, but see next...

Simply copying and running the development stream R scripts adv.nortek.R and adv.R, and then running the function read.adv.nortek() returns an error " cant find function decodeHeaderNortek"

... I think you just need to put one more line in your test file, the first of what I've got below

source("[DIR]/R/adp.nortek.R") source("[DIR]/R/adv.nortek.R") source("[DIR]/R/adv.R")

and then it should work. (Putting them straight in the test file is handy -- it saves having to run ["source"] them separately.)

PS. from the code you sent, I figured you must be an R expert! So did a colleague, who also noticed the bug report. So, if this is your first programming experience, then clearly you've got talent!!!

edwardpmorris commented 13 years ago

BUG the header tells me that your Vector orientation is "downward" The use of the vector in a sideways (horizontal) orientation is "unconventional" but valid, I guess the instrument has no internal mechanism to decipher the horizontal orientation. Hence, a user correction must be made to the data. Importantly, measurements must be made in xyz and all(?) indications of position (compass, tilt, roll) are invalid. Assuming the probe 1 (x-axis) is orientated upwards, then u and w should be swapped in the resulting data file. (Tip: take care to set the nominal velocities correctly before deployment). The issue is discussed here: http://www.nortekusa.com/en/knowledge-center/forum/current-profilers-and-current-meters/579860304

So the extraction of data is not a bug, however maybe a user option in read.adv."type" (orientation = c("from.header"[default],"horizontal.x.up", horizontal.x.down") ) and then make corrections to the data structure or return error if in ENU?

dankelley commented 13 years ago

So the extraction of data is not a bug, however maybe a user option in read.adv."type" (orientation = c("from.header"[default],"horizontal.x.up", horizontal.x.down") ) and then make corrections to the data structure or return error if in ENU?

Oh, yes, I see. Possibly the existing options to read.adv.nortek() may help you. Look for horizontalCase, etc., and consult the case-by-case [no pun!] table that's in the help page for read.adv(). I can't promise much, but this is worth a try.

You're right -- there does not seem to be anything in the binary file that can allow for a sideways-pointed Vector. I reason this because the orientation is stored in a single bit, and so there are only two choices. (Of course, I could be wrong. I may have missed something in the System Integrator Guide. And also Nortek has lots of internal codes that are not described in that document.)

PS. please be a bit cautious in this, since my code was written for a different geometry. You may very well choose to just manipulate the xyz data (e.g. making y be z, etc.) before converting to ENU.

edwardpmorris commented 13 years ago

Finally managed to get the source compiled and install the developer stream. Without fully understanding, there were some minor complications with untar2 relating to the tar.ball produced by Git (a search provided various solutions that overcome this).

Great! from a quick look, the issue appears mainly solved. The function read.adv.nortek() appears to assign the correct "fast times" (data$time) so that plot.adv() appears correct (in the example data file [OCNG01.vec], P [dbar] has 3 high-water maximums).

Although, one thing, I quickly noticed is the fractional seconds do not appear correctly and the time-delay [1 s] is not added: str(dat$data$time) POSIXct[1:80264], format: "2010-09-07 23:00:00" "2010-09-07 23:00:00" "2010-09-07 23:00:00" ...

Will do some more checking, but it looks good.

Thanks very much for your support and patience. Will look into "horizontalCase" and add an issue if need be.

PS. Thank-you for your kind words, I have been learning R for a couple of years as a tool for my research (simple scripts), but have never had any experience with the inner workings of packages. Now I am inspired!

dankelley commented 13 years ago

On 2011-06-06, at 9:54 AM, edwardpmorris wrote:

Although, one thing, I quickly noticed is the fractional seconds do not appear correctly and the time-delay [1 s] is not added:

I think the fractional seconds are likely just because of the R display settings.

As for the time delay, I'll need to figure out what Nortek calls that, in the "System Integrator Guide". (If you don't have that document, I could email it to you -- it's pretty handy, for learning about the formats.) I just talked with a local acoustics expert, and he wasn't aware that the adv had a time delay ... but we never use burst mode here.

edwardpmorris commented 13 years ago

I confirm fractional seconds seems to be related to display settings, data has correct number of digits and using options(digits.secs = 4) allows you to see it.

I downloaded a copy of the "System Integrator Guide" from here (requires username/password): http://www.nortek.no/lib/manuals/system-integrator-manual_jan2011

I note that you refer to 1. Nortek AS. System Integrator Guide (paradopp family of products). June 2008. (Doc No: PSI00-0101-0608) in read.adv() help. No idea if these documents are significantly different?

The vector timing is explained here: http://www.nortek-as.com/en/knowledge-center/forum/velocimeters/30181049#63704243 http://www.nortek-as.com/en/knowledge-center/forum/hr-profilers/22412372#837297676

My understanding is to calculate the 1st value of fast time (per burst!) we require the T_timestamp [HH:MM:SS] from the "Vector Velocity Data Header" (VV-DH, pg38 SIG) and the "sampling rate" from ?? (no reference found in SIG, but this is akin to sampling frequency, samples per second, which is already extracted in metadata)

Formula derived from Nortek forum (terms adjusted to fit above): T[1st value] = T_timestamp[VV-DH] + 1[second second of the burst. During the second second, no velocity data is collected.] + (1 / (2 * (sampling rate[i.e., sampling frequency, Hz]) ))

I have placed a request for confirmation of this formula on the nortek forum: http://www.nortek-as.com/en/knowledge-center/forum/velocimeters/158319581

We use both continuous and burst mode (depending on the deployment and instrument). Continuous is simply a very long burst. Burst mode is convenient for saving power and memory (older instruments). Also the baseline used to calculate the signal to noise ratio (SNR) is calculated at the start of every burst, as we sometimes have measurements that start in unideal conditions (on the boat or with disturbance) recalculating baseline noise levels maybe useful. Still trying to see if this effectively makes any difference though! Finally, the burst-mode makes a convenient framework by which to split up the data for the calculation of time-ensemble turbulence statistics, which is one of our final aims. In fact, the new structure of the oce.adv object with "$timeBurst" will make calculation of statistics easy.

dankelley commented 13 years ago

On 2011-06-07, at 7:40 AM, edwardpmorris wrote:

My understanding is to calculate the 1st value of fast time (per burst!) we require the T_timestamp [HH:MM:SS] from the "Vector Velocity Data Header" (VV-DH, pg38 SIG) and the "sampling rate" from ?? (no reference found in SIG, but this is akin to sampling frequency, samples per second, which is already extracted in metadata)

Thanks for this very detailed analysis!

One has to be careful about the 'samples per second' in the metadata. The formula used for that is partly a guess (see help(read.adv) in R, and look for the section about nortek).

I've just gone through the 2011 version of the SIG, and don't see a reference to inferring the sampling rate from binary (.vec) files. So, although the formula that I'm using works OK for my data and (I think) your data, we can't really rely on it.

I will post a question on the Nortek board.

dankelley commented 13 years ago

I posted my question about inferring sampling rate to http://www.nortek-as.com/en/knowledge-center/forum/velocimeters/158319583

dankelley commented 13 years ago

A posting on the above-mentioned item on the nortek forum says that I am inferring metadata$samplingRate correctly.

edwardpmorris commented 13 years ago

The reply to my post (above) says the time delay (delay) should be 2 seconds and, as you confirm metadata$samplingRate is fine:

T = T_timestamp[VV-DH] + delay + (1 / (2 * (sampling rate) )

the reply does not refer specifically to the time-stamp derived from "Vector Velocity Data Header" (pg38 System integrator guide), but I assume the ASCI .vhd file provided after conversion is this. I added the .vhd file generated via conversion with the vector software to the OCNG01 example on dropbox, to aid troubleshooting

library(oce)
d <- read.adv.nortek("~/OCNG CAFNUE 3/OCNG01.vec", from=1, to=80269)#, debug=3)
str(d$data$timeBurst)

returns the same times as found in OCNG01.vhd

Upon reading your post on the Nortek site, I am a little confused again and I shall return back to the SIG. At a rough guess I thought there was a VVHD for every burst (with continuous-mode basically a very long burst)?

dankelley commented 13 years ago

Closing, on advice of the initial reporter.