dankelley / oce

R package for oceanographic processing
http://dankelley.github.io/oce/
GNU General Public License v3.0
142 stars 42 forks source link

bug in read.adp.rdi for oce version > 0.9-20 #1228

Closed richardsc closed 7 years ago

richardsc commented 7 years ago

I've been looking at some RDI files recently, and I noticed a difference between what I was getting in R and what I was seeing from other software (including RDI software and matlab).

Specifically, the newest version of oce was finding one less ensemble than I knew to be in the file, and worse there were approximately 50 or so ensembles at the end of the file that were just filled with NAs.

I used devtools and install_version() to go back to the previous CRAN version, 0.9-20, and it works as expected. From this page: https://cran.rstudio.com/src/contrib/Archive/oce/ it looks like whatever caused this came after Nov. 19 2016.

I'm guessing that it has to do with the ldc_rdi*.c code that we'd been changing, perhaps for some of the sentinel-related work?

I'll provide a file over email.

dankelley commented 7 years ago

This is ringing a bell. Wasn't there some code (written by you?) that skipped the first ensemble, or used it only to get header info? My memory is hazy.

richardsc commented 7 years ago

Hm ... I'm also hazy on that ... you might be right.

However, the bigger issue is that the newest version is missing data from a whole bunch of ensembles at the end of the file.

I just email the file I'm using. I'll be looking at this some, but my head isn't as "into" the C code so you might have a better sense of what to do there.

dankelley commented 7 years ago

Can you provide something that will tell me how you know things are missing? That would help me to diagnose.

Also, do you consider the byte locations of the file to be private, or would it be ok for me to post them in this issue?

dankelley commented 7 years ago

One thing: the code is complaining that it got to the end of the file:

library(oce)
d<-read.oce("~/Dropbox/oce-1228/BSTR_RDI_2014.000")

spits out (at end of warnings)

10: In read.adp.rdi(file, processingLog = processingLog, ...) :
  got to end of file
dankelley commented 7 years ago

I instrumented the code (not pushed) to export ldc at R/adp.rdi.R line 675 to a global variable called dan. @richardsc, can you comment on these? (For example, does this tell us that we have the right final times but are missing 10 ensembles ... or something?) I'll also spit out the top ones below the divider.

> str(dan)
List of 4
 $ ensembleStart: int [1:8323] 1 746 1491 2236 2981 3726 4471 5216 5961 6706 ...
 $ time         : int [1:8323] 1407556803 1407564003 1407571203 1407578403 1407585603 1407592803 1407600003 1407607203 1407614403 1407621603 ...
 $ sec100       : raw [1:8323] 00 01 01 02 ...
 $ outbuf       : raw [1:6200635] 7f 7f e3 02 ...
> tail(dan$ensembleStart)
[1] 6196166 6196911 6197656 6198401 6199146 6199891
> tail(numberAsPOSIXct(dan$time))
[1] "2016-07-02 06:00:37 UTC" "2016-07-02 08:00:37 UTC" "2016-07-02 10:00:37 UTC" "2016-07-02 12:00:37 UTC" "2016-07-02 14:00:37 UTC"
[6] "2016-07-02 16:00:37 UTC"

> head(dan$ensembleStart)
[1]    1  746 1491 2236 2981 3726
> head(numberAsPOSIXct(dan$time))
[1] "2014-08-09 04:00:03 UTC" "2014-08-09 06:00:03 UTC" "2014-08-09 08:00:03 UTC" "2014-08-09 10:00:03 UTC" "2014-08-09 12:00:03 UTC"
[6] "2014-08-09 14:00:03 UTC"
richardsc commented 7 years ago

Yes, we are getting the correct final times, but in the matrix that comes from reading the data chunks we are only getting NAs

richardsc commented 7 years ago

Also, I saw that "end of file" warning ... That is likely indicative of the problem, since it doesn't occur in the older versions of oce

dankelley commented 7 years ago

Prescript -- I will keep editing this comment in-place, as I take notes, instead of making new comments. That's to avoid driving followers crazy with emails, and also to keep this page somewhat clean.

I've started work by writing test code in the issues repo https://github.com/dankelley/oce-issues/tree/master/12xx/1228. @richardsc see directions in the README.md at that site for a way to insert data files (so you can test) without making those files (or even their names) public.

Also, that README file will get updated as I make notes on what I find. So far, all I've done is verify that the times match up (after removing the first matlab ensemble), that the pressures are almost matching (after using a constant offset) and that the velocities are identical except when they, um, aren't. For one thing, matlab is using 2^15 to indicate "NA", from what I can tell. But below are what I am seeing, after doing some detective work near the end to see where things went off the rails.

                       toce uoceTOP umatTOP
8276 2016-06-28 18:00:36.91   0.172   0.172
8277 2016-06-28 20:00:36.92   0.154   0.154
8278 2016-06-28 22:00:36.93   0.056   0.056
8279 2016-06-29 00:00:36.93   0.012   0.012
8280 2016-06-29 02:00:36.93   0.045   0.045
8281 2016-06-29 04:00:36.93   0.052   0.052
8282 2016-06-29 06:00:36.93      NA   0.040
8283 2016-06-29 08:00:36.93      NA   0.023
8284 2016-06-29 10:00:36.93      NA   0.095
8285 2016-06-29 12:00:36.94      NA   0.111
8286 2016-06-29 14:00:36.95      NA   0.236
dankelley commented 7 years ago

Q for @richardsc what do you think is the coord system here? (Answer by email if you prefer.) I ask because the names of the elements in the matlab file suggest ENU but my oce is getting SFM. I'm not terribly worried about this -- it's not part of this bug anyway -- but I will be translating all bit-detection work from byteToBinary to rawToBits at some point, and I'll want it to work for all the files I have in my possession, including this one (temporary possession, for this one).

dankelley commented 7 years ago

Q for @richardsc is this a SentinelV file? I'm not seeing that in my bit-checking. (A lot of the code that was changed recently was for SentinelV, I think.)

dankelley commented 7 years ago

I'm making progress. The buffer returned by the C function ldc_rdi_in_file is larger than the file buffer. It was stopping early because it was comparing a pointer to the file size, not the buffer size. I tried a simple 2-line change and all works now. However, I don't like this "solution" because the buffer should be identical to the file, and having it different suggests some muddy coding (e.g. allocating 3 or 4 extra unused bytes per profile would explain this). I won't push to GH until I figure out how to make the buffer line up, or get frustrated and just use a buffer with some extra (unused) bytes in it.

richardsc commented 7 years ago

Coordinate system is SFM. I think the RDI software just uses ENU for the export no matter what the file is.

And no, this is not a sentinel file.

dankelley commented 7 years ago

@richardsc do you know how the instrument gets turned off? Is it a matter of doing something that does a software power-down (a bit like "shutdown" in a unix system) after completing the final ensemble? Or maybe the operator presses a button and the machine stops recording without finalizing the ensemble?

I'm sort of guessing that it might be the second alternative, because I'm finding a few hundred extra bytes on the end. I could be wrong on that, though.

I don't actually worry a lot about 488 bytes in 6 million bytes, but I don't like to have mysteries.

Most likely I'll push something to GH late tomorrow aft.

richardsc commented 7 years ago

I'm sort of guessing that it might be the second alternative, because I'm finding a few hundred extra bytes on the end. I could be wrong on that, though.

I'm guessing on this instrument (because I didn't deploy and recover), but I think the likely scenario for this instrument is that it simply ran out of batteries. So, I would not at all be surprised if it had a partial ensemble at the end (though I don't know the firmware details of how it handles a power cycle while doing data collection).

dankelley commented 7 years ago

I've made some progress on this, and I've pushed to develop, but I am not done, and I think this issue should be left open. I'm also starting a new issue on the need to recode some of the guts of read.adp.rdi() and it's foundational C code.

dankelley commented 7 years ago

I think this is fixed now, in the branch named issue_1229. Perhaps you can check on your data? Tomorrow (morning, likely) I'll add more elements to the test suite. I may look at the RDI website to see if they provide a sample file, and if they do, I'll put it in the test suite for oce. (The file can go in the github, so it can get tested in the developer test suite, but I won't put it into oce, so it won't be tested on the CRAN tests)

dankelley commented 7 years ago

I think this is ok now in develop commit 9e6cff7c1c41d0cad5aa5b9e175a6f5690432178

Maybe you could check?

Note: I have put several tests in the oce issues repo 12xx/1228 and I'll put such tests in the overall test suite, once I can find an open-access rdi file.

richardsc commented 7 years ago

Looks good. Thanks!