dankelley / oce

R package for oceanographic processing
http://dankelley.github.io/oce/
GNU General Public License v3.0
143 stars 42 forks source link

read.ctd.sbe not recognizing temperature channel #941

Closed richardsc closed 8 years ago

richardsc commented 8 years ago

I have a file, emailed to you, that gives the error:

Error in read.ctd.sbe("S262-023-CTD.cnv", debug = 10) : 
  cannot find 'temperature' in this file

Examination of the cnv file shows that there is a temperature, which uses the Key tv290C:

# name 0 = depSM: Depth [salt water, m]
# name 1 = c0mS/cm: Conductivity [mS/cm]
# name 2 = tv290C: Temperature [ITS-90, deg C]
# name 3 = prdM: Pressure, Strain Gauge [db]
# name 4 = scan: Scan Count
# name 5 = flSP: Fluorescence, Seapoint
# name 6 = sbeox0Mm/Kg: Oxygen, SBE 43 [umol/Kg]
# name 7 = par: PAR/Irradiance, Biospherical/Licor

Looking at the table from the back of the SBE data processing manual (e.g. linked in #929), I see that is a valid key for a primary temperature sensor:

t090Cm, t4990C, tnc90C, or tv290C | Temperature [ITS-90, deg C] | t 90 C | ITS-90, deg C | 1st sensor

so I think it makes sense that oce should recognize it. Currently it is missing it because the grep regex is:

    } else if (1 == length(grep("^t[0-9]90C$", name, ignore.case=TRUE))) {
        number <- gsub("90C$", "", gsub("^t", "", name, ignore.case=TRUE), ignore.case=TRUE)
        name <- paste("temperature", number, sep="")
        unit <- list(unit=expression(degree*C), scale="ITS-90")

which doesn't account for the presence of the v after the t and before the numbers.

Another issue that I see from the above is that oce is inferring the sensor "number" from the key name, but I'm not sure that's the right thing to do. In this case the sensor is clearly a "primary" sensor, in that there are no others in the file and the SBE manual identifies it as such.

As I quick fix, I'm going to add another grep() line for the tv[0-9]90C form, without the number part.

dankelley commented 8 years ago

Thanks. Your fix is fine. I am seeing lots of SBE things lately (and have been coding a lot into the kelley branch).

I feel we should bump our "2 week" plan a bit. Lately we are getting lots of SBE activity, and that happens to be a thing that is being improved in other ways, too. I don't want to release to CRAN something that will end up being wrong in a short time interval. And it seems as though almost all bug reporters can now build from GH, which is good.

My feeling is maybe to think of an end-May release ... a month of breathing room would e good.

NOTE: you are correct on the sensor number business. I am sorry not to have been writing about this but this is something I've been changing in the past 2 days. Now, I want the first temperature column to be called temperature, and the others temperature_2 etc. This is better than following the numbers in the SBE files (I argue) because it's clearer, and the summaries will soon be showing the original SBE names on the right-hand side of the table that shows min/mean/max values. That way, users can work with SBE names if they want, but oce still gets some standardly-named columns for e.g. swSigmaTheta(read.ctd("a.cnv")). Seem OK, Clark?

richardsc commented 8 years ago

I'll update with a pull request shortly. You can review that and merge/fix up as you see fit.

Now, I want the first temperature column to be called temperature, and the others temperature_2 etc

I agree with this, though I'd argue that extra fields should not use an underscore, but should be e.g. temperature2 for consistency with previous versions of oce. Also, it is extremely unlikely to have more than 9 of a given sensor.

dankelley commented 8 years ago

Let's figure out the suffix. I originally thought just e.g. temperature2 but we sometimes have temperature68 (for the old scale) and I like that name ... or do I ...??? Maybe it should just be temperature and we have the 1968 part in the unit... If I change that, then I agree on using 2, 3, etc.

So, to be clear, the proposal would be that the first column that is a temperature (no matter what scale) gets called temperature, and that any others are temperature2, etc. Agreed? (This relates to some quite-tricky code I'm working on now in the kelley branch ... it has to do with linking SBE names to columns, which I view as a critical feature, actually.)

dankelley commented 8 years ago

Te file seems ok in kelley now. I was not given the file directly but I do wonder if this should be put into our dropbox site for a test suite. That way, we'll know that future changes won't break the reading of the file. Up to Clark though, as is the closing of this issue.

richardsc commented 8 years ago

I'll check on whether we can use the file.

What are your plans for merging kelley into develop?

dankelley commented 8 years ago

I also put a bunch of Arctic files there. I think we should put some sleiwex ones also. I guess if we worry about data we could just go in an editor and chop all but 20 lines or so. That way the only sensitive info would be the ship name, etc, and I can't see that this is going to be private (unlike the actual data). Actually, I like this idea of chopping data quite a lot, since some of the files I put in there are tens of MB... If you agree, I'll shorten the Arctic ones I inserted yesterday. (PS. I replied to your comment for another issue in-place, hoping the email you get from this one will cover both.)

dankelley commented 8 years ago

Sorry, and for the merge plans, I thought I'd go through the SBE table today and add some things that seem common, e.g. I just added some more pressure variants and earlier added some other stuff (don't recall what ... the effort to add something is like 30s so it's something to do during ad breaks on Coronation St.)

richardsc commented 8 years ago

Oh, I see. You weren't proposing package tests, but just Dropbox ones. We can definitely use the file for those kinds of tests, I was just concerned about making it public.

Chopping data is fine, but since it's in Dropbox I don't really mind leaving the files in their original state. It may be that we catch other bugs that way (e.g. that one I found about reading cnv files larger than 10MB).

dankelley commented 8 years ago

If you have some time today, and can put in some sleiwex files, and if I also add a few more sbe names from the table, then I figure I'd like to merge to develop this evening, pending a clean build in dropbox.

PS. if there's any way you can think of to extract that SBE table into a text file, that would be great because I'd love to have a checklist. As it is, I really don't know what we have in the code yet, and I've seen a few things in the SBE table that look great, and I go to add them, and they are there already! But preview doesn't seem to let me copy the SBE table because of a column break pattern.

richardsc commented 8 years ago

PS. if there's any way you can think of to extract that SBE table into a text file, that would be great because I'd love to have a checklist.

You ask, I deliver. Check out the file SBE_Data_Processing_Output_Variable_Names.csv in the oce-working-notes Dropbox. You can thank tabula for the sweet pdf table export functionality!

dankelley commented 8 years ago

The develop branch (commit 02655bbd0a56608b824cb73c7d0d6da5afbff255) now seems to read the file properly.

Clark, please close this if you think it's OK. Also, I put the file in our two-person Dropbox share test site. If you think it's not ok for the file to be there, please remove it.

PS. I also added finer-scale recognition of the fluorescence type.

richardsc commented 8 years ago

This seems to be fixed. Closing