Closed dankelley closed 8 years ago
I'm shoving this in here too ... the filenames are wrong also in this file. (I know this should be a separate issue, but for this particular file everything seems broken so I'll fix it all at the same time.)
> names(o@data)
[1] " CODE=temperature," " CODE=pressure," " CODE=HCDM," " CODE=HCDT," " CODE=HCSP," " CODE=time,"
The units are fixed in develop
commit 6e587b37541d8d8f472842dd4ae90e256b30721e
Warning. one of the units is crazy. It's for an angle in a current meter, and the unit is given as mmho/cm
, so clearly there can be errors in ODF files.
I have the names working now, in develop
commit b3a0025a4e6310692e87619f8e28e4a89d4d6b13 but I have a question for @richardsc ...
Clark, do you think I should fix up units that are clearly erroneous in ODF files? I am doing that right now, in the above-named commit. It spits out the warning message
Warning message:
In ODFNames2oceNames(ODFnames, ODFunits, PARAMETER_HEADER = NULL, :
odd unit (mmho/cm) for data column named 'directionMagnetic'; setting to degrees
I feel this is the right thing to do, but it does violate an oce convention of obeying what's in the data files. My argument for fixing up the unit is that there must clearly be a user-entry aspect that can get messed up in ODF files, and there seems to be nothing to be gained by retaining something that is very clearly an error. (The code, near line 379 of R/odf.R
, is very isolated to this one error, so I cannot imagine any bleed-through.)
Oh, one more question for @richardsc -- do you like the names I'm using? This is a private file for an Aanderaa CM, and I am using names as follows. My question is whether maybe I should call the true direction as just direction
. I could call the speed as speed
but this is for a 2D instrument and I would prefer to use speed
for a full (3D) scalar speed.
temperature [°C, ITS-90] 3.522 3.6937 3.802 8831 TEMP_01
pressure [dbar] 1163.8 1163.8 1163.8 8831 PRES_01
directionMagnetic [°] 0 73.392 359.69 8831 HCDM_01
directionTrue [°] 0.172 56.938 359.47 8831 HCDT_01
speedHorizontal [m/s] 0.006 0.19572 0.443 8831 HCSP_01
time 2013-07-03 22:00:00 2014-01-03 20:00:00 2014-07-06 20:00:00 8831 SYTM_01
PS to Clark -- I can email you the file if you want to look. But it's definitely private for a while.
@dankelley wrote: Clark, do you think I should fix up units that are clearly erroneous in ODF files?
It's funny, because when I saw your comment about the units being whacky I almost wrote something that said: "Well, our job is to read the data file, not fix other people's errors".
While in general I think that should be true, obviously it's our package so we can do what we want. One question would be if this is a unit error that could occur frequently. In other words, is it better to add a few lines to oce for something that may never occur again, or to add a few lines to the processing script to deal with the issue after oce reads the file.
Probably the point is a bit moot now, because I suspect you've already made the change, and I think you should just leave it. In general, I think I would argue for the latter. Erroneous files are a pain, but I don't think we want to start trying to correct errors that occur during processing. For example, what if the error was not the unit, but the channel name? Then we would correct the unit to match the name, but that would put us in a "two wrongs make the data really wrong" situation.
As for the names, I think those are fine.
I also don't think that we've ever had a proper convention for specifying "magnetic" or "true" headings/directions. I don't mind the *Magnetic
or *True
convention, but another proposal could be that it gets included in the scale
field of the units, e.g.:
d@metadata
.. ..$ units :List of 7
.. .. ..$ heading :List of 2
.. .. .. ..$ unit : expression(degree)
.. .. .. ..$ scale: chr "magnetic"
Of course, then the issue is how to name the fields in the @data
slot.
Oh, wait ... I just realized that of course instrument heading and current direction are not really the same thing. Well, I guess they sort of are for an Aanderraa CM, but not for an ADCP or Aquadopp. I'll edit the previous comment to remove that part ...
I am a bit scared to call both direction
and let the unit scale indicate which is which, because the penalty for using the wrong direction is huge in calculating (u,v). (The problem is that the user would have to check the scale, which is tricky programmatically for novices ... we cannot rely on the True angle being first in the file, or anything.)
So I'd argue for directionMagnetic
and directionTrue
. Actually, I like having the True
there because on the phone if a person says "direction" the other person always has to interrupt and ask which one.
I am now thinking it's best to just warn on that funky unit. It never gets used in calculations, anyhow.
PS. I will make sure that as.cm()
works ok on these data ... probably if that takes more then 3 lines of code I'll make an issue.
Whoa, there is no as.cm()
. I'll add an issue and fix tonight or tomorrow.
Hm. heading
is used in data(cm)
and that data type is several years old, so we may be stuck with that. I don't think it makes sense for Interocean S4 data, though! I looked at the file used for data(cm)
and it is an ascii file that uses the word Dir
for this variable.
The word heading
makes no sense for an S4 (in my opinion) since I really think the word suggests something (a ship, a fared device, etc) that 'heads' into the wind. And the S4 is a sphere, which doesn't do anything like that!
We don't want a lot of inconsistency. The word used for these ODF current-meter data is certainly "direction", and that would argue to save it as e.g. directionTrue
in the object ... but then as.cm()
would switch it to heading
, which seems a bit confusing.
So it seems to be a choice between a bit of confusion, and a change to years-old code. I am leaning to changing that years-old code, because I cannot recall a single user (other than Clark and me) who has used Interocean S4 data, and both Clark and I can easily tweak any of our code, if we have to, for a simple name change. (Note: the plot
method can handle things internally.)
What do you think, Clark ( @richardsc ) ? Alter the existing name in cm
objects (my choice) or just live with what seems the unphysical variable name "heading"?
I agree with you. Heading indicates an instrument direction and not a current direction, so the s4 code should be changed. Aanderaa current meters and s4 current meters should both be consistently stored in cm objects.
This also suggests that the s4 code should distinguish between magnetic and true (if possible).
Good. I will recode for the new name directionTrue
tomorrow sometime. I basically have as.cm()
working but am not going to commit until a new day when my eyes are keen enough to see what I'm doing. As always, thanks!
Actually, as I look at the code I've just written, I think it should be called direction
. We don't say e.g. temperatureCalibrated
or similar. I don't think we need the postfix adjective. I think the magnetic should get a special name (directionMagnetic
), however, because if we have two direction
variables it may cause confusion. So I'll code that tomorrow morning, unless an objection appears here tonight.
I have this working now in develop
commit 8d332972b7912762ecd5cffa02ef16b2b07fafd7
I decided to call the direction variable as direction
, for reasons mentioned in the just-previous comment. This is new, so it could change if someone argues for that.
Closing.
Postscript. This confusion came about because of inconsistencies in ODF files. One file had e.g.
CODE= 'PSAL_01',
but another had e.g.
CODE=TEMP_01,
and that difference (spaces and quote marks) was messing up the decoder. I do not have a lot of confidence that the updated code is going to work for a broad suite of files but I prefer to keep this closed and open new (finer-grained) issues as and if they come up.