Closed cgrdn closed 2 years ago
More relevant info for what is in the masked array:
>>> prof['FLU1'].adjusted.data
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32)
>>> prof['FLU1'].adjusted_qc.data
array([b'', b'', b'', ..., b'', b'', b''], dtype='|S1')
Hi! I'm glad this is getting used!
I'm happy to help! Without a smaller example to copy/paste, my best guess is that I never implemented adjusted stuff in the VMSProfile
:
Thanks @paleolimbot, sorry for the long delay, got busy with some other things.
So if I understand correctly, the key section is in https://github.com/cgrdn/medsrtqc/blob/9d38d1981f18f2686cc4fb3f1fc193178bd31229/medsrtqc/vms/core_impl.py#L125-L138, specifically line 132 because the only field that is being processed/looped through is v.qc
.
Now the field that is being updated (m["Q_PARAM"]
) is field within the VMSProfile
, yes? And if yes, then my question is where can I find out which VMSProfile
fields correspond to Trace.adjusted
and Trace.adjusted_qc
?
If what I have above is right, I'm not sure this will be necessary anymore, but I did write up a minimal working example (you'll need to use my branch to run it):
from medsrtqc.vms import read_vms_profiles
from medsrtqc.resources import resource_path
from medsrtqc.qc.chla import ChlaTest
# load a bgc VMS file (source files from coriolis)
profs = read_vms_profiles(resource_path('bgc_vms.dat'))
prof = profs[0]
test = ChlaTest()
# I edited test.run to return the Trace() for ease here
chla = test.run(prof)
# show trace values where chla.adjusted and chla.adjusted_qc have been updated
print('chla Trace:')
print(chla)
# show that those fields have not been updated in prof
print('"FLU1" Profile field:')
print(prof['FLU1'])
Which for me gives the following (there is more output from test.run()
but these are the relevant bits):
chla Trace:
Trace(
value=[0.2774, 0.2847, 0.2774, [1288 values], 0.0365, 0.0292, 0.0365],
qc=[b'3', b'3', b'3', [1288 values], b'3', b'3', b'3'],
adjusted=[0.67160004, 0.67160004, 0.67160004, [1288 values], 0.00365, 0.0, 0.00365],
adjusted_qc=[b'1', b'1', b'1', [1288 values], b'1', b'1', b'1'],
adjusted_error=[masked, masked, masked, [1288 values], masked, masked, masked],
pres=[-0.1, -1.7014118e+38, -1.7014118e+38, [1288 values], 979.4, 979.4, 979.3],
mtime=[masked, masked, masked, [1288 values], masked, masked, masked]
)
"FLU1" Profile field:
Trace(
value=[0.2774, 0.2847, 0.2774, [1288 values], 0.0365, 0.0292, 0.0365],
qc=[b'3', b'3', b'3', [1288 values], b'3', b'3', b'3'],
adjusted=[masked, masked, masked, [1288 values], masked, masked, masked],
adjusted_qc=[masked, masked, masked, [1288 values], masked, masked, masked],
adjusted_error=[masked, masked, masked, [1288 values], masked, masked, masked],
pres=[-0.1, -1.7014118e+38, -1.7014118e+38, [1288 values], 979.4, 979.4, 979.3],
mtime=[masked, masked, masked, [1288 values], masked, masked, masked]
)
Totally get the 'getting busy with other stuff' bit!
I actually think the best thing here would be for us to meet and we can debug it together since that might help the next time you run into a mistake I made when writing this! I think you have my gmail...Monday is particularly good for me if you can make that work.
Putting some notes from our call here:
And some communication with Anh on the subject:
Hi Anh,
[...]
One of the challenges I described at our last Argo Canada meeting was that I was having trouble getting the CHLA_ADJUSTED and CHLA_ADJUSTED_QC fields saved to a
Profile()
object in the python package and therefore to the VMS .dat file. If I understand correctly, there isn’t really an appropriate place to store the adjusted values and flags in the VMS file. So if that is the case I think there are two options:
- Since I am already calculating CHLA_ADJUSTED and CHLA_ADJUSTED_QC, we can just store those in a temporary text file, and instead of putting them in the VMS file, have a short step in the processing chain to put them in the netCDF file after is has been created. I can write this up.
- Add a field to be VMS file for the adjusted values (“FLUA”? “FLU7”?).
I don’t know how complicated it is to manipulate the structure of the VMS file so I tend to lean towards option 1 but let me know what you think!
Thanks, Chris
For now, I plan to save it to a temporary text file and get it later. If this changes I will re-open the issues.
After speaking w/ Anh today she suggested that since it is a whole profile of new values, we should create a new VMS file variable. We will simply decide on a name for it, and it will already exist by the time it gets to the point in the processing chain. Will need to do some careful updates to the code since a Trace()
object contains an adjusted
field, but we will using the values in the adjusted
field to update the value of a new variable name value
field.
My initial instinct is to do something like this to update the usual value field with the adjusted numbers:
self.update_trace("FLU1", chla)
self.update_trace("FLU7", chla, use_adjusted=True)
Somewhat implemented this in 028cfce. @paleolimbot I am a bit out of my depth in terms of understanding how __setitem__(self, k, v)
(what is updated in the mentioned commit) gets used by update_trace()
, and therefore am unsure how I would pass use_adjusted=True
in the ChlaTest()
code.
My attempt at understanding:
update_trace()
will use the version of __setitem__()
defined in core_impl.py when called. But since this is implicit (not sure if that term is appropriate here, abstract?) I can't necessarily pass use_adjusted=True
in update_trace()
terribly easily. Would implementing something like defining __setitem__(self, k, v, **kwargs)
work here for it to collect anything else update_trace()
has?
I'll try to figure this out on my own but any direction is appreciated!
I hope this helps!
First, I think the term that applies here is "built-in"...the __setitem__
syntax is mapped to item[key] = value
and there's nothing you can do to pass extra parameters there.
Stepping back from __setitem__
, what exactly are you trying to do? I don't think what you actually want to do is pass extra parameters to __setitem__
(you probably want to do something like self["adjusted parameter name"] =
Trace(pres, value = this_trace.adjusted_value, qc = this_trace.adjusted_qc)`, if that makes any sense)
Right that makes sense. There's no requirement to use update_trace()
and therefore __setitem__
. Can just do something like:
self.update_trace('FLU1', chla)
self.profile['FLU7'].value = chla.adjusted
self.profile['FLU7'].qc = chla.adjusted_qc
Thanks Dewey!
Related, this test currently failing in my fork
def test_vms_profile_update_adjusted(self):
# load a bgc VMS file (source files from coriolis)
profs = read.read_vms_profiles(resource_path('bgc_vms.dat'))
prof = profs[0]
chla_trace = prof["FLU1"]
chla_trace.qc[0] = b'5'
chla_trace.adjusted = chla_trace.value
prof["FLU1"].adjusted = chla_trace.adjusted
chla_trace_updated = prof["FLU1"]
self.assertTrue(np.all(chla_trace_updated.adjusted == chla_trace.value))
For now, where there is nowhere to put adjusted updates in the file except for a new variable (which the warning directs you to), I am closing this issue.
This is in the context of running/testing the ChlaTest() component. After running this test script, the
Trace()
variablechla
has filled in values for adjusted and adjusted_qc. However after runningself.update_trace('FLU1', chla)
at the end of the test, these fields remain "empty" (masked arrays) without the new values.This appears to be specific to these fields, as the value and qc fields are both changed properly.
Some relevant output:
chlorophyll BEFORE testing, see log file here for more detail
chlorophyll AFTER testing
@paleolimbot if you have any quick suggestions I would love to hear them!! But also this is not your problem anymore so don't spend too much time on it just thought I would check with you :)