Closed emiliom closed 2 years ago
Adding some specifics to this:
Many variables are not specified in the convention. I've been putting them in the Beam
and Vendor
group somewhat arbitrarily depending on their dimension and how central it feels to be bundled together with backscatter measurements. For example, split-beam alongship and athwartship angles are in Beam
group because they obviously should go with backscatter measurements, but the filter coefficients needed for processing broadband data are in the Vendor
group. It'd be nice to review and make decisions on these.
Echopype conversion currently aims at preserving all info from the raw file. This conflicts with SONAR-netCDF4 in places when the raw recording requires some transformation to have physically meaningful units. For example, EK60 raw backscatter and angle data need to be scaled by specific constants to be the real "power" and "angle." This is likely a larger question related to the convention, but I think we should have a clear strategy on this and put it in the documentation.
GPS data from EK60/EK80 sometimes include different NMEA sentences. The spatial resolution appears to be different. Currently the default is to only save the GGA
sentences unless others are specified by the user. @emiliom @lsetiawan since you've dealt more with spatial data perhaps you have some insights on this.
Notes from call between @emiliom and @leewujung on 2021-1-20:
We've laid out the sequence of tasks as follows:
Define a reasonable end point / goals where we want to be by the time we submit the echopype paper.
Also refer to the "data format" section in the docs: https://doc-test-echopype.readthedocs.io/en/latest/data-format.html
For AZFP:
tilt_x
and tilt_y
in the Beam group that are derived from the raw data, and also have the raw sensor record in both the Beam and the Vendor grouptemperature
in the Environment group that is derived from the raw data, but we have the raw sensor record in the Vendor group.In the above cases the derived data are computed from raw sensor readings (in Vendor/Ancillary
) and a few sensor parameters.
Questions to consider:
For EK60:
angle_alongship
and angle_athwartship
in the Beam group are currently the electrical angles recorded by the sensorbackscatter_r
in the Beam group is currently the "power" from the raw dataIn both cases these values need to be further converted to physically meaningful quantities.
Things to consider:
backscatter_r
so it seems fine that we use encode power into the converted file, and converting it to Sv during calibration@leewujung can you point to netcdf files generated by echopype that you'd consider "exemplary" with respect to the extent and quality of metadata included? Ideally one for EK60 and one for AZFP, but having just the EK60 one would be fine for now. Don't send me two for a given instrument type! In this case, two is not better than one. If the file is on the repo, just include the link here; if on Google Drive, the directory hierarchy and file name. Thanks.
Here you go, in Google Drive:
Perfect, thanks!
For echopype/convention_check/DY1801_EK60-D20180211-T164025.nc
:
Sonar/sonar_model
attribute entry is "ER60". I assume that's a mistake, and it should be "EK60"?Provenance
group. Specifically, is this from OOI, a ship (Hake survey?), or something else? Platform
group global attributes are not presentThe Beam
group in DY1801_EK60-D20180211-T164025.nc
has 30 data variables, but in 17082117.nc
it only has 18 and several of those don't seem to be from the convention. On the other hand, 17082117.nc
adds many global attributes (17) beyond the 2 defined in the convention and found in DY1801_EK60-D20180211-T164025.nc
; except, the mandatory beam_mode
attribute is empty.
Is 17082117.nc
from OOI? As with the other nc file, Platform
global attribute entries are absent (more specifically, the attributes are present but empty, whereas they're absent altogether in the EK60 file).
For
echopype/convention_check/DY1801_EK60-D20180211-T164025.nc
:
Sonar/sonar_model
attribute entry is "ER60". I assume that's a mistake, and it should be "EK60"?
That is unexpected! ER60 is the software that runs EK60, see here.
From the spec sheet @ngkavin made the ER60 is in Sonar/sonar_software_version
. Though that's also only partially correct, because it should be with the version number also.
TODO: Let's mark this version number to be something to check.
- There is no indication I can find about the origin of the data, other than the raw file path in the
Provenance
group. Specifically, is this from OOI, a ship (Hake survey?), or something else?Platform
group global attributes are not present
Based on convention v1.0 those should be in the Platform group global attributes but they are also optional. These attributes are what users can set when they convert the data. That particular method is working but is also one of what is left to be double checked for completing the convert overhaul. It is surprising that the attributes are absent altogether here compared to AZFP.
TODO: add global attributes to Platform
group in EK60
TODO: the steps to set the platform group should be included in the docs.
The
Beam
group inDY1801_EK60-D20180211-T164025.nc
has 30 data variables, but in17082117.nc
it only has 18 and several of those don't seem to be from the convention. On the other hand,17082117.nc
adds many global attributes (17) beyond the 2 defined in the convention and found inDY1801_EK60-D20180211-T164025.nc
; except, the mandatorybeam_mode
attribute is empty.
Yes. I made some decisions on which variables to add to the Beam
group when trying to force the AZFP format into SONAR-netCDF4. The AZFP Beam
group lacks some of those in the EK60 data because it is a single beam instrument, and there are also simply variables that do not have correspondence in the AZFP file, and vice versa. In my view these are the things we can push for to improve the convention for wider usage.
TODO: add beam_mode
to AZFP Beam
group.
Is
17082117.nc
from OOI? As with the other nc file,Platform
global attribute entries are absent (more specifically, the attributes are present but empty, whereas they're absent altogether in the EK60 file).
Yes this file is from OOI. See the previous comment re. setting these values.
@emiliom : thanks for going through the convention and doing the checks. I think we can compile a list of things to change and questions to be discussed (either for ourselves or with the ICES group) from these back and forth we're having here, and take the FAST meeting opportunity to link up with the ICES group, and ask @ngkavin to help change the code.
Thanks for the follow ups! I'll digest them hopefully on Monday.
I've compiled the CDL-like tables of attributes, dimensions, coordinate variables, variables, types and subgroups, per group, into a spreadsheet that I've loaded into Google Drive. This is directly from the tables in section 2.10 ("Groups"). Since copying and pasting from the pdf into a true tabular format is very fragile, I went to the first GitHub version (first commit) of the SONAR-netCDF4 document in asciidoc format and pulled out the specific files corresponding to each of the Groups (e.g., tableSonar.adoc
). That first version from May 2019 is hopefully version 1.0, or very close (I'll double check). I then transformed the files a bit to be able to paste them into an Excel spreadsheet.
These spreadsheets are now in a form that I can compare directly to Kavin's instrument-specific tables, to do the actual 1:1 compliance assessments for each variable, attribute, etc.
For echopype/convention_check/DY1801_EK60-D20180211-T164025.nc:
- Sonar/sonar_model attribute entry is "ER60". I assume that's a mistake, and it should be "EK60"?
That is unexpected! ER60 is the software that runs EK60, see here.
Currently EK60 Sonar/sonar_model
is set in convert/set_groups_ek60.py#L87 as self.parser_obj.config_datagram['sounder_name']
, so that looks fine, unless the datagram sounder_name
is parsed incorrectly.
TODO: Verify that the datagram parsing that populates parser_obj.config_datagram['sounder_name']
is correct. If it currently is leading to a value of "ER60", that's incorrect.
From the spec sheet @ngkavin made the ER60 is in Sonar/sonar_software_version. Though that's also only partially correct, because it should be with the version number also.
TODO: Let's mark this version number to be something to check.
Actually, per the sample nc file, sonar_software_version
is being assigned correctly. It's the spec sheet that seems wrong.
TODO: Populate Sonar/sonar_software_name
.
Right now it's left blank; see convert/set_groups_ek60.py#L89
Should we create a new issue for these EK60 Sonar
global attributes TODOs?
Based on convention v1.0 those should be in the Platform group global attributes but they are also optional.
True, they're optional. But what we're evaluating here is software compliance against all elements of the convention, including recommended and optional ones. Ideally, the echopype-generated "exemplary" nc files should have all elements populated, including optional ones, to enable this compliance assessment.
These attributes are what users can set when they convert the data. That particular method is working but is also one of what is left to be double checked for completing the convert overhaul. It is surprising that the attributes are absent altogether here compared to AZFP.
TODO: add global attributes to
Platform
group in EK60
Actually, since those global attributes are defined by the conventional as optional, it's unclear if they should be absent when not populated, vs present but populated with a blank value. I'll have to look more closely at the convention doc. Regardless, the behavior needs to be identical across echosounder instruments.
TODO: the steps to set the platform group should be included in the docs.
Done! You've had that all along, and I just re-organized and tweaked things a bit.
Regarding this:
TODO: add
beam_mode
to AZFP Beam group.
Looks like we also need to add beam_type
From @lsetiawan in #283:
Recently, I came across this: https://github.com/ices-publications/SONAR-netCDF4/blob/master/docs/convention_definition.ncml, which is a data convention definition for SONAR-netCDF4. It would be great for echopype to be able to read this and check against it and get the proper attributes/variables in creating a standardized file.
Great! And sigh :weary: . This weekend I spent a good chunk of time creating an analog of this ncml file by transforming the convention's asciidoc tables in the same directory. See my comment above, https://github.com/OSOceanAcoustics/echopype/issues/210#issuecomment-822196932. I didn't see this ncml file, and that would probably have saved me effort. My goal is to use it for echopype compliance assessment, though initially it's a manual exercise. Anyway, great to have it.
@emiliom Any suggestion on tools to read this ncml file into code?
Brief answer: I did a quick check, and it looks like nothing exists right now (the more relevant packages are old, deprecated ones linked to IOOS). Longer answer: It looks like it's been on the xarray radar (see https://github.com/pydata/xarray/issues/2697), but nothing exists AFAIK. Still, I don't think we're in a position to use such a feature, and my approach and recommendation is that compliance checking be done externally on files generated by echopype. The need for compliance checking tools is a broader SONAR-netCDF4 topic, and there has been some discussion about it at the SONAR-netCDF4 repo. There is also the reality that echopype does not aspire to adhere exactly to the SONAR-netCDF4 v1.0 spec, so we'd have to use a different, echopype-tuned ncml.
I guess you do belong in the sonar-conventions
Slack channel after all! :stuck_out_tongue_winking_eye:
Thanks for all of those explanation and your work trying to create an analog of the ncml file! You're such a convention expert! :smile:
My first attempt to handling these conventions is creating a yaml file that somewhat describe the convention in https://raw.githubusercontent.com/OSOceanAcoustics/echopype/e445cda7cec8ea703ec6bcd6205186ee9a463c82/echopype/echodata/convention/1.0.yml. I am currently using that file to dynamically create the attributes in EchoData.
My thoughts on this is possibly using pydantic in the future to describe the standardized EchoData attributes that somewhat adhere to the spec. This would actually help with the validating with minimal effort. But I'm still trying to wrap my head around how to integrate this.
I guess you do belong in the sonar-conventions Slack channel after all!
Haha I am aspiring to be as good with conventions/standards as much as you! :smiley: After all you were the one who trained me in the beginning :stuck_out_tongue:
From PR #294, regarding calibration and environment parameters used in calibrate.compute_Sv/S
:
Right now we are not following any specific convention for the calibrated dataset, so the cal and env parameters are encoded in the form used in echopype.
@emiliom: tagging you here as this is relevant to the convention/processing level discussions going forward.
Following up on my pydantic ideas. I have started echopydantic, where I think we can really define the convention as python data models. https://github.com/lsetiawan/echopydantic
You can see it in action on this development notebook: https://github.com/lsetiawan/echopydantic/blob/main/notebooks/Development.ipynb
Still have to think about the implementation of this, regarding of how this mixes with xarray objects, etc.
@lsetiawan it's great you're doing that R&D! I don't have time this week to look at your stuff, though. I think your work is somewhat different from the focus of this issue (yeah, I know: saying this issue has a "focus" is a bit laughable). This issue is about assessing current convention compliance and identifying individual desirable/required fixes. Your work with pydantic instead is about developing an overarching implementation approach that has the potential to minimize (but not eliminate!) compliance failures from the get go. If you agree, could you open a new issue and copy and paste into it the relavant comments in this issue?
@leewujung in the Beam
subgroup, the convention defines the mandatory beam_direction_x|y|z
variables (x|y|z) as having ping_time
and beam
dimensions, where beam
is effectively frequency. But in the echopype implementation for EK60, the variables do not have ping_time
as a dimension. Why is that?
The 3 variables are completely absent in the AZFP exemplar file, but there are so many implementation issues with AZFP's that I'll just set it aside for now.
@leewujung in the
Beam
subgroup, the convention defines the mandatorybeam_direction_x|y|z
variables (x|y|z) as havingping_time
andbeam
dimensions, wherebeam
is effectively frequency. But in the echopype implementation for EK60, the variables do not haveping_time
as a dimension. Why is that?
In EK60 these beam direction parameters are parsed from the configuration datagram, and therefore does not change within each file (i.e., the config is identical within each file), hence there is no ping_time
dimension. Now that you brought this up, I think we can add a ping_time
dimension with a single timestamp from the configuration datagram (this is the same approach as what we did for the EK80 environment datagram -- but I know you don't want to look at EK80 just yet!).
The 3 variables are completely absent in the AZFP exemplar file, but there are so many implementation issues with AZFP's that I'll just set it aside for now.
AZFP data file does not contain these parameters in itself, so I guess in a sense these parameters are more like "metadata" for AZFP data.
[EK60] Now that you brought this up, I think we can add a
ping_time
dimension with a single timestamp from the configuration datagram (this is the same approach as what we did for the EK80 environment datagram -- but I know you don't want to look at EK80 just yet!).
(flagging as a TODO) So what you mean is adding a new ping_time
dimension and coordinate variable separate from the existing ping_time
, right? While that's not compliant strictly speaking, I think it's way better, and 100% adherence to the convention would mean a huge and unnecessary bloat of repeated values! Sounds like another area where the convention is sub-optimal, at least for echosounders.
AZFP data file does not contain these parameters in itself, so I guess in a sense these parameters are more like "metadata" for AZFP data.
Got it. Are they in the XML file? Even if they're not in the data or XML files, I assume this is information that would be important to have and is not otherwise derivable from the existing (meta)data in the echopype-generated AZFP file, right?
AZFP data file does not contain these parameters in itself, so I guess in a sense these parameters are more like "metadata" for AZFP data.
Got it. Are they in the XML file?
I don't think they are in the XML files.
Even if they're not in the data or XML files, I assume this is information that would be important to have and is not otherwise derivable from the existing (meta)data in the echopype-generated AZFP file, right?
Yes. These are related to the tilt parameters that we discussed last time. Perhaps we should rethink if the tilt params should be in the Platform
. I think it depends on what we call as the platform. In the typical situation when an AZFP is moored, with a cage that is vertically inline with the mooring but the transducers are actually mounted with a 15 deg angle from vertical, i guess the tilt params have better correspondence to beam_direction_x|y|z
here, though it requires a transformation with strict definition of the origin and axis orientation of the "platform" (the mooring cage) to get to the beam_direction_x|y|z
values -- and by doing the transformation, I think then these values are derived values already...
Since we've been using this to document changes needed wrt the convention, I just found another one:
In the Top-level group, for some reason we have an attribute survey_name
that is not present in the convention. I don't remember why I added this in the beginning, but I don't really see it as necessary if the title
and summary
attributes are properly populated.
Yes! That was on my list of things to ask about.
We considered things like survey_name for SONAR-netCDF4, but in the end stayed with the CF conventions for a more generic set of metadata and also the fact that research metadata like survey name, etc, is covered by the AcMeta convention (which can be inserted into or associated with SONAR-netCDF4 files if desired).
Thanks @gavinmacaulay !
In going through #409 I found some inconsistency in how echopype stores info in the Sonar
group due to the added flexibility of EK80 -- my memory has faded but it was probably that I made arbitrary decision to store things this way. We should definitely include this in our convention check. 🙂
NOTE: I have yet to tackle the WBT+GPT combination in one of @gavinmacaulay 's ES80 test files es80/WBT-GPT-D20181216-T111107.raw
Parsing test_data/ek60/DY1801_EK60-D20180211-T164025.raw
I got:
Parsing echopype/test_data/es70/D20151202-T020259.raw
I got:
Parsing echopype/test_data/ek80/D20190822-T161221.raw
I got:
Parsing echopype/test_data/es80/WBT-D20210620-T012250.raw
I got:
This reminds me that my initial decision to set aside EK80 and focus on AZFP and EK60 has inadvertently persisted for way too long! Plus, now I/we need to add AD2CP (and others?) to the mix.
Ha... I think adding in EK80 in the check would be valuable because EK60 has on its way to be obsolete, but we could probably still set aside AD2CP for the moment (though selfishly I'd really want to have that in!).
I think we should close this sprawling issue. Much of its content has been captured in https://github.com/uw-echospace/standardization/blob/main/echopype_sonarnetcdf4_compliance.md. We can move other content from this issue to that document, or to new documents and issues on https://github.com/uw-echospace/standardization, as needed. echopype_sonarnetcdf4_compliance.md will retain a link to this issue to help us dig out additional details still present here.
We'll open specific, narrow issues as needed.
Sounds great!
Great. I'll close the issue now, and we'll just keep mining it as needed. There are lots of important details here that we'll refer back to.
Let's go through the current implementation of SONAR-netCDF4 convention. We can maintain a markdown table to manually document what is currently covered, what sections of the convention it corresponds to, confirmation that it's properly implemented, etc
SONAR-netCDF4 convention: