OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0
95 stars 73 forks source link

Varying variables in `Sonar` group amongst sensors #681

Closed b-reyes closed 1 year ago

b-reyes commented 2 years ago

While reviewing the Sonar group across the sensors, I noticed that there is a slight inconsistency in what is an attribute and what is a variable. For EK80 we have the following variables in the Sonar group, which all have the dimension (channel)

I know that for EK80 it absolutely makes sense to make these variables as they may vary amongst the channels.

For EK60 we only have beam_group_descr as a variable, however, we do have the attributes sonar_model, sonar_serial_number, sonar_software_name, sonar_software_version, which are variables in EK80.

For AZFP we also only have beam_group_descr as a variable, but we do have the attributes sonar_model, sonar_serial_number, sonar_software_name, sonar_software_version, which are variables in EK80.

I was curious if there was a specific reason why these were left as attributes or if this was simply something that was over looked. If we want to bring consistency to these variables, then I suggest that we make these attributes into variables with the dimension channel and add the variable frequency_nominal to the Sonar group (for EK60 and AZFP).

Note that I am not considering AD2CP here.

emiliom commented 2 years ago

Quick reply:

b-reyes commented 2 years ago

Quick reply:

  • All the sonar_* attributes are defined as such by the convention. For EK80, based on what I've seen, I assume @leewujung found it necessary to change them to a variable in order to support different entries per channel (likely a need not anticipated in the convention vers. 1). Changing the attributes on the other instruments to variables would bring consistency across echopype at the expense of adherence to the convention.

I overlooked the fact that these attributes may be in the convention. @emiliom thank you for brining this to my attention. Since we have already "broken" the convention for EK80, I don't really see convention adherence as a strong argument for not making these attributes into variables for EK60 and AZFP. I am now starting to lean towards the side of making these attributes into variables (so that we can have some consistency across sensors). @emiliom and @leewujung what do you think?

  • beam_group_descr was introduced by us and is not from the convention. It's currently implemented identically across instruments.

Yes, I am aware that this was introduced by us. I just wanted to put it here so that I provided everyone with a full picture of what variables were present at the moment.

leewujung commented 2 years ago

Related to #571

emiliom commented 2 years ago

I'll focus on sonar_model first. It brings an extra complexity/confusion to the table that I'd like to address first, because there is both the convention attribute sonar_model and our echopype use of sonar_model as a very important argument across the board which is also present as the EchoData property sonar_model. That sonar_model echodata property is used extensively in echopype (eg, in combine), and its origin is either user-specified via an open_raw argument or implicitly generated via heuristics.

The assignment of the Sonar.sonar_model attribute (or variable, for EK80) is done completely independently of that echodata property. For AZFP and AD2CP, a fixed string is assigned ("Acoustic Zooplankton Fish Profiler" and "AD2CP", respectively). For EK60 and EK80 it's extracted from the raw datagrams, and the outcome is all over the place. For EK60, it's "ER60" (as noted in #571). For the EK80 files I've tested, the vector of values never contains the string EK80, and the datagram variable it uses is called "transducer_name"; here's one example: ['ES70-7C' 'ES120-7C' 'ES200-7C'].

So: A raw file is characterized by a single sonar_model. That's how echopype operates. Currently that driving value is in the echodata sonar_model property. The Sonar.sonar_model attribute or variable are never used after being assigned. There are ramifications to the occurrence of these parallel sonar_model values. One is what happens when using open_converted. The resulting echodata object can only know its sonar_model via the sonar_model attribute/variable in the converted file. For AZFP, EK60 and EK80, none of those will correspond to our canonical sonar_model codes (from core.py: SonarModelsHint = Literal["AZFP", "EK60", "ES70", "EK80", "ES80", "EA640", "AD2CP"]). But this inconsistency could bite us in other, unforeseen ways.

My recommendation:

Making these changes won't have any negative impacts right now, precisely b/c the Sonar.sonar_model attribute/variable are not used. They're just metadata that are assigned then ignored. One likely impact, though, is on @b-reyes' 0.5x to 0.6x conversion code and tests in PR #685.

emiliom commented 2 years ago

All other Sonar variables/attributes @b-reyes listed that are based on the convention are also assigned then ignored. That means that there are few, if any downstream consequences if we make changes, or in the current inconsistency with EK80 setting them up as variables. The exception I'm seeing is in the v0.5x to v0.6x conversion code and tests (PR #685).

(update: removed statements about the missing converted test files and Provenance.conversion_software_version reading, and moved them to PR #685)

emiliom commented 2 years ago

Decisions for v0.6.0:

leewujung commented 2 years ago

Notes while reviewing #685 as a reminder: Under v05x_to_v06x._modify_sonar_group we fill NaN to missing data. For cases like this, may be a good idea to have an overarching statement in the docs to say that some variables that were not encoded in v0.5.x files are filled with NaN when converted to v0.6.x files.

emiliom commented 1 year ago

In https://github.com/OSOceanAcoustics/echopype/issues/681#issuecomment-1132018034 above, last May, I laid out decisions made at that time, for v0.6.0. I'm listing them here, with comments on the status.

Proceed with the changes to sonar_model as described above

Done in PR #692 for 0.6.0

Rename EK80 sonar_model Sonar variable to transducer_name

Done in PR #692 for 0.6.0

No other changes will be done at this time. We will leave the issue open for ongoing discussions about the larger variable vs attribute consistency issue

I'll follow up in another comment.

(Note: sonar_manufacturer and sonar_type were already in good shape for all the sonar models. These attributes are not part of the scope of this issue, but I'm mentioning them here for completeness)

emiliom commented 1 year ago

Here I'm listing sample values from EK60, ES70 and AZFP for sonar_serial_number, sonar_software_name and sonar_software_version as Sonar global attributes.

EK60

ES70

AZFP

emiliom commented 1 year ago

For EK80, the analogous information is encoded as variables. I examined 5 EK80 converted raw files from our test data. Here are some samples and tentative comments.

EK80

ek80/D20170912-T234910.raw

Variables that correspond to convention global attributes:

Additional, custom variables, which don't correspond to a convention global attribute:

ek80/D20190822-T161221.raw

Variables that correspond to convention global attributes:

Additional, custom variables, which don't correspond to a convention global attribute:

Observations

emiliom commented 1 year ago

Following up on my last comments, above. Specifically, for EK80:

This suggests that we should change them back to global attributes instead, as specified in the convention. @leewujung can you foresee a situation where sonar_software_name and sonar_software_version would vary across channels?

@leewujung , we spoke at one point and I remember you agreeing that we can change these variables to be attributes. So, I can implement this in the upcoming release, I think.

leewujung commented 1 year ago

Yes, I think that these variables can just be attributes since they are unlikely to vary by channel -- I don't know if there's an option to do so even! If you could implement it in this release (we are talking about v0.6.4, right?) that'll be great!

emiliom commented 1 year ago

@leewujung I have a batch of small questions for you:

leewujung commented 1 year ago

Please see my review for #992.

For the first question about ES70: I am not sure, but since they share the same datagram structure, that is likely the case.

emiliom commented 1 year ago

I think we can close this now. Any objections?

leewujung commented 1 year ago

Agreed!