ices-publications / SONAR-netCDF4

The SONAR-netCDF4 convention for sonar data

Provide reference datasets #47

Open gavinmacaulay opened 2 years ago

gavinmacaulay commented 2 years ago

We need reference datasets for validation of code that reads and writes files in the convention format. Preferably files that are small enough to store in this repository (if not, we can arrange an online place to store the larger files). Files should be accompanied by a small description of their contents. Any files provided should have no distribution restrictions on them.

(Suggested by @lberger29)

lberger29 commented 2 years ago

A reference dataset for EK60 and ME70 data is available at https://www.seanoe.org/data/00475/58652/

geoffmatt commented 2 years ago

This is a good idea, but some care should be taken here. a) It would be good to have a reference for what the data should look like when loaded, possibly in another format such as a Simrad one, so that the data can be loaded from both formats and compared. b) @lberger29 - Laurent, I'm not sure your dataset is compliant. The backscatter_r and backscatter_i variables are 3-dimensional in those files, whereas the CRR specifies that they should be 2D. Because of this, Echoview will not read these files at the moment.
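To illustrate point a), here is a minimal sketch of the comparison step, assuming both files have already been loaded into nested lists of per-ping sample values. The function name `samples_match`, the tolerances, and the sample values are hypothetical and only for illustration; the actual loading of Simrad raw or SONAR-netCDF4 files is out of scope here.

```python
import math


def samples_match(reference, candidate, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if two nested lists of per-ping samples agree within tolerance.

    `reference` and `candidate` are lists of pings; each ping is a list of
    floats. Pings may have different lengths from one another, but the two
    datasets must have the same number of pings and samples per ping.
    """
    if len(reference) != len(candidate):
        return False
    for ref_ping, cand_ping in zip(reference, candidate):
        if len(ref_ping) != len(cand_ping):
            return False
        for ref_value, cand_value in zip(ref_ping, cand_ping):
            if not math.isclose(ref_value, cand_value,
                                rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


# Made-up values standing in for data loaded from the two formats.
from_raw = [[-70.1, -65.3, -80.0], [-71.0, -66.2]]
from_netcdf = [[-70.1, -65.3, -80.0], [-71.0, -66.2]]
print(samples_match(from_raw, from_netcdf))
```

A tolerance-based comparison is used rather than exact equality because the two formats may store samples with different numeric types or scaling.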

lberger29 commented 2 years ago

a) The corresponding Simrad proprietary raw files have been made available at the same location as the reference dataset (https://doi.org/10.17882/58652). b) In the current version of the convention, the subbeam dimension is optional and allows the raw data of each subarray of a split-beam transducer to be stored. The dataset provided is nevertheless compliant, with a subbeam dimension equal to one, and this dimension is needed for EK80 SONAR-netCDF4 data files with complex samples.
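As a rough sketch of how a reader that only supports 2D backscatter variables might treat such files, the function below classifies a variable from its dimension names and shape. The dimension names `ping_time`, `beam`, and `subbeam`, their order, and the function name `reader_view` are assumptions made for illustration; they should be checked against the version of the convention the file was written for.

```python
def reader_view(dim_names, shape):
    """Classify a backscatter variable's layout for a 2D-only reader.

    Returns:
      "2d"          if the variable already has two dimensions,
      "squeezable"  if it has a third dimension named "subbeam" of length one,
      "unsupported" otherwise.
    """
    if len(dim_names) != len(shape):
        raise ValueError("dim_names and shape must have the same length")
    if len(shape) == 2:
        return "2d"
    if len(shape) == 3 and dim_names[2] == "subbeam" and shape[2] == 1:
        return "squeezable"
    return "unsupported"


# Hypothetical shapes: 100 pings, 1 beam, with and without a subbeam axis.
print(reader_view(("ping_time", "beam"), (100, 1)))
print(reader_view(("ping_time", "beam", "subbeam"), (100, 1, 1)))
print(reader_view(("ping_time", "beam", "subbeam"), (100, 1, 4)))
```

With a subbeam length of one, the third axis carries no extra information, so a 2D reader could in principle drop it; with more than one subbeam it cannot.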

geoffmatt commented 2 years ago

a) Yep, my mistake, I can see the Simrad files in that location, sorry about that. b) I know I'm sounding pedantic, because the subbeams have been added to the development version of the format, but they are not in the currently published version. Until V2.0 is published we cannot consider it the "current version", because we cannot code support for a format which is changing on a daily basis. This is a very frustrating problem for us at Echoview. If these files are to be considered "reference", they need to be associated with a specific version of the format; otherwise they cannot be reliably used as a reference. Seeing as they have been created against the current iteration of V2, they should be associated with an exact version, e.g. 2.0.13. This is important because V2 is likely to change again before publication, in which case these files may not be compliant with that newer V2.
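A minimal sketch of what pinning to an exact version could look like in a validation script is given below. How the version is stored in a file is not covered in this thread, so the version strings are passed in directly; the function names are hypothetical.

```python
def parse_version(text):
    """Split a dotted version string such as "2.0.13" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_exact_match(file_version, reference_version):
    """True only when every component of the two versions is identical."""
    return parse_version(file_version) == parse_version(reference_version)


# A file labelled only "2.0" does not identify the same format as "2.0.13".
print(is_exact_match("2.0.13", "2.0.13"))
print(is_exact_match("2.0", "2.0.13"))
```

Comparing the full tuple, rather than only the major and minor parts, reflects the point that a development iteration of V2 may differ from the V2 that is eventually published.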

lberger29 commented 2 years ago

I agree, Geoffroy, that we need to publish a version 2.0 of the convention. I want to add some additional variables for properly handling filters and pulse shape in the beam group, but this can go in version 2.1. @gavinmacaulay Gavin, do you think that we are able to publish a version 2.0 of the online convention?

cyrilleponcelet commented 2 years ago

I agree, we should move to a V2.0 release as soon as possible. I would prefer to go through some kind of final review process, planning something like:

This could lead to a final release sometime in December.

gavinmacaulay commented 2 years ago

I agree with Cyrille's plan and suggest we turn it into a procedure document that we follow for this and future releases. I can make a PR for this.

gavinmacaulay commented 2 years ago

Please look at PR #59