ERDDAP / erddap

ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source (Apache and Apache-like) Java Servlet from NOAA NMFS SWFSC Environmental Research Division (ERD).
Creative Commons Zero v1.0 Universal

Unexpected creation of `subsetVariables` attribute when loading a dataset #180

Open honzaflash opened 1 month ago

honzaflash commented 1 month ago

Describe the bug

When using EDDTableFromNcCFFiles, ERDDAP unexpectedly shows a subsetVariables attribute in the dataset's metadata even though the attribute is neither added in the datasets.xml configuration nor present in the source netCDF file. Furthermore, it seems to use source variable names, which means that if you rename the variables by using a different <destinationName>, the dataset won't load. The GenerateDatasetsXml tool also lists subsetVariables in the commented "sourceAttributes" section, even though it is not an attribute in the source file.

To Reproduce

I will add a separate comment with example files, XML, and instructions.

Expected behavior

The subsetVariables attribute continues to be generated, but using variable destination names. When using the GenerateDatasetsXml scripts, subsetVariables is printed under <addAttributes> and not in the source attributes section. This behavior should also be documented: I did not find any mention of the attribute being generated except for SOS datasets.
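To make the expected behavior concrete, here is a minimal Python sketch of the renaming I have in mind. It is not ERDDAP's code: the datasets.xml fragment, the dataset ID, the variable names, and the helper functions `source_to_destination` and `translate_subset_variables` are all made up for illustration, and the "generated" value is only an assumption about what ERDDAP derives from the source names.

```python
import xml.etree.ElementTree as ET

# Hypothetical datasets.xml fragment; the dataset ID and variable names are invented.
DATASET_XML = """
<dataset type="EDDTableFromNcCFFiles" datasetID="example_dataset" active="true">
    <dataVariable>
        <sourceName>station</sourceName>
        <destinationName>station_id</destinationName>
    </dataVariable>
    <dataVariable>
        <sourceName>lat</sourceName>
        <destinationName>latitude</destinationName>
    </dataVariable>
    <dataVariable>
        <sourceName>time</sourceName>
        <destinationName>time</destinationName>
    </dataVariable>
</dataset>
"""


def source_to_destination(dataset_xml):
    """Map each <sourceName> to its <destinationName> (falls back to the source name)."""
    mapping = {}
    for var in ET.fromstring(dataset_xml).iter("dataVariable"):
        source = var.findtext("sourceName").strip()
        mapping[source] = (var.findtext("destinationName") or source).strip()
    return mapping


def translate_subset_variables(value, mapping):
    """Rewrite a comma-separated subsetVariables value using destination names."""
    names = [n.strip() for n in value.split(",") if n.strip()]
    return ", ".join(mapping.get(n, n) for n in names)


mapping = source_to_destination(DATASET_XML)
generated = "station, lat"  # assumed value built from source variable names
print(translate_subset_variables(generated, mapping))  # station_id, latitude
```

With a mapping like this applied, the generated attribute would refer only to names that exist in the loaded dataset, so renaming variables would no longer break loading.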

Additional context

I have traced the problem for dataset XML generation. I believe the "sourceAttributes" in the output XML come from here:
https://github.com/ERDDAP/erddap/blob/2ef97c8207ab161126f4de419dd2289a0dd9be04/WEB-INF/classes/gov/noaa/pfel/erddap/dataset/EDDTableFromNcCFFiles.java#L467

The Table class is used for the dataSourceTable:
https://github.com/ERDDAP/erddap/blob/2ef97c8207ab161126f4de419dd2289a0dd9be04/WEB-INF/classes/gov/noaa/pfel/erddap/dataset/EDDTableFromNcCFFiles.java#L318

readNcCF sets the global attribute to a value computed from other attributes:
https://github.com/ERDDAP/erddap/blob/2ef97c8207ab161126f4de419dd2289a0dd9be04/WEB-INF/classes/gov/noaa/pfel/coastwatch/pointdata/Table.java#L8301

honzaflash commented 1 month ago

Steps to reproduce

Also, to reproduce with GenerateDatasetsXml:

  1. With the ERDDAP container running, connect to a shell in it using `docker exec -it erddap-container bash`
  2. Navigate to the necessary dir and run the GenerateDatasetsXml.sh script
  3. Choose EDDTableFromNcCFFiles, input the path to the linked nc file, and generate an XML template
  4. See that the output template has subsetVariables in the commented "sourceAttributes" section
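
If it helps with checking step 4, this is a rough Python sketch that looks for subsetVariables in the commented "sourceAttributes" blocks of a generated template versus the <addAttributes> blocks. The template text below is a hypothetical excerpt with invented values, not real output from my file, and the helper `attribute_names` is just something I made up for this check. It scans every matching block, so per-variable blocks would be included too.

```python
import re

# Hypothetical excerpt of GenerateDatasetsXml output; attribute values are invented.
TEMPLATE = """
<dataset type="EDDTableFromNcCFFiles" datasetID="example_dataset" active="true">
    <!-- sourceAttributes>
        <att name="cdm_data_type">TimeSeries</att>
        <att name="subsetVariables">station, lat, lon</att>
    </sourceAttributes -->
    <addAttributes>
        <att name="title">Example</att>
    </addAttributes>
</dataset>
"""

ATT_NAME = re.compile(r'<att name="([^"]+)"')


def attribute_names(template, pattern):
    """Collect att names from every block matched by pattern."""
    names = []
    for block in re.findall(pattern, template, flags=re.DOTALL):
        names.extend(ATT_NAME.findall(block))
    return names


# Commented-out source attributes vs. the attributes ERDDAP is asked to add.
source_atts = attribute_names(TEMPLATE, r"<!-- sourceAttributes>(.*?)</sourceAttributes -->")
added_atts = attribute_names(TEMPLATE, r"<addAttributes>(.*?)</addAttributes>")

print("subsetVariables" in source_atts, "subsetVariables" in added_atts)  # True False
```

The expected behavior described in the first comment would flip that result: absent from the source attributes and present under <addAttributes>.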