kratzert / Caravan

A global community dataset for large-sample hydrology
BSD 3-Clause "New" or "Revised" License

Missing attributes in many netCDF files #33

Open BSchilperoort opened 2 months ago

BSchilperoort commented 2 months ago

In the latest release of Caravan, around 6000 of the netCDF files are missing their dataset-level attributes, and they have no variable-level attributes either.

The affected files are listed in the attachment: missing_attrs.txt

kratzert commented 2 months ago

Given the number, it seems to be the netCDF files of the original Caravan dataset. Interesting, probably happened when I extended them to the longer periods. And with attributes you mean the info I included about timezone and units?

BSchilperoort commented 2 months ago

> Interesting, probably happened when I extended them to the longer periods

That can happen; not every operation in every tool preserves the attributes.

> And with attributes you mean the info I included about timezone and units?

Yep, you can view the attributes with ncdump -h filename.

A valid file has the following:

// global attributes:
        :Units = "snow_depth_water_equivalent: ERA5-Land Snow-Water-Equivalent [mm]\n",
            "surface_net_solar_radiation: Surface net solar radiation [W/m2]\n",
            "surface_net_thermal_radiation: Surface net thermal radiation [W/m2]\n",
            "surface_pressure: Surface pressure [kPa]\n",
            "temperature_2m: 2m air temperature [°C]\n",
            "u_component_of_wind_10m: U-component of wind at 10m [m/s]\n",
            "v_component_of_wind_10m: V-component of wind at 10m [m/s]\n",
            "volumetric_soil_water_layer_1: ERA5-Land volumetric soil water layer 1 (0-7cm) [m3/m3]\n",
            "volumetric_soil_water_layer_2: ERA5-Land volumetric soil water layer 2 (7-28cm) [m3/m3]\n",
            "volumetric_soil_water_layer_3: ERA5-Land volumetric soil water layer 3 (28-100cm) [m3/m3]\n",
            "volumetric_soil_water_layer_4: ERA5-Land volumetric soil water layer 4 (100-289cm) [m3/m3]\n",
            "total_precipitation: Total precipitation [mm]\n",
            "potential_evaporation: ERA5-Land Potential Evapotranspiration [mm]\n",
            "streamflow: Observed streamflow [mm/d]\n",
            "" ;
        :Timezone = "America/Eirunepe" ;
        :Source = "All forcing and state variables are derived from ERA5-Land hourly by ECMWF. Streamflow data was taken from the CAMELS-BR Dataset by Chagas et al. (2020)." ;
}

This part is missing from the files listed in the .txt file I shared.
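For anyone who wants to reproduce the check, here is a minimal sketch of how it could be scripted. It is not the script used to build missing_attrs.txt. It assumes the `ncdump` command-line tool is on the PATH and that a complete file carries the three global attributes shown in the header above (Units, Timezone, Source). The function names are made up for illustration.

```python
import re
import subprocess

# Attribute names taken from the valid header shown above (an assumption
# about what "complete" means, not an official Caravan specification).
EXPECTED_ATTRS = ("Units", "Timezone", "Source")


def global_attrs_in_header(header):
    """Return the set of global attribute names found in `ncdump -h` text."""
    marker = "// global attributes:"
    _, found, tail = header.partition(marker)
    if not found:
        # ncdump omits the section entirely when there are no global attributes.
        return set()
    # Global attributes are printed as lines of the form "    :Name = value ;".
    # Continuation lines of a long string value start with a quote, not a colon.
    return set(re.findall(r"^\s*:(\w+)\s*=", tail, flags=re.MULTILINE))


def missing_attrs(header, expected=EXPECTED_ATTRS):
    """List the expected global attributes that are absent from the header."""
    present = global_attrs_in_header(header)
    return [name for name in expected if name not in present]


def read_header(path):
    """Run `ncdump -h` on a file and return the header text."""
    result = subprocess.run(
        ["ncdump", "-h", path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `missing_attrs(read_header(path))` for each netCDF file would then return an empty list for complete files and the names of the absent attributes otherwise. This only covers dataset-level attributes; variable-level attributes would need a separate check.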