Closed by pdurbin 1 year ago
@atrisovic and I just found a nice example that shows a specific bounding box:
<attribute name="geospatial_lat_min" value="25.066666666666666" />
<attribute name="geospatial_lat_max" value="49.40000000000000" />
<attribute name="geospatial_lon_min" value="-124.7666666333333" />
<attribute name="geospatial_lon_max" value="-67.058333300000015" />
This is from https://www.northwestknowledge.net/metdata/data/bi_2023.nc and is currently published at https://dev1.dataverse.org/file.xhtml?fileId=30&version=1.0
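For anyone who wants to poke at these values outside of Dataverse, here is a minimal sketch that reads the four attributes above into a map. The class name `BoundingBoxAttributes`, the `parse` helper, and the wrapping `<netcdf>` root element are assumptions made for illustration; only the four `attribute` lines come from the file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class BoundingBoxAttributes {

    // The four attributes quoted above, wrapped in an assumed <netcdf> root
    // so that the snippet is a well-formed XML document.
    static final String SAMPLE = "<netcdf>"
            + "<attribute name=\"geospatial_lat_min\" value=\"25.066666666666666\" />"
            + "<attribute name=\"geospatial_lat_max\" value=\"49.40000000000000\" />"
            + "<attribute name=\"geospatial_lon_min\" value=\"-124.7666666333333\" />"
            + "<attribute name=\"geospatial_lon_max\" value=\"-67.058333300000015\" />"
            + "</netcdf>";

    // Hypothetical helper: collects every geospatial_* attribute as a double.
    static Map<String, Double> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("attribute");
            Map<String, Double> box = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                String name = element.getAttribute("name");
                if (name.startsWith("geospatial_")) {
                    box.put(name, Double.parseDouble(element.getAttribute("value")));
                }
            }
            return box;
        } catch (Exception e) {
            // Malformed XML or a non-numeric value ends up here.
            throw new IllegalArgumentException("Could not read attributes", e);
        }
    }

    public static void main(String[] args) {
        // Prints the attribute names and values in document order.
        System.out.println(parse(SAMPLE));
    }
}
```

A `LinkedHashMap` keeps the attributes in the order they appear, which makes the printed result easier to compare with the file.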
Next steps:
From our design doc, we'll look for geospatial files here too:
Use cases evident by using a variety of NetCDF/HDF5 data from these examples:
Surface PM2.5: https://sites.wustl.edu/acag/datasets/surface-pm2-5/#V5.GL.03
GridMET data: https://www.northwestknowledge.net/metdata/data/
Global Workshop on Earth Observation: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/OYBLGK
NetCDF data from Harvard Dataverse: https://dataverse.harvard.edu/dataverse/harvard?q=*.nc
Here is an example repository and Google Colab notebook to map NetCDF files to the current implementation of the geospatial metadata block using EasyDataverse. To use it, you need to supply your API token and target collection name.
Next sprint:
@JR-1991 thanks! I pointed your notebook at my test server and it populated the bounding box.
It's pretty straightforward to pull attributes out using the library we added in #9152.
public static Map<String, String> parseGeospatial(NetcdfFile netcdfFile) {
    Map<String, String> geoFields = new HashMap<>();
    // Look up the four bounding box edges among the file's global attributes.
    Attribute westLongitude = netcdfFile.findGlobalAttribute(WEST_LONGITUDE_KEY);
    Attribute eastLongitude = netcdfFile.findGlobalAttribute(EAST_LONGITUDE_KEY);
    Attribute northLatitude = netcdfFile.findGlobalAttribute(NORTH_LATITUDE_KEY);
    Attribute southLatitude = netcdfFile.findGlobalAttribute(SOUTH_LATITUDE_KEY);
    geoFields.put(DatasetFieldConstant.westLongitude, getValue(westLongitude));
    geoFields.put(DatasetFieldConstant.eastLongitude, getValue(eastLongitude));
    geoFields.put(DatasetFieldConstant.northLatitude, getValue(northLatitude));
    geoFields.put(DatasetFieldConstant.southLatitude, getValue(southLatitude));
    // Debug link for eyeballing the box; the order is west, south, east, north.
    System.out.println("https://linestrings.com/bbox/#"
            + geoFields.get(DatasetFieldConstant.westLongitude) + ","
            + geoFields.get(DatasetFieldConstant.southLatitude) + ","
            + geoFields.get(DatasetFieldConstant.eastLongitude) + ","
            + geoFields.get(DatasetFieldConstant.northLatitude)
    );
    return geoFields;
}
I think I got the order right to see the bounding box at https://linestrings.com/bbox/#-124.7666666333333,25.066666666666666,-67.058333300000015,49.40000000000000
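Since it is easy to mix up the four values, here is a small sketch that builds this kind of link with a sanity check on the latitudes. It assumes the link takes west, south, east, north in that order, which is what the comment above is aiming for. The class `BoundingBoxUrl` and the method `bboxUrl` are hypothetical names and not part of Dataverse.

```java
class BoundingBoxUrl {

    // Hypothetical helper: builds a preview link in west, south, east, north order.
    static String bboxUrl(double west, double south, double east, double north) {
        // Latitudes must lie in [-90, 90], with south not above north.
        if (south < -90 || north > 90 || south > north) {
            throw new IllegalArgumentException(
                    "Bad latitudes: south=" + south + ", north=" + north);
        }
        // Longitudes are left unchecked on purpose: some files use 0 to 360
        // rather than -180 to 180, and a box crossing the antimeridian can
        // legitimately have west greater than east.
        return "https://linestrings.com/bbox/#" + west + "," + south + "," + east + "," + north;
    }

    public static void main(String[] args) {
        // Values from the bi_2023.nc example, written as double literals.
        System.out.println(bboxUrl(-124.7666666333333, 25.066666666666666,
                -67.058333300000015, 49.4));
    }
}
```

Passing a longitude where the north latitude belongs fails fast here, because the south latitude would then be greater than the value given as north.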
@atrisovic and I exchanged emails with @plesubc and he has inspired us to use GDAL, ogrinfo or similar to try extracting latitude and longitude from a NetCDF file.
As this is just a spike (some discovery), we're sizing this as a 10, or 1 day.
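To get a feel for the GDAL route, a rough sketch of what calling `ogrinfo` from Java and scraping its output might look like follows. It assumes `ogrinfo` is on the PATH, that it can open the file in question, and that the summary output contains an `Extent: (minX, minY) - (maxX, maxY)` line. The class `OgrinfoExtent` and its methods are hypothetical names, and the regular expression is an assumption about the output shape rather than something tested against real NetCDF files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OgrinfoExtent {

    // Assumed shape of the "Extent:" line in ogrinfo summary output.
    private static final Pattern EXTENT = Pattern.compile(
            "Extent: \\(([-0-9.eE+]+), ([-0-9.eE+]+)\\) - \\(([-0-9.eE+]+), ([-0-9.eE+]+)\\)");

    // Runs a command and returns everything it wrote to stdout and stderr.
    static String capture(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        process.waitFor();
        return output;
    }

    // Returns {west, south, east, north}, or null if no Extent line is found.
    static double[] parseExtent(String output) {
        Matcher matcher = EXTENT.matcher(output);
        if (!matcher.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(matcher.group(1)),
            Double.parseDouble(matcher.group(2)),
            Double.parseDouble(matcher.group(3)),
            Double.parseDouble(matcher.group(4))
        };
    }

    public static void main(String[] args) throws Exception {
        // -so prints a summary only and -al covers all layers.
        String output = capture(List.of("ogrinfo", "-so", "-al", args[0]));
        double[] box = parseExtent(output);
        System.out.println(box == null ? "No extent found" : java.util.Arrays.toString(box));
    }
}
```

Keeping the parsing separate from the process call means the parsing can be checked against saved output without GDAL installed.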
Here's part of the email from Paul (this file happens to span the entire globe):
"Metadata extraction is a relatively simple process assuming you’re using GDAL. The GDAL suite exports file metadata to stdout, so all you really need to do is capture and process the text. Of course, differing formats have differing outputs, because life is never that simple.
So, for example, imagine you downloaded a netcdf from here:
https://data.ceda.ac.uk/badc/ukmo-hadobs/data/derived/MOHC/HadOBS/HadEX3/v3-0-2 (HadEX3-0-2_cwd_ann_1901-2018.nc). This isn’t some special data set, it’s the result of a google for spatial netcdf files.
Basically, filtering this file through ogrinfo (one of the utilities in GDAL), you get something like this as output: