Open dlebauer opened 7 years ago
@yanliu-chn could you please work on defining the geotiff standard format?
The code used in terrautils will enforce a standard method for generating geotiffs: https://github.com/terraref/computing-pipeline/issues/308
Will need help from others to enforce CF standards however.
Who takes the lead on this, and when can it be finished? (Please add a milestone for May or June or ...)
Based on other discussions I think it would make sense for @craig-willis to take the lead on this, but I will talk more about this/terrautils at the meeting today.
@dlebauer Is there anything specific you're looking for in terms of metadata? Looking at the EnvironmentLogger and hyperspectral nc files, aside from variables I see primarily sensor information.
See also the related issue for the point cloud data: https://github.com/terraref/computing-pipeline/issues/257. My comment there was "Goal is for (raster, point cloud) files to differ where it is useful, but have similar interfaces where applicable."
Here are some examples:
Thanks, @dlebauer.
I've been looking at the CF conventions for time (and I believe you had feedback on the time_utc variable we're currently using). CF conventions define a "time coordinate" (http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#time-coordinate), but not a timestamp in the way we've defined. Is it sufficient to use the UTC timestamp with offset ISO-8601 subset? Is the field name "time_utc" problematic?
$ gdalinfo file.tif
...
Coordinate System is:
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433],
AUTHORITY["EPSG","4326"]]
...
time_utc
I've always found the CF convention for time (units of <interval> since <reference date>) to be cumbersome, so I have no issue with using a timestamp. The only issue I see with time_utc is that it is more difficult for users to interpret than the local time, which can be represented in ISO-8601 format as YYYY-MM-DDTHH:MM-HH:MM, like 2007-04-05T12:30-02:00. My understanding is that this is how we are storing data in the start_time and end_time fields in geostreams.
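To make the comparison concrete, here is a minimal sketch of how a local ISO-8601 timestamp with offset maps to a CF-style time coordinate and back. It uses only the Python standard library. The reference date (1970-01-01 UTC), the units string, and the function names `iso_to_cf_seconds` and `cf_seconds_to_iso` are assumptions for illustration, not something terrautils or geostreams defines.

```python
from datetime import datetime, timedelta, timezone

# Assumed reference date for the CF-style time coordinate (illustrative only).
CF_REFERENCE = datetime(1970, 1, 1, tzinfo=timezone.utc)
CF_UNITS = "seconds since 1970-01-01T00:00:00+00:00"


def iso_to_cf_seconds(timestamp: str) -> float:
    """Convert an ISO-8601 timestamp with a UTC offset to seconds since CF_REFERENCE."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Without an offset the instant is ambiguous, so refuse to guess.
        raise ValueError("timestamp must include a UTC offset")
    return (parsed - CF_REFERENCE).total_seconds()


def cf_seconds_to_iso(seconds: float, tz: timezone = timezone.utc) -> str:
    """Convert seconds since CF_REFERENCE back to an ISO-8601 string in the given offset."""
    instant = CF_REFERENCE + timedelta(seconds=seconds)
    return instant.astimezone(tz).isoformat(timespec="minutes")


if __name__ == "__main__":
    local = "2007-04-05T12:30-02:00"
    seconds = iso_to_cf_seconds(local)
    print(seconds, CF_UNITS)
    # Same instant, rendered in UTC and in the original -02:00 offset.
    print(cf_seconds_to_iso(seconds))
    print(cf_seconds_to_iso(seconds, timezone(timedelta(hours=-2))))
```

Both representations identify the same instant, so storing the local timestamp with its offset loses nothing relative to a UTC value; the CF form only adds the requirement that the reference date and interval be recorded in the variable's units attribute.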
My original vision was that using gdal_translate from .tiff to .nc or .nc to .tiff would generate files with similar structure. So if this were from the FLIR camera, there would be a field with information about the variable represented by the raster layer in the image - name = temperature, units = C, dimensions = lat,lon etc.
But for now, the key will be to have an OGC-compliant file with the required information in external metadata.
What I had in mind for standards compliance was something like what is described in Annex A ("Annex A lists the conformance tests which shall be exercised on any software artifact claiming to implement GMLCOV for GeoTIFF") of the OGC GeoTIFF standards document 12-100r1_OGC_GML_ApplicationSchema-Coverages-_GeoTIFF_Coverage_Encoding_Profile.pdf .
But we should also focus on what is useful / necessary to meet the end-user needs (which I think can be met with well structured file-associated metadata in Clowder and geostreams).
For reference, here is an overview of the information in a MODIS hdf5 dataset. Like the netcdf, it also contains information about each layer in the file, the bounding box, the processing provenance, quality control &c. https://ladsweb.modaps.eosdis.nasa.gov/api/v1/filespec/collection=6&product=MOD13Q1.
When I ask MODIS for geotiff data, these fields do not appear to propagate into metadata that a program like ArcGIS can read (or exif for that matter), so I am not sure whether they are dropped. For example, see GTiff.tar.gz from https://modis.ornl.gov/subsetdata/23Aug2017_17:04:58_019465197L35.958767L-84.287433S25L25_MOD13Q1/
(here is a dataset that covers the field scanner: https://modis.ornl.gov/subsetdata/23Aug2017_17:34:58_983339455L33.07558L-111.97489S9L9_MYD13Q1/)
@dlebauer Thanks for the details. A few comments/questions:
geoTIFF files should have useful metadata that is consistent with the CF approach used for met and hyperspectral data; should also comply w/ existing OGC standards
Completion Criteria