OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org

PDS4 support for external file header size in label #1832

Closed thareUSGS closed 4 years ago

thareUSGS commented 5 years ago

Expected behavior and actual behavior.

When converting to PDS4 format and pointing to a non-raw file, a short section called <Header> should be added under the <File></File> section to define the header type of the file that the PDS4 label points into. More: http://sbndev.astro.umd.edu/wiki/Filling_Out_the_Header_Data_Structure

I am not too concerned about including the optional description, but it could simply be "This section of the file contains the GeoTIFF header". I assume we are safe to call GDAL's GeoTIFF v1.1.

Here are some XML examples in the wild: https://sbnarchive.psi.edu/pds4/near/nearmsi.shapebackplane/data/2001/003/

<Header>
  <offset unit="byte">0</offset>
  <object_length unit="byte">2880</object_length>
  <parsing_standard_id>GeoTIFF 1.1</parsing_standard_id>
  <description>
      GeoTIFF header. The GeoTIFF format is used throughout the geospatial and science communities to share geographic image data. 
  </description>
</Header>
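
As a rough illustration of the request, here is a minimal Python sketch that builds a <Header> element like the one above using only the standard library. The function name `build_header` and its parameters are hypothetical, the PDS4 XML namespace is left out for brevity, and this is not how the GDAL PDS4 driver is implemented.

```python
import xml.etree.ElementTree as ET


def build_header(object_length, parsing_standard_id, description=None, offset=0):
    """Build a <Header> element (hypothetical helper, namespace omitted)."""
    header = ET.Element("Header")
    # Offset and length are expressed in bytes, as in the example label.
    ET.SubElement(header, "offset", unit="byte").text = str(offset)
    ET.SubElement(header, "object_length", unit="byte").text = str(object_length)
    ET.SubElement(header, "parsing_standard_id").text = parsing_standard_id
    # The description is optional, so only emit it when one is given.
    if description:
        ET.SubElement(header, "description").text = description
    return header


header = build_header(2880, "GeoTIFF 1.1", "GeoTIFF header.")
header_xml = ET.tostring(header, encoding="unicode")
print(header_xml)
```

The values 2880 and "GeoTIFF 1.1" are copied from the example; a real label would use the actual byte length of whatever precedes the pixel data.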

For other supported formats, this might be:

ISIS 3, VICAR 3, PDS 3, FITS 3

It is kind of crazy that all of these are at version 3...(!)

Steps to reproduce the problem.

gdal_translate -of PDS4 -co IMAGE_FORMAT=GEOTIFF LDEM_4.LBL LDEM_4.xml

Operating system

Ubuntu 18 64 bit

GDAL version and provenance

GDAL 3.0.1, released 2019/06/28, conda-forge

rouault commented 5 years ago

> I assume we are safe to call GDAL's GeoTIFF v1.1.

Actually GDAL outputs GeoTIFF v1.0 by default, unless there's a 3D component, in which case it goes to GeoTIFF 1.1 by default. Shouldn't "GeoTIFF" be enough?

The concept of "header" for TIFF/GeoTIFF is a bit less obvious than for the other binary formats (ISIS 3, etc.), since in TIFF you have lots of sections, most of them possibly spread at random locations (a 4-byte TIFF header, a pointer to the first IFD, the IFD descriptor, the values of TIFF tags that don't fit in 4 bytes, the imagery content itself), but given the layout adopted by GDAL/libtiff, I guess we can expose this as a header.
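
To make the layout above concrete, here is a small Python sketch that reads the first 8 bytes of a classic TIFF file: the 4-byte header (byte order mark plus the magic number 42) followed by the 4-byte pointer to the first IFD. The function name `read_classic_tiff_header` is made up for illustration, and the sketch does not follow the IFD or locate the pixel data.

```python
import struct


def read_classic_tiff_header(data):
    """Parse the first 8 bytes of a classic TIFF file (illustrative helper)."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    # Bytes 0-1: byte order mark, "II" (little-endian) or "MM" (big-endian).
    if data[:2] == b"II":
        endian, order = "<", "little"
    elif data[:2] == b"MM":
        endian, order = ">", "big"
    else:
        raise ValueError("not a TIFF file")
    # Bytes 2-3: magic number, 42 for classic TIFF.
    # Bytes 4-7: offset of the first IFD from the start of the file.
    magic, first_ifd = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file (magic=%d)" % magic)
    return {"byte_order": order, "first_ifd_offset": first_ifd}


# Synthetic little-endian header whose first IFD starts right after it.
sample = b"II" + struct.pack("<HI", 42, 8)
print(read_classic_tiff_header(sample))
```

Because the IFD and the out-of-line tag values can sit anywhere in the file, these 8 bytes alone do not give the length of the "header" a PDS4 label would have to skip; that depends on where the writer places the imagery.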

thareUSGS commented 5 years ago

This is mostly for making sure the PDS4 label understands why it needs to skip a part of the file. This won't tell a person how to read it, just that there is something there.

How about simply listing both TIFF and GeoTIFF?

TIFF x.x/GeoTIFF 1.0

or I think we can get by without the version, so yes, I would be happy with:

TIFF/GeoTIFF

-- unless there is a good reason to list the TIFF version.

rouault commented 5 years ago

> unless there is a good reason to list the TIFF version.

Depends on what people will do with that. The TIFF version is stuck forever at 6.0. But for a TIFF file larger than 4 GB, GDAL will produce a BigTIFF file, and this could cause issues for pure TIFF readers. Should we advertise BigTIFF/GeoTIFF for such large files?

thareUSGS commented 5 years ago

If possible, and a BigTIFF is created, then yes I agree "BigTIFF/GeoTIFF" should be used.
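
One possible way to pick between the two strings, sketched in Python: classic TIFF files carry the magic number 42 in bytes 2-3, while BigTIFF files carry 43. The function name `tiff_parsing_standard_id` is hypothetical, and the sketch only inspects the TIFF signature; it does not check that GeoTIFF keys are actually present.

```python
import struct


def tiff_parsing_standard_id(first_bytes):
    """Return the label string for a TIFF or BigTIFF file (illustrative only)."""
    if len(first_bytes) < 4:
        raise ValueError("need at least 4 bytes")
    # "II" means little-endian, "MM" means big-endian.
    endian = {b"II": "<", b"MM": ">"}.get(first_bytes[:2])
    if endian is None:
        raise ValueError("not a TIFF file")
    (magic,) = struct.unpack(endian + "H", first_bytes[2:4])
    if magic == 42:
        return "TIFF/GeoTIFF"      # classic TIFF
    if magic == 43:
        return "BigTIFF/GeoTIFF"   # BigTIFF, 64-bit offsets
    raise ValueError("unknown TIFF magic number: %d" % magic)


print(tiff_parsing_standard_id(b"II" + struct.pack("<H", 42)))
print(tiff_parsing_standard_id(b"MM" + struct.pack(">H", 43)))
```

Checking the signature rather than the file size avoids guessing at the 4 GB threshold, since a writer may also be asked to produce BigTIFF for smaller files.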

rouault commented 4 years ago

@thareUSGS looking at the schematron for the latest PDS4 version in https://pds.nasa.gov/datastandards/schema/released/pds/v1/PDS4_PDS_1C00.sch , I see that there's a fixed list of values for the parsing_standard_id: '7-Bit ASCII Text', 'CDF 3.4 ISTP/IACG', 'FITS 3.0', 'ISIS2', 'ISIS2 History Label', 'ISIS3', 'PDS DSV 1', 'PDS ODL 2', 'PDS3', 'Pre-PDS3', 'UTF-8 Text', 'VICAR1', 'VICAR2', which doesn't include GeoTIFF. Do you see this as a problem if we extend it?
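
For illustration, a strict check of this kind could look like the Python sketch below. The set is transcribed from the list quoted in this comment (schema version 1C00) and may differ in other versions; the names `ALLOWED_PARSING_STANDARD_IDS` and `is_allowed_parsing_standard_id` are hypothetical, and real validation would run the schematron itself.

```python
# Values transcribed from the list quoted above (PDS4_PDS_1C00.sch).
ALLOWED_PARSING_STANDARD_IDS = frozenset({
    "7-Bit ASCII Text",
    "CDF 3.4 ISTP/IACG",
    "FITS 3.0",
    "ISIS2",
    "ISIS2 History Label",
    "ISIS3",
    "PDS DSV 1",
    "PDS ODL 2",
    "PDS3",
    "Pre-PDS3",
    "UTF-8 Text",
    "VICAR1",
    "VICAR2",
})


def is_allowed_parsing_standard_id(value):
    """Return True if the value is in the fixed list quoted above."""
    return value in ALLOWED_PARSING_STANDARD_IDS


for candidate in ("FITS 3.0", "TIFF/GeoTIFF"):
    print(candidate, is_allowed_parsing_standard_id(candidate))
```

With this list, any GeoTIFF-style value is rejected, which is the situation a superstrict validator would flag.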

rouault commented 4 years ago

Closely related to that, the following should be implemented: "the GDAL PDS4 driver will be extended with a CREATE_LABEL_ONLY=YES creation option to create a PDS4 .xml label for existing image files in the formats GeoTIFF, FITS, ISIS3, VICAR, and ENVI. The PDS4 driver will check that the existing image file has the required characteristics to be considered as a PDS4 image file, that is a raw pixel stream with a supported interleaving scheme. In case of incompatibility, the driver will error out."

PDS3 to be added if possible

thareUSGS commented 4 years ago

GeoTiff is currently being pushed through as a recognized format. So having a lunar example in the latest GeoTiff spec (v1.1) will help this cause.

I don't see any reason for the PDS management board to shoot this down. I will be presenting to the group in early November although a white paper on the topic and the acceptance vote might come earlier. Essentially allowing support for a "raw" GeoTiff in PDS4 is exactly the same as already allowing for CDF (and to a lesser extent the other formats, which can make the case they have an ASCII header). But it is not the goal for PDS4 to understand the GeoTiff header, just the fact it can be easily skipped.

rouault commented 4 years ago

> I don't see any reason for the PDS management board to shoot this down

Understood. My question was more to make you aware of this schematron rule, in case there would be superstrict checks done on PDS4 .xml labels generated by GDAL. No problem for me to use "TIFF/GeoTIFF" and "TIFF/BigTIFF" as we discussed

thareUSGS commented 4 years ago

No, I wasn't aware those were explicitly listed. With the acceptance of GeoTiff, I will also start pushing for those additions to the schema.