ceos-org / ceos-ard

Repository for CEOS Analysis Ready Data (CEOS-ARD), including Product Family Specifications (PFSs)

SAR: Req. 1.7.9 - File Header Size #27

Open m-mohr opened 2 months ago

m-mohr commented 2 months ago

Req. 1.7.9 requires the file header size. Is that really needed? Shouldn't file format readers abstract that away? I've never seen anyone (who's not implementing GDAL or any other generic file format driver) handling file header sizes specifically.

I see it in the XML metadata ([image]), but that's not really reflected in the PDF. There it only states "(if applicable)".

Edit: It's probably my lack of SAR expertise, but how can there be RAW for products such as NRB, POL, etc.? Isn't this contradictory?

cc @akerosenqvist

avalentino commented 2 months ago

@akerosenqvist I think that we had a discussion about this in one of our meetings, but I don't remember the conclusion. Honestly, I tend to agree with @m-mohr.

akerosenqvist commented 2 months ago

Yes, we had this discussion and agreed that since we are not prescribing what data formats providers can and cannot use (meaning that, while unlikely, a provider COULD choose RAW format), we needed to keep provisions for header size and border pixels in the PFS and metadata spec. However, it clearly says "if applicable". In the STAC you can probably just ignore it if it poses a problem.

m-mohr commented 2 months ago

Thank you for the responses.

Similarly, are the requirements in 2.1 for byte order, and maybe bits per sample and data type, really relevant? They should be handled by the software automatically and be exposed through the file format. I could see some use for data type, but I've never seen any user worrying about the byte order, so I'm wondering whether that's really needed in metadata.

akerosenqvist commented 2 months ago

It relates to the point discussed above: at least for RAW, the byte order matters. So unless there is a problem with including byte order in the STAC, please do so.
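To make the RAW case concrete, here is a minimal Python sketch of why header size and byte order have to come from metadata when the file is a plain binary block. It is an illustration only: the 16-byte header, the sample values, and the names `HEADER_SIZE` and `read_raw` are hypothetical and do not come from the PFS or from any provider's format.

```python
import struct

HEADER_SIZE = 16  # hypothetical header length in bytes
samples = [1.0, 2.5, -3.25]

# Build a RAW-style file in memory: an opaque header followed by
# big-endian float32 samples. Nothing in the bytes themselves says so.
blob = b"\x00" * HEADER_SIZE + struct.pack(">3f", *samples)


def read_raw(data, header_size, byte_order, count):
    """Decode `count` float32 samples after skipping `header_size` bytes.

    byte_order is '>' (big endian) or '<' (little endian).
    """
    return list(struct.unpack_from(f"{byte_order}{count}f", data, header_size))


# Correct metadata recovers the samples.
correct = read_raw(blob, HEADER_SIZE, ">", 3)

# Wrong byte order or wrong header size still "works", but yields wrong numbers.
wrong_order = read_raw(blob, HEADER_SIZE, "<", 3)
wrong_offset = read_raw(blob, 0, ">", 3)

print(correct, wrong_order, wrong_offset)
```

Neither wrong call raises an error, which is the practical risk: without the metadata, a reader cannot tell valid values from misinterpreted ones. Self-describing formats such as GeoTIFF store this information inside the file, so readers handle it without user input.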

m-mohr commented 2 months ago

Thanks. So it's only really relevant for RAW? In that case, it might make the work easier if we say in the STAC profile that we don't support RAW for now and keep the profile/extension a bit simpler. It's already hard to fill all the gaps in the extensions for other requirements.

johntruckenbrodt commented 2 months ago

Wasn't RAW referring to the ENVI format? In this format there are two files: a plain binary block and a metadata "header file". The byte order is relevant for GeoTIFF as well; it is just a little more hidden from the user because it is not written in a separate metadata file. Such binary blocks in big-endian format are also usually written by e.g. GAMMA. These files can, however, easily be supplemented with an ENVI header file. See e.g. here:
https://pyrosar.readthedocs.io/en/latest/api/gamma.html#pyroSAR.gamma.ISPPar.envidict (export ENVI-compliant metadata from GAMMA isppar format)
https://spatialist.readthedocs.io/en/latest/spatialist.html#module-spatialist.envi (manipulate ENVI header files)
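As a rough sketch of the two-file idea, the Python snippet below parses a small ENVI-style header and uses it to decode a big-endian binary block. It uses only the standard library and does not call pyroSAR or spatialist. The header values are made up, `parse_envi_header` and `read_band` are hypothetical helpers, and only simple single-line `key = value` entries and a subset of ENVI data type codes are handled.

```python
import struct

# Example header text. Field names follow the ENVI header convention;
# the values are made up for illustration.
HDR_TEXT = """ENVI
samples = 2
lines = 2
bands = 1
header offset = 0
data type = 4
interleave = bsq
byte order = 1
"""

# Subset of ENVI data type codes mapped to struct format characters.
ENVI_DTYPES = {"1": "B", "2": "h", "4": "f", "5": "d", "12": "H"}


def parse_envi_header(text):
    """Parse single-line `key = value` entries (multi-line {...} blocks are not handled)."""
    fields = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


def read_band(data, hdr):
    """Decode the first band of a BSQ binary block described by `hdr`."""
    n = int(hdr["samples"]) * int(hdr["lines"])
    order = ">" if hdr["byte order"] == "1" else "<"  # 1 = big endian, 0 = little endian
    fmt = f"{order}{n}{ENVI_DTYPES[hdr['data type']]}"
    return list(struct.unpack_from(fmt, data, int(hdr["header offset"])))


hdr = parse_envi_header(HDR_TEXT)
binary = struct.pack(">4f", 0.5, 1.0, 1.5, 2.0)  # stand-in for a big-endian float32 block
pixels = read_band(binary, hdr)
print(pixels)
```

The header carries exactly the items discussed in this thread (header offset, data type, byte order), which is why a plain binary block is only interpretable together with such a sidecar file.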

I am wondering, since with STAC there is a whole ecosystem of tools, might it be worthwhile to tighten the requirement and only allow GDAL-readable file formats? This way we would no longer have to detail the required metadata because users would need to check themselves whether the file is readable.

akerosenqvist commented 1 month ago

> I am wondering, since with STAC there is a whole ecosystem of tools, might it be worthwhile to tighten the requirement and only allow GDAL-readable file formats? This way we would no longer have to detail the required metadata because users would need to check themselves whether the file is readable.

I agree with the suggestion to allow only GDAL-readable file formats in the PFS, including COG, NetCDF, and Zarr (but not RAW).

akerosenqvist commented 1 month ago

@mattsymbios Matt - can you add a SAR tag to this issue?

m-mohr commented 1 month ago

Yeah, I mean it's all about ARD here and for me the file format is also part of that. RAW doesn't sound very analysis ready to me, especially if it's not readable by default using de-facto standard software such as GDAL.