Closed trey-stafford closed 3 weeks ago
@eigenbeam might have some additional notes or memories on this?
@trey-stafford please feel free to reach out to me for any help with your questions. I am happy to help and probably your best option at this point. I’ve worked with @eigenbeam on the question of reference datums for ATM many years ago when he developed valkyrie.
But to get to your question: it's not really important, but I think you are confusing the QFIT and ICESSN data products/formats. Keep in mind these formats and data products were designed before 1993 and were not intended to be distributed by a DAAC. All QFIT files have headers, whether they are in the legacy flat binary format or the OIB-era HDF5. As you say, they contain information about the GPS trajectory that was used to reference the lidar data, and that trajectory has the ITRF epoch in its file name. Back then ITRF epochs had two-digit years, which changed to four digits after Y2K. I think you mean legacy ICESSN data without a header (https://nsidc.org/sites/default/files/blatm2-v001-userguide.pdf). These files come with a summary sidecar file that includes the QFIT header with the trajectory (and ITRF) information you show in your post. The same summary file also lists the aircraft trajectory and ITRF epoch further down. Look for lines like this:
Aircraft trajectory from File 070511_aa_l12_jgs_itrf00_19jun07_6138
As said, please feel free to reach out to me with questions. Glad to see that you are working on this.
Hi @mstudinger , thanks for your input!
An example data file I'm having trouble reading a header from is BLATM1B_20021128atm2_193456jr.qi.
We have some code here to read the qfit file header, and it currently returns an empty string for this file. I do not see any sidecar files for BLATM1B_20021128atm2_193456jr.qi aside from the XML that contains the granule metadata (which does not contain references to an ITRF).
Maybe there's something wrong with our header reading code. I'll give this another look today.
Note that I have confirmed that there are other qfit files in the BLATM1B dataset from which we can successfully read the header, including the ITRF info. E.g., BLATM1B_20070502_184508.atm4bT2.rangeExample.qi
Ok, I think I've confirmed that our code is to blame. The file I mention above, BLATM1B_20021128atm2_193456jr.qi, does appear to have header information including the ITRF. I'll track down the issue and apply a fix, which should eliminate the need for the hard-coded ranges in #31!
$ head -n 50 downloaded-data/BLATM1B/BLATM1B_20021128atm2_193456jr.qi | grep -ai itrf
v../021128_aa_l12_cfm_itrf00_27mar03_vpalm+adel
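For reference, the same check can be sketched in Python. This is only an illustration of pulling the `itrfNN` token out of the raw header bytes, not the code in iceflow; the names `ITRF_TOKEN`, `normalize_itrf` and `find_itrf` are made up for this example, and the two-digit year cutoff is an assumption.

```python
import re
from typing import Optional

# Matches tokens such as "itrf00", "itrf05" or "itrf2008" in a trajectory filename.
ITRF_TOKEN = re.compile(rb"itrf(\d{4}|\d{2})", re.IGNORECASE)


def normalize_itrf(digits: str) -> str:
    """Turn the digits that follow 'itrf' into a conventional realization name."""
    if len(digits) == 4:
        return f"ITRF{digits}"
    # Assumption: two-digit values of 88 and above are pre-2000 realizations
    # (e.g. ITRF97); anything lower is a 20xx realization (e.g. 00 -> ITRF2000).
    if int(digits) >= 88:
        return f"ITRF{digits}"
    return f"ITRF20{digits}"


def find_itrf(raw: bytes) -> Optional[str]:
    """Return the first ITRF named in the given bytes, or None if there is none."""
    match = ITRF_TOKEN.search(raw)
    if match is None:
        return None
    return normalize_itrf(match.group(1).decode("ascii"))


# The string that grep found in BLATM1B_20021128atm2_193456jr.qi
print(find_itrf(b"v../021128_aa_l12_cfm_itrf00_27mar03_vpalm+adel"))
```

In practice the bytes would come from the start of the `.qi` file, for example `open(path, "rb").read(4096)`; how many bytes are enough to cover the header is a guess here and would need checking against the format.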
Ok, I believe c24993b will resolve this, but I want to do a little more testing to confirm!
Ok, I think this is working as expected now. I have been able to extract the ITRF from a number of sample qfit files, including those that we had previously been unable to read.
If we run into other cases where our code can't read an ITRF from a qfit file, we can open a new issue to address, but I'm hoping this covers everything!
Fix merged into main with #31
The header usually contains (we think, see below) the ITRF for each granule, which is how we prefer to set the source ITRF. The files without a header have to fall back on a hard-coded list of ITRFs for various date ranges.
Relevant PR: https://github.com/nsidc/iceflow/pull/31
qfit file headers
For qfit files that do have headers, we extract the ITRF from a string in the header that we don't know much about. There is documentation about the qfit file format here: https://nsidc.org/sites/nsidc.org/files/files/ReadMe_qfit.txt, but it doesn't say much about what the header contains aside from:
An example qfit file header is given below:
The ITRF for this granule is determined from this string:
which appears to be an input filename or program. At some point we thought this was enough to indicate the ITRF for the data, but there is no clear documentation (we take the _itrf05_ bit to mean this granule uses ITRF2005).
Hard-coded ITRF date ranges
The hard-coded ITRF ranges have questionable provenance. They come from valkyrie, and there's no documentation around how they were found. The documentation for the source datasets isn't very helpful:
It is not clear what the "convention" was during data collection.
I tracked down some old notes from the project about this, and they suggest that we figured out the ranges by looking at the headers, but they say nothing about how we determined the correct ITRF for files without headers. There appear to be at least a couple of dates' worth of data that do not conform to expectations: 2002-11-22 and 2002-12-14 use ITRF97 instead of ITRF2000, which is used for all other dates between 2001-12-18 and 2007-05-11. So... do we know for sure that the files without headers actually use the ITRF indicated in the hard-coded list? Not sure.
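One way to find more non-conforming dates would be to compare the ITRF read from each header against what the hard-coded list predicts for that date. A minimal sketch, assuming we already have (date, header ITRF) pairs for the granules that do have headers. `HARD_CODED_RANGES`, `expected_itrf` and `find_mismatches` are hypothetical names, and the table holds only the one range quoted above, not the full list from valkyrie.

```python
from datetime import date
from typing import List, Optional, Tuple

# Only the range quoted in this issue; the real hard-coded list has more entries.
HARD_CODED_RANGES = [
    (date(2001, 12, 18), date(2007, 5, 11), "ITRF2000"),
]


def expected_itrf(day: date) -> Optional[str]:
    """ITRF the hard-coded list predicts for a date, or None if no range covers it."""
    for start, end, itrf in HARD_CODED_RANGES:
        if start <= day <= end:
            return itrf
    return None


def find_mismatches(
    granules: List[Tuple[date, str]]
) -> List[Tuple[date, str, str]]:
    """Return (date, header ITRF, expected ITRF) where the header disagrees."""
    mismatches = []
    for day, header_itrf in granules:
        expected = expected_itrf(day)
        if expected is not None and header_itrf != expected:
            mismatches.append((day, header_itrf, expected))
    return mismatches


sample = [
    # Header of BLATM1B_20021128atm2_193456jr.qi names itrf00.
    (date(2002, 11, 28), "ITRF2000"),
    # Hypothetical mismatch, included only to show what gets reported.
    (date(2003, 1, 1), "ITRF97"),
]
for day, found, expected in find_mismatches(sample):
    print(f"{day}: header says {found}, hard-coded list says {expected}")
```

This only tells us where headers and the list disagree; it does not answer which ITRF the header-less files actually use.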
Action
We should dig into this further, and perhaps reach out to PIs associated with these datasets for clarification. Questions we need to answer are: