bleutner / RStoolbox

Remote Sensing Data Analysis in R 🛰

0-byte padding in LT52240631988227CUB02_MTL.txt #77

Closed barryrowlingson closed 2 years ago

barryrowlingson commented 2 years ago

Is there a reason why the sample file LT52240631988227CUB02_MTL.txt is padded out to 64k with zero bytes? Here's the start and end of a hex dump of the file recently downloaded from GitHub; it's the same with the file installed from the CRAN package:

$ od -x  LT52240631988227CUB02_MTL-github.txt
0000000 5247 554f 2050 203d 314c 4d5f 5445 4441
0000020 5441 5f41 4946 454c 200a 4720 4f52 5055
0000040 3d20 4d20 5445 4441 5441 5f41 4946 454c
0000060 495f 464e 0a4f 2020 2020 524f 4749 4e49
[etc]
0012340 3d20 4c20 5f31 454d 4154 4144 4154 465f
0012360 4c49 0a45 4e45 0a44 0000 0000 0000 0000
0012400 0000 0000 0000 0000 0000 0000 0000 0000
*
0177760 0000 0000 0000 0000 0000 0000 0000 0000
0177777

Seems weird, and it causes problems when trying to grep for patterns in it, since grep thinks it's a binary file and requires the -a option to get anything out:

$ grep PROJECTION ./LT52240631988227CUB02_MTL-github.txt 
Binary file ./LT52240631988227CUB02_MTL-github.txt matches
$ grep -a PROJECTION ./LT52240631988227CUB02_MTL-github.txt 
    CORNER_UL_PROJECTION_X_PRODUCT = 486600.000
    CORNER_UL_PROJECTION_Y_PRODUCT = -375000.000
    CORNER_UR_PROJECTION_X_PRODUCT = 719100.000
    CORNER_UR_PROJECTION_Y_PRODUCT = -375000.000

Download error? File corruption? Or is the standard to pad files out to 64k boundaries because reasons?

bleutner commented 2 years ago

Took me a while to figure this out. Current MTL files no longer have this padding (since a format revision in 2014). The file packaged in RStoolbox, however, is still in the legacy MTL format, which had fixed-size padding.

The padding "was historically used to distribute a fixed-size MTL to support the HDF4 format" (see the USGS news item).
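
For anyone who wants to grep a legacy, padded MTL file without -a, one possible workaround is to strip the zero bytes into a copy first. This is only a sketch and not something RStoolbox does or requires: the temporary files below are stand-ins built on the spot, so substitute the real MTL path, and it assumes the text portion of the file contains no NUL bytes of its own.

```shell
# Stand-in files for illustration only; replace "$padded" with the real MTL path.
padded=$(mktemp)
clean=$(mktemp)

# Build a tiny imitation of a legacy MTL file: two text lines plus 16 NUL bytes.
printf 'GROUP = L1_METADATA_FILE\nEND\n' > "$padded"
head -c 16 /dev/zero >> "$padded"

# Delete every NUL byte. The metadata text contains none, so only the padding is removed.
tr -d '\000' < "$padded" > "$clean"

# grep now sees plain text, so -a is no longer needed.
grep GROUP "$clean"
```

Writing to a separate copy leaves the packaged sample file untouched, which seems the safer choice if anything else expects the original size.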