NEONScience / NEON-Data-Skills

Self-paced tutorials that review key data literacy concepts and data analysis skills. Published materials can be found at:
https://www.neonscience.org/resources/learning-hub/tutorials
GNU Affero General Public License v3.0

HDF5 in HDFView vs R - dimension confusion #616

Open gklarenberg opened 2 years ago

gklarenberg commented 2 years ago

I'm not sure this can be solved, but it may be worth mentioning in the HDFView and/or HDF5 and R tutorials: HDFView shows the dimensions in a different order than R does.

For the file NEONDSImagingSpectrometerData.h5, HDFView shows 426 x 502 x 477 (wavelength, line, sample), and you need to select dim1 and dim2 to get a proper picture, as the tutorial indicates. In R, however, dim() reports the dimensions as 477 502 426, and you need the third index to select a band:

library(rhdf5)

# Read band 58 only: in R the wavelength (band) axis is the third dimension
band_58 <- h5read(file = "NEONDSImagingSpectrometerData.h5",
                  name = "Reflectance",
                  index = list(NULL, NULL, 58))

The same for the file NEON_D17_SJER_DP3_257000_4112000_reflectance.h5. In HDFView, it's 1000 x 1000 x 426 (line, sample, wavelength), but when reading it into R, dim() says 426 1000 1000, and you need the first index to select a band.

This is incredibly confusing for learners who are new to HDF5, and I am adamant about having them use HDFView first, before R. The visual component really helps them understand the structure of the data, but the dimension mismatch impedes their learning. I am sure it's a software issue between HDFView and rhdf5 (so not fixable by us), but it should be mentioned :)
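
A likely explanation, offered as a sketch rather than a confirmed diagnosis for these specific files: HDF5 tools such as HDFView describe arrays in row-major (C) order, while R stores arrays in column-major order, so the same data shows up in R with its dimensions reversed. The base R toy example below uses a made-up 4 x 3 x 2 cube (not the NEON files, and without rhdf5) to show how one buffer of values maps to both dimension orders, and how aperm() can reorder the axes to match what HDFView displays.

```r
# Toy cube as a row-major tool (e.g. HDFView) would describe it:
# 4 bands x 3 lines x 2 samples. These sizes are illustrative only.
dims_row_major <- c(4, 3, 2)            # (wavelength, line, sample)
flat <- seq_len(prod(dims_row_major))   # values in storage order

# R fills arrays column-major (first index varies fastest), so the same
# buffer is described by the reversed dimensions: (sample, line, wavelength).
cube_r <- array(flat, dim = rev(dims_row_major))
print(dim(cube_r))

# Selecting a band therefore uses the LAST index in R.
band_2 <- cube_r[, , 2]

# To work in the order HDFView displays, permute the axes explicitly.
cube_hdfview_order <- aperm(cube_r, c(3, 2, 1))
print(dim(cube_hdfview_order))

# Band 2 is the same data either way, just transposed.
print(all(cube_hdfview_order[2, , ] == t(band_2)))
```

With the real files, the same idea would apply to the array returned by h5read(): the band axis sits wherever the reversed HDFView order puts it, which is why one file needs the third index and the other needs the first.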

cklunch commented 2 years ago

@bhass-neon Can you take a look at this? Should be a simple update to add a bit more explanation.

bhass-neon commented 2 years ago

@cklunch Yes, the dimensions are not always in the same order; I can explain. I can't tell whether this refers to a specific lesson, though, so I'm not totally sure what needs to be updated. I'll search GitHub for that code snippet.