Closed godber closed 9 years ago
Check out the adding-cubefile
branch, your original pysis
cubefiles test passes on that branch.
What's up with the shape being (bands, lines, samples)
? Is that an ISIS-ism? We end up with single band images that are (1, 1024, 2014)
that need to be squeeze()
ed before displaying or saving as image file.
I am accustomed to using (lines, samples, bands)
and in the case where bands = 1
the shape is just 2D (lines, samples)
.
As far as I know, that's how people think about cube files... @sbraden might have more insight though.
Also by the way, this looks awesome! :sparkles:
Looking through the code, the (bands, lines, samples)
shape order is due to how the BandSequential
(and to a lesser extent Tile
) format is laid out on disk:
(band0, line0, sample0) ... (band0, line0, sampleN)
(band0, line1, sample0) ... (band0, lineN, sampleN)
(band1, line0, sample0) ... (bandN, lineN, sampleN)
To do (lines, samples, bands)
, you would have to do quite a bit of copying instead of the simple:
data = numpy.fromfile(stream, self.dtype, self.size)
return data.reshape(self.shape)
My only comment is that the order that most people think about cubes is (sample, line, band), but to avoid a lot of copying the order is as wtolson described above. ISIS cube documentation online suggests that the order is (sample, line, band).
ref: http://isis.astrogeology.usgs.gov/documents/LogicalCubeFormatGuide/LogicalCubeFormatGuide.html
The PDS3 Standards Reference says it supports three storage orders (see section A.25.3.1 Storage Orders):
(SAMPLE, LINE, BAND)
– Band Sequential (BSQ)(SAMPLE, BAND, LINE)
– Band Interleaved by Line (BIL)(BAND, SAMPLE, LINE)
– Band Interleaved by Pixel (BIP)However, the way they are laid out on disk is reversed in terms of the numpy shape, meaning what @godber is accustomed to, (lines, samples, bands)
, would be "Band Interleaved by Pixel".
We could leave this mess as an exercise to the reader, but that seems mildly cruel to me. Our other options would be to translate access on the fly to the data, which to me seems error prone and full of corner cases for iteration, masking, etc. Or my recommendation, reformat the data as we read it in into a standard (maybe configurable) format at a performance loss.
@wtolson agreed on your last point, we should transform all data into the form that would be most useful directly by numpy
's image tools I would think. Wouldn't changing the .data
format impact the existing pysis
users if we were to later utilize this class in pysis
(which I consider to be a near term goal).
That makes sense to me. I assume you're refrencing scipy.ndimage. Is that correct?
That is probably the same as what I mean. But I think we should make it so that we can do
import matplotlib.pyplot as plt
from PlanetaryImage import Pds3Image
pds3img = Pds3Image.open('radimage.IMG')
plt.imshow(pds3img.data)
Similar to what is shown here: http://matplotlib.org/users/image_tutorial.html
Ultimately, we should always err on the side of being pythonic and interacting easily with the already vast body of python image processing tools. This is my mantra.
I had also thought, for compatibility purposes, we could reimplement data
as a property the converts back to the original layout and make the new layout an attribute named image
. Though maybe "configurable" is they way to go.
Here is a hastily implemented version of the PDS3 image parser derived from your cubefile code. There is a parent class
PlanetaryImage
and a subclassPds3Image
. YourCubeFile
would inherit fromPlanetaryImage
. There are a few methods that still need to be copied intoPlanetaryImage
. Do you think there is sufficient overlap to justify this?There are some minor issues with the implementation that I am still looking into.