spectralpython / spectral

Python module for hyperspectral image processing
MIT License

Printing BipFile vs ImageArray #79

Closed: spencermathews closed this issue 4 years ago

spencermathews commented 6 years ago

Executing print() on a BipFile prints out a nice summary that looks like:

Data Source:
# Rows:
# Samples:
# Bands:
Interleave:
Quantization:
Data format:

However, when working with ImageArray objects, print() just outputs the underlying ndarray.

It would be nice to have as much consistency as possible between these two representations, as well as an equally easy way to output this summary without having to pull out the data attributes individually. Am I missing something?

Note: I'm running spectral under Python 3.

tboggs commented 6 years ago

Can you clarify what you mean by "an equally easy way to output this summary without having to pull out the data attributes individually"?

spencermathews commented 6 years ago

I'm just asking why the print() output differs between BipFile and ImageArray, since they seem logically equivalent, and suggesting that it be made consistent. Until print(imagearray) displays this nice summary, as far as I can tell one has to grab the nrows/ncols/nbands/etc. attributes individually to get an overview of the object.

Maybe this belongs in the Image class, maybe not—I'm not familiar enough with the codebase to really comment on that.

On a related note, the definition of ImageArray states "This class inherits from both numpy.ndarray and SpyFile, providing the interfaces of both classes." I'm confused, since ImageArray does inherit from Image but does not appear to inherit from SpyFile, nor, as this issue highlights, does it match the SpyFile (or at least BipFile) interface.

tboggs commented 6 years ago

That is a typo in the documentation. As you mentioned, ImageArray inherits from Image rather than SpyFile.

Regarding the behavior, it is intentional for ImageArray to print as an ndarray, rather than as an Image or SpyFile. I don't recall all the reasons for that decision, but one was to make it easier to work with the actual data. In most cases, subscripting an ImageArray returns another ImageArray. If I do this:

>>> data = image.load()
>>> data[100, 100]

then I expect to see the data values for pixel (100, 100). But if ImageArray were to print like a SpyFile, then the command above would print a summary of the data file, which is not the intent.

As for seeing the metadata associated with an ImageArray, there are a few ways to get it. The info method will print the number of rows, columns, and bands in the data, as well as the dtype. Of course, the shape attribute has the number of rows, columns, and bands as well. The file name and interleave aren't printed by info because an ImageArray is no longer considered to be associated with the source file, and the interleave of an ImageArray is always BIP (regardless of the source file).
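For readers who want a one-line summary in the meantime, here is a minimal sketch of a helper that builds a BipFile-style overview from the shape and dtype attributes alone. The name `summarize` is hypothetical and not part of spectral. The cube dimensions are made up, and a stand-in object is used so the snippet runs without any image data. It should work the same way on a NumPy array or an ImageArray, since both expose `shape` and `dtype`.

```python
from types import SimpleNamespace


def summarize(data):
    """Return a short text summary for any 3-D array-like object.

    Hypothetical helper (not part of spectral). It relies only on the
    `shape` and `dtype` attributes, ordered as (rows, samples, bands).
    """
    nrows, ncols, nbands = data.shape
    lines = [
        "# Rows:      %6d" % nrows,
        "# Samples:   %6d" % ncols,
        "# Bands:     %6d" % nbands,
        "Data format: %s" % data.dtype,
    ]
    return "\n".join(lines)


# Stand-in for a loaded image cube; the dimensions are illustrative only.
cube = SimpleNamespace(shape=(145, 145, 220), dtype="float32")
print(summarize(cube))
```

Because the helper is a plain function, it leaves print() on the array itself untouched, so pixel values still display as usual.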

I'll leave this issue open until the documentation is fixed. There may also be some nuances of the __str__ vs. __repr__ methods that should be revisited, so I'll take a look at those as well.
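For context on the __str__ vs. __repr__ distinction, here is a small illustrative sketch in plain Python. The `Cube` class is a toy stand-in invented for this example and says nothing about how spectral implements ImageArray. It only shows that the interactive prompt echoes an object via __repr__, while print() uses __str__, so the two displays can differ.

```python
class Cube:
    """Toy stand-in for an image container (not a spectral class)."""

    def __init__(self, values, nbands):
        self.values = values
        self.nbands = nbands

    def __repr__(self):
        # Used when the interpreter echoes an expression, e.g. `>>> cube`.
        return "Cube(%r)" % (self.values,)

    def __str__(self):
        # Used by print() and str().
        return "Cube with %d band(s)" % self.nbands


cube = Cube([0.1, 0.2, 0.3], nbands=3)
print(repr(cube))  # what the interactive prompt would echo
print(cube)        # what print() shows
```

Whether such a split would suit ImageArray is a separate design question, since subscripted results are also ImageArray objects and would be displayed by the same methods.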