spectralpython / spectral

Python module for hyperspectral image processing
MIT License

Unable to parse bad band list (bbl) in header as integers. #67

Closed lewismc closed 4 years ago

lewismc commented 7 years ago

Hi Folks, I'm back again, still working with AVIRIS-NG data. When running a mineral classification task (https://github.com/capstone-coal/pycoal/blob/master/pycoal/mineral.py#L22) I receive the following output:

Unable to parse bad band list (bbl) in header as integers.

The data I am using can be found at ftp://avng.jpl.nasa.gov/AVNG_2015_data_distribution/L2/ang20150420t182050_rfl_v1e/ang20150420t182050_corr_v1e_img.hdr

Right enough, the bbl values are written as doubles.

Is there justification here for making spectral more flexible/adaptive to different types of bbl? Thanks

tboggs commented 7 years ago

The ENVI documentation states that these values are supposed to be multipliers for bad bands but no type is specified, which suggests that they aren't necessarily expected to be integer values. I saw your pull request just now. I'm afk for a couple of days. When I get back, I'll look at this issue. It probably makes sense to just read the values as floating point.


tboggs commented 7 years ago

I've looked into this a bit. It still isn't entirely clear whether bbl should be an int or float. I've found a few references to it in the Harris documentation.

One page describes bbl as:

Lists the bad band multiplier values of each band in an image, typically 0 for bad bands and 1 for good bands.

Use of the word "typically" is confusing. Does an atypical case mean non-integer or integer other than 0 or 1?

Another page says it should be an "integer array":

Specify an integer array with the same number of elements as BAND_NAMES, where 0 indicates a bad band and 1 indicates a good band. This property is typically used with hyperspectral imagery.

It states "integer array" but it isn't clear if that means integer data type or floats with integer values. It also isn't clear whether 0 and 1 are the only allowed values.

Lastly, a third page specifically states that the array should contain ones and zeros, though it still does not specify a data type:

Use this keyword to specify a named variable containing an array of ones and zeros representing the good and bad bands, respectively. The number of elements in BBL must be equal to the number of bands in the image. If no list of bad bands is available, BBL returns a value of -1.

I'm curious to see the header output generated by ENVI with a bbl list specified, to see whether it produces "1" or "1.0" for good bands.

Fixing this in spectral should be simple. It just needs to be decided whether the bbl list should be represented with type int or float and if float, should the value be restricted to 0's and 1's. If we go with float, it's just a matter of changing line 387 in envi.py from

h['bbl'] = [int(b) for b in h['bbl']]

to

h['bbl'] = [float(b) for b in h['bbl']]

If we want to maintain it as an int, then it could be changed to

h['bbl'] = [int(float(b)) for b in h['bbl']]

(though that would silently round any float values between 0 and 1 down to zero).
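To make the difference between the three variants concrete, here is a small sketch. The list `raw` is made up for illustration (it stands in for the strings parsed from the header's bbl field and is not taken from the AVIRIS-NG file), and the helper names are hypothetical, not part of envi.py.

```python
# Hypothetical raw strings, as they might look after splitting the bbl
# field of an ENVI header. The 0.3 entry is an artificial edge case.
raw = ["1.0", "0.0", "1.0", "0.3"]


def parse_int(values):
    """Current behaviour: int() rejects strings such as '1.0'."""
    return [int(b) for b in values]


def parse_float(values):
    """Keep whatever multiplier the header specifies."""
    return [float(b) for b in values]


def parse_int_via_float(values):
    """Accept '1.0', but truncate fractional values toward zero."""
    return [int(float(b)) for b in values]


try:
    parse_int(raw)
    int_parse_failed = False
except ValueError:
    # int("1.0") raises ValueError, which is the failure reported here.
    int_parse_failed = True

as_float = parse_float(raw)
as_int = parse_int_via_float(raw)
print(int_parse_failed, as_float, as_int)
```

The last variant is the one that loses information: the 0.3 entry survives in `as_float` but becomes 0 in `as_int`.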

Anyone have opinions on this?

lewismc commented 7 years ago

@davidraythompson do you have any opinion/knowledge of this? The issue at hand is explained above. Thanks for any comments.

donm commented 7 years ago

I asked a former ITT VIS employee, and he said that he has only ever seen 0s and 1s in the bbl entry. Since this is the first time any of us are seeing 0.0 and 1.0 floats in the bbl, even that is probably pretty rare. But I'll ask around at work some more tomorrow.

Just to confuse things a bit more, I'll point out that the bbl is really more of a "good bands list" than a "bad bands list." Internally in SPy, I don't know if it makes more sense to maintain the ENVI name and semantics or to switch to something like a "good bands" list that holds True and False values. If another spectral image format doesn't match ENVI and SPy is made to support reading that format, would we want to force the internal representation of the other format to match all of ENVI's idiosyncrasies?

If sticking with ENVI patterns, though, I'd vote for reading the floats and converting them to ints.

tboggs commented 7 years ago

@donm I don't know how prevalent the use of bbl is at this time. SPy didn't support handling it explicitly until a commit on 26 March, and that didn't make it into a release until 4 June (v0.19), so there wouldn't have been any issues for SPy users before then. Since this is coming from AVIRIS-NG, I suspect we'll start seeing it more often, assuming lots of people are using AVIRIS-NG and it always records it as a float (@lewismc - have you seen both int and float data types for the bbl array coming from AVIRIS-NG?).

I'm with you regarding the variable name. It would have been more appropriately named gbl.

The trade-off here seems to be between what is quick-and-easy (by "easy" I mean "low maintenance") and what provides more clarity and flexibility within SPy.

Reading bbl as floats would fix the current problem, wouldn't break anything currently in SPy, and wouldn't break later if someone used non-integral values, like 0.3 (though I don't know why you would want to do that).

Converting bbl to boolean values in a good band list would make things clearer within SPy and would simplify indexing image data. This would break with the definition referenced in my first link above, which describes the values as "multiplier values", though I'm not aware of any circumstance where they're actually used as such. The downside to this approach is the need to do the book-keeping to ensure that the good band list internal to SPy gets written out as a bbl when someone later saves the image in ENVI format.
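A rough sketch of what the boolean approach could look like, assuming NumPy arrays with bands on the last axis. The names `bbl`, `cube`, `good_bands` and `bbl_out` are illustrative only and are not existing SPy attributes; treating any non-zero multiplier as "good" is also an assumption.

```python
import numpy as np

# Hypothetical bbl values and a tiny fake image cube (rows, cols, bands).
bbl = [1.0, 0.0, 1.0, 1.0]
cube = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Assumption: any non-zero multiplier marks a good band.
good_bands = np.asarray(bbl, dtype=float) != 0

# Boolean indexing along the band axis keeps only the good bands.
good_cube = cube[:, :, good_bands]

# Book-keeping for saving in ENVI format: turn the mask back into a bbl.
bbl_out = [int(g) for g in good_bands]

print(good_cube.shape, bbl_out)
```

The round trip in the last step is the book-keeping cost mentioned above: the mask has to be converted back whenever the image is written out with an ENVI header.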

Slightly off-topic (but not entirely) - I've been considering the possibility of using a pandas DataFrame to manage all of the band-related metadata because it would make it much easier to manage existing band metadata, add new band-related attributes, and perform tasks like filtering images by band. I don't know if that would influence the choice of the two options, but I just wanted to mention it (I will also open a separate issue to discuss that topic).
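As a very rough illustration of the DataFrame idea (one row per band, one column per attribute), something like the following would work. The column names and wavelength values are invented for the example and do not reflect any existing SPy structure.

```python
import pandas as pd

# Hypothetical per-band metadata: one row per band.
bands = pd.DataFrame(
    {
        "wavelength_nm": [450.0, 550.0, 650.0, 1400.0],
        "bbl": [1.0, 1.0, 1.0, 0.0],
    }
)

# Adding a new band-related attribute is a single column assignment.
bands["good"] = bands["bbl"] != 0

# Filtering by band: indices of good bands below 600 nm.
selected = bands.index[bands["good"] & (bands["wavelength_nm"] < 600)].tolist()
print(selected)
```

The resulting index list could then be used to select bands from the image array.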

lewismc commented 7 years ago

Do you want me to edit #68 with float-to-int conversion or do you want to take care of it @tboggs ?

tboggs commented 7 years ago

I guess go ahead and do that, referencing this issue. We'll keep the issue open, pending feedback from @donm and whatever we decide is the ultimate solution. If we do end up making another change, we'll make sure 0.0 and 1.0 are still supported moving forward.

lewismc commented 4 years ago

Hi @tboggs, going to close this one off to clean up.