spectralpython / spectral

Python module for hyperspectral image processing
MIT License

OSError: Unable to determine file type or type not supported for bip image #121

Closed jankaWIS closed 1 year ago

jankaWIS commented 3 years ago

I have fluorescence microscopy data saved as a BIP file and I would like to load it into Python. I was following the documentation with:

from spectral import *
img = open_image('path_to_file/Frgrnd.bip')

but I'm getting an error:

~/anaconda3/lib/python3.8/site-packages/spectral/spectral.py in open_image(file)
    117         pass
    118 
--> 119     raise IOError('Unable to determine file type or type not supported.')
    120 
    121 

OSError: Unable to determine file type or type not supported.

Where is the problem and how can I fix it? I'm using: spectral 0.22.1 last updated: Fri Jan 29 2021

CPython 3.8.5 IPython 7.18.1 watermark 2.0.2

Thank you.

tboggs commented 3 years ago

If it is an ENVI-formatted file, then you should pass the name of the header file as the argument to open_image.

jankaWIS commented 3 years ago

If you mean:

import spectral.io.envi as envi
envi.open('path_to_file/Frgrnd.bip')

that gives:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
~/anaconda3/lib/python3.8/site-packages/spectral/io/envi.py in read_envi_header(file)
    119     try:
--> 120         starts_with_ENVI = f.readline().strip().startswith('ENVI')
    121     except UnicodeDecodeError:

~/anaconda3/lib/python3.8/codecs.py in decode(self, input, final)
    321         data = self.buffer + input
--> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
    323         # keep undecoded input until the next call

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb1 in position 17: invalid start byte

During handling of the above exception, another exception occurred:

FileNotAnEnviHeader                       Traceback (most recent call last)
<ipython-input-5-1933f00510ce> in <module>
      1 import spectral.io.envi as envi
----> 2 envi.open('path_to_file/Frgrnd.bip')

~/anaconda3/lib/python3.8/site-packages/spectral/io/envi.py in open(file, image)
    288 
    289     header_path = find_file_path(file)
--> 290     h = read_envi_header(header_path)
    291     check_compatibility(h)
    292     p = gen_params(h)

~/anaconda3/lib/python3.8/site-packages/spectral/io/envi.py in read_envi_header(file)
    123           'binary file).'
    124         f.close()
--> 125         raise FileNotAnEnviHeader(msg)
    126     else:
    127         if not starts_with_ENVI:

FileNotAnEnviHeader: File does not appear to be an ENVI header (appears to be a binary file).

tboggs commented 3 years ago

No, I mean that ENVI format includes an image file and a separate header file. open_image should be called using the header filename.

jankaWIS commented 3 years ago

I do not think I have that. I only have the bip file (it's not my file, I'm just getting the data).

tboggs commented 3 years ago

The ".bip" extension alone isn't sufficient to understand the format of the image data. At a minimum, the byte order, data type, and image dimensions need to be provided. Here's a minimal ENVI header that you could save to "path_to_file/Frgrnd.hdr" and try to open:

ENVI
samples = 200
lines = 100
bands = 5
header offset = 0
file type = ENVI Standard
data type = 4
interleave = bip
byte order = 0
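If many files need such a header, it could be generated rather than typed by hand. The following is only a minimal sketch using the standard library; the helper name `write_minimal_envi_header` and its defaults are hypothetical (not part of SPy), and the values must still match the actual data.

```python
import os
import tempfile

def write_minimal_envi_header(hdr_path, samples, lines, bands,
                              data_type=4, interleave="bip",
                              byte_order=0, header_offset=0):
    """Write a minimal ENVI text header describing a raw image file."""
    fields = [
        ("samples", samples),
        ("lines", lines),
        ("bands", bands),
        ("header offset", header_offset),
        ("file type", "ENVI Standard"),
        ("data type", data_type),
        ("interleave", interleave),
        ("byte order", byte_order),
    ]
    with open(hdr_path, "w", encoding="ascii", newline="\n") as f:
        # The first line of an ENVI header must be the word ENVI.
        f.write("ENVI\n")
        for name, value in fields:
            f.write(f"{name} = {value}\n")
    return hdr_path

# Usage: reproduce the example header above in a temporary directory.
hdr = write_minimal_envi_header(
    os.path.join(tempfile.mkdtemp(), "Frgrnd.hdr"),
    samples=200, lines=100, bands=5)
with open(hdr) as f:
    header_text = f.read()
print(header_text)
```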

A few notes about the format and example header:

If you're opening a bunch of those files and they're all structured the same, you could avoid writing the header file each time by hacking together something like this:

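import numpy as np
from spectral.io.bipfile import BipFile

def open_bip(bipfile, nrows, ncols, nbands, dtype='i2'):
    '''Open a headerless BIP file whose layout is already known.'''
    class Params:
        pass
    p = Params()
    p.byte_order = 0  # 0 = little-endian
    p.filename = bipfile
    p.nbands = nbands
    p.ncols = ncols
    p.nrows = nrows
    p.offset = 0  # number of bytes before the image data
    p.dtype = np.dtype(dtype).str # replace 'i2' with appropriate value

    return BipFile(p, {})

Just pass in the dimensions and data type that match your files.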

jankaWIS commented 3 years ago

Sorry for the bother, and thanks a lot for your help. I tried to open the bip file and there was a text header, but it did not look like what you wrote. It looks like this:

----------------------------  Date:09/14/2020   Time:15:36:54  ----------------------------
##########User ID:  0 ; Username:   admin ;Session ID:  w2XVkGZkJxkI3CIU9L0E+Bp33Jq81zVs5JDaOh3mjXQ= ###########
----------(Decrypted Session ID:    09/14/202015:36:54)----------
  Image Capture Source: In-Vivo Xtreme BI 4MP
  Camera Serial Number: 7021812
  Username: 
  Annotation:   
  Capture Time/Date:    15:36:53 on 09/14/2020
  Software Version: 7.5.2.22464

     Exposure Type: Standard Exposure
     Exposure Time: 60.000 sec.
     Exposure:  1 of 1
     Total Experiment Duration: 60.00 secs.
     X-binning: 2x Binning
     Y-binning: 2x Binning

     Illumination Source:   Multi-wavelength
     f-Stop:    1.10
     FOV:   190.0 mm
     Focal Plane:   0.0 mm
     Focal Reference:   Tray
     Vertical Resolution:   136 ppi
     Horizontal Resolution: 136 ppi
     Magnification Stage:   No
     Excitation Filter Description: 27. 750
     Emission Filter Description:   6. 830

     Lens Correction Done:  No
     Illumination Correction Done:  No
     Reference File:    NA
     Image Warping: Yes
     Co-registration Correction:    No

     Orientation:   Normal
     CCD Temperature:   0xC7
     Pixel Saturation Threshold:    97425
     Percent of Saturated Pixels:   ~0.00%
     Convert to Pico Watts/mm sq.:  false
     Convert to Photons/sec/mm sq:  false
     Convert to Radiance:   false
     Spectral Normalization:    true
     Lamp Brightness Correction:    true
     Conversion Factor: 1.486612
     Convert To X-Ray Density(XD):  false
     Modality:  Fluorescence
     Sensitivity:   High Speed
     Shutter Closed:    false
     Pre-exposure flush count:  0
     Instant DCR:   false

I tried just blindly copying this and saving it as .hdr, which gave me

FileNotAnEnviHeader: File does not appear to be an ENVI header (missing "ENVI"               at beginning of first line).

which I fixed by placing ENVI on the first line which in turn gave:

/Users/jan/anaconda3/lib/python3.8/site-packages/spectral/io/envi.py:175: UserWarning: Parameters with non-lowercase names encountered and converted to lowercase. To retain source file parameter name capitalization, set spectral.settings.envi_support_nonlowercase_params to True.
  warnings.warn(msg)

---------------------------------------------------------------------------
MissingEnviHeaderParameter                Traceback (most recent call last)
<ipython-input-5-8567fdbfef23> in <module>
      1 import spectral.io.envi as envi
----> 2 envi.open('/path_to_file/Frgrnd.hdr')

~/anaconda3/lib/python3.8/site-packages/spectral/io/envi.py in open(file, image)
    289     header_path = find_file_path(file)
    290     h = read_envi_header(header_path)
--> 291     check_compatibility(h)
    292     p = gen_params(h)
    293 

~/anaconda3/lib/python3.8/site-packages/spectral/io/envi.py in check_compatibility(header)
    249     for p in mandatory_params:
    250         if p not in header:
--> 251             raise MissingEnviHeaderParameter(p)
    252 
    253     if _has_frame_offset(header):

MissingEnviHeaderParameter: Mandatory parameter "lines" missing from header file.

So I guess 2 questions. Is this the correct header? How do I extract the needed params? And again, thanks a lot for your help! It's not my field and I'm just starting.

tboggs commented 3 years ago

That is not an ENVI-formatted header so it won't work to add "ENVI" at the top. Plus, it is missing all of the required parameters from the example header I posted.

If the sensor is the same as this, then it produces 2048x2048 pixel images so you could try 2048 for both the lines and samples parameters in the header. Do you know how many bands are in the image? If so, and you assume - for the moment - that the data offset is zero, you could compute the size of each data value as

data_size = file_size / (2048 * 2048 * num_bands)

Hopefully, data_size will be exactly 1, 2, 4, or 8. If that is the case, you could then go back to the list of data type codes and try values that correspond to the data size you computed.
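The size check above can be sketched as a small search over band counts. This is an illustration only: the function name `candidate_layouts` is hypothetical, it assumes a data offset of zero as stated above, and the code table lists only the real-valued ENVI data type codes (the complex types, codes 6 and 9, are left out).

```python
# ENVI "data type" codes grouped by bytes per value:
# 1=uint8, 2=int16, 3=int32, 4=float32, 5=float64,
# 12=uint16, 13=uint32, 14=int64, 15=uint64.
ENVI_CODES_BY_SIZE = {
    1: [1],
    2: [2, 12],
    4: [3, 4, 13],
    8: [5, 14, 15],
}

def candidate_layouts(file_size, nrows, ncols, max_bands=16):
    """Return (bands, bytes_per_value, envi_codes) combinations that fill
    file_size exactly, assuming the data starts at byte 0."""
    matches = []
    pixels = nrows * ncols
    for bands in range(1, max_bands + 1):
        values = pixels * bands
        if file_size % values == 0:
            size = file_size // values
            if size in ENVI_CODES_BY_SIZE:
                matches.append((bands, size, ENVI_CODES_BY_SIZE[size]))
    return matches

# A file holding exactly 2048 * 2048 * 4 bytes of pixel data:
for layout in candidate_layouts(2048 * 2048 * 4, 2048, 2048):
    print(layout)
```

An empty result suggests that the file contains extra bytes (a header or trailer) or that the assumed dimensions are wrong.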

jankaWIS commented 3 years ago

Hi, sorry for the delay, it took some time to get moving. You were right about the device, it is that one. Besides that, I tried opening the file in ImageJ (FiJi) and it gives 2048x2048, as you wrote. With the bands and the rest I'm struggling a bit. I'm attaching the metadata exported from ImageJ (btw, how could I access it with Python?). It seems that it is not BIP, since interleaved is false. Is that possible?

Then it looks like 32-bit float, and that is about all I am able to get from that (I do not see the bands). One more question: how do we account for binning? Some of the files have binning and some don't.

Thanks a lot again.

metadata.xlsx

tboggs commented 3 years ago

What is the exact size of the file?

jankaWIS commented 3 years ago

Shows this: 16 806 348 bytes (16,8 MB on disk)

tboggs commented 3 years ago

For 32-bit floating point values and 2048x2048 pixels, there should only be a single band of data in the image. Unfortunately, there isn't exactly enough space since there are a few bytes extra in the file:

16806348  -  (2048 * 2048 * 4) = 29132

In case the extra bytes are in the tail end of the file, you could try a header like this:

ENVI
samples = 2048
lines = 2048
bands = 1
header offset = 0
file type = ENVI Standard
data type = 4
interleave = bip
byte order = 0

and see if the resulting image looks reasonable. If that doesn't look right (particularly if you see diagonal stripes through a nonsensical image), try changing the header offset parameter to 29132 and see if that fixes it. In both cases, you might also want to try toggling byte order between 0 and 1.
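To check the two offset hypotheses outside of SPy, the raw values can be read directly. The sketch below uses only the standard library and assumes a single band of 32-bit floats; the helper names `extra_bytes` and `read_float32_band` are hypothetical, and the demo file is synthetic rather than real instrument data.

```python
import array
import os
import struct
import sys
import tempfile

def extra_bytes(file_size, nrows, ncols, nbands, itemsize):
    """Bytes in the file beyond the raw pixel data."""
    return file_size - nrows * ncols * nbands * itemsize

def read_float32_band(path, nrows, ncols, offset=0, big_endian=False):
    """Read one band of 32-bit floats starting `offset` bytes into the file.

    Returns a list of rows, each a list of floats.
    """
    values = array.array("f")
    with open(path, "rb") as f:
        f.seek(offset)
        values.fromfile(f, nrows * ncols)
    # Swap bytes if the file's byte order differs from this machine's.
    if big_endian != (sys.byteorder == "big"):
        values.byteswap()
    return [values[r * ncols:(r + 1) * ncols].tolist() for r in range(nrows)]

# Synthetic demo: a 12-byte fake header followed by a 2x3 float32 image.
path = os.path.join(tempfile.mkdtemp(), "demo.bip")
with open(path, "wb") as f:
    f.write(b"fake header!")
    f.write(struct.pack("<6f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0))

# Hypothesis: all extra bytes sit in front of the pixel data.
offset = extra_bytes(os.path.getsize(path), nrows=2, ncols=3,
                     nbands=1, itemsize=4)
band = read_float32_band(path, 2, 3, offset=offset)
print(offset, band)
```

If the extra bytes were at the tail instead, `offset=0` would be the value to try, which mirrors the two header variants suggested above.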

jankaWIS commented 3 years ago

I'm getting a new error now:

---------------------------------------------------------------------------
EnviDataFileNotFoundError                 Traceback (most recent call last)
<ipython-input-21-250f8719b309> in <module>
      1 from spectral import *
----> 2 img = open_image('/path_to_file/Frgrnd.hdr')
      3 #

~/anaconda3/lib/python3.8/site-packages/spectral/spectral.py in open_image(file)
     98     # Try to open it as an ENVI header file.
     99     try:
--> 100         return io.envi.open(pathname)
    101     except io.envi.FileNotAnEnviHeader:
    102         # It isn't an ENVI file so try another file type

~/anaconda3/lib/python3.8/site-packages/spectral/io/envi.py in open(file, image)
    313               'given header file. You can specify the data file by passing ' \
    314               'its name as the optional `image` argument to envi.open.'
--> 315             raise EnviDataFileNotFoundError(msg)
    316     else:
    317         image = find_file_path(image)

EnviDataFileNotFoundError: Unable to determine the ENVI data file name for the given header file. You can specify the data file by passing its name as the optional `image` argument to envi.open.

The same goes for envi.open('/Users/jan/Downloads/Ondra/i1/bricho/48h/Frgrnd.hdr'). I noticed that the header (the one I posted before) is there if I open the file as text, and when I copied it out, it was 2 463 bytes. The problem is that it merges continuously into the binary part, so I have no idea where it ends and hence what its size is. Using 2463 as the offset gave the same error (as did trying the other offset or playing with the byte order). The same goes for the tail: apparently there is some text there, but it's intermixed with the bytes, so I don't know exactly.

I was wondering, how does FiJi know all this? Because there I do not pass anything and it loads.

tboggs commented 3 years ago

".bip" isn't a recognized image file extension so try to open it explicitly like this:

img = spy.envi.open('/path_to_file/Frgrnd.hdr', image='/path_to_file/Frgrnd.bip')

jankaWIS commented 3 years ago

That seems to do the job, thanks a lot. Or at least now we have a different problem. The image is loaded but does not make sense:

plt.imshow(img[:,:,0])

[image: output of plt.imshow(img[:,:,0])]

I have tried playing with the params: bil, bip, and bsq give the same image, and byte order set to 1 gives nothing. Playing with the header offset changes the image, but I'm unsure how to find the correct offset (and do not want to try them all). I have tried 2463 and 29132 and a few others, but it's not very effective. Also, for some of them, like 2465, I need to flip the byte order to have something shown. How can I determine the offset?

tboggs commented 3 years ago

If the file extension is ".bip", you should stick with BIP, though in the case of a single band, it doesn't even matter which interleave you choose. An offset of 2463 seems highly unlikely (I would expect it to be a multiple of 4). For now, I would stick with either 0 or 29132 as the offset.
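One rough way to see why a misaligned offset or the wrong byte order produces garbage is to count how many decoded values look like plausible measurements. This is a heuristic sketch only, not something SPy provides: the function name `plausible_fraction` and the bounds `lo` and `hi` are arbitrary assumptions. It can flag a wrong byte order or an offset that is not a multiple of 4, but it cannot tell apart two offsets that are both multiples of 4 (such as 0 and 29132), since those only shift the image.

```python
import math
import struct

def plausible_fraction(raw, offset, byte_order="<", count=1000,
                       lo=1e-6, hi=1e9):
    """Fraction of the first `count` float32 values at `offset` that are
    zero, or finite with magnitude between `lo` and `hi`.

    `byte_order` is a struct prefix: "<" little-endian, ">" big-endian.
    """
    count = min(count, (len(raw) - offset) // 4)
    if count <= 0:
        return 0.0
    values = struct.unpack_from(f"{byte_order}{count}f", raw, offset)
    ok = sum(1 for v in values
             if v == 0.0 or (math.isfinite(v) and lo <= abs(v) <= hi))
    return ok / count

# Synthetic example: an 8-byte header followed by little-endian floats.
raw = b"HEADER!!" + struct.pack("<1000f", *[float(i) for i in range(1, 1001)])
scores = {(off, order): plausible_fraction(raw, off, order)
          for off in (8, 9) for order in ("<", ">")}
for key, score in sorted(scores.items()):
    print(key, round(score, 3))
```

On real data the bounds would need tuning, and a high score only means the values are not obviously garbage; the displayed image remains the real test.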

The image you displayed doesn't look inherently wrong, but it appears to be showing only two colors, so maybe the image is just saturated. You should take a look at the histogram of the data to see if there is a large range in the data and possibly some extreme outliers. If you use the spectral.imshow function, you can try clipping the histogram using the "stretch" kwarg to see if that makes a difference. You want to set the stretch limits to where most of the "good" data resides. For example, to apply a linear stretch between the 10th and 90th percentiles of the data (given as the fractions 0.1 and 0.9), you could use

spy.imshow(img[:, :, 0], stretch=(0.1, 0.9))

Depending on how many of the pixels lie in the tails of the distribution, you may need to adjust the stretch limits accordingly.
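To make the idea of a percentile stretch concrete, here is a minimal standard-library sketch. It is not SPy's actual implementation: the function name `linear_stretch` is hypothetical, it uses simple nearest-rank percentiles, and it treats the two limits as fractions of the cumulative distribution.

```python
def linear_stretch(values, low=0.1, high=0.9):
    """Linearly rescale values to [0, 1], clipping everything outside the
    `low` and `high` points of the cumulative distribution."""
    ordered = sorted(values)
    n = len(ordered)
    vmin = ordered[round(low * (n - 1))]
    vmax = ordered[round(high * (n - 1))]
    if vmax == vmin:
        # Degenerate case: no spread between the chosen limits.
        return [0.0 for _ in values]
    return [min(1.0, max(0.0, (v - vmin) / (vmax - vmin))) for v in values]

# Mostly small values plus one extreme outlier that would otherwise
# dominate the color scale.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 1e6]
stretched = linear_stretch(data)
print(stretched)
```

Without the clipping, the single outlier would map nearly every other pixel to almost the same color, which is one way an image can end up showing only two colors.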