python-pillow / Pillow

Python Imaging Library (Fork)
https://python-pillow.org

FitsImagePlugin does not support RICE_1 compression #7771

Open JulianNotFound opened 10 months ago

JulianNotFound commented 10 months ago

What did you do?

I tried using Pillow to open a .fits file. The header in my FITS file has two parts, and its structure is like this:

BITPIX  =                   16 / number of bits per data pixel                  
NAXIS   =                    0 / number of data axes                            
EXTEND  =                    T / FITS dataset may contain extensions            
COMMENT   FITS (Flexible Image Transport System) format is defined in 'Astronomy
COMMENT   and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H 
END

XTENSION= 'BINTABLE'           / binary table extension                         
BITPIX  =                    8 / 8-bit bytes                                    
NAXIS   =                    2 / 2-dimensional binary table   
...
END                  

The fits file that I use comes from satellite observation data. Here's an example fits file that comes from the same satellite with the same header structure.

I wasn't sure whether this kind of header agrees with the FITS format standard, so I checked the FITS format document. In section 3.3.1, it says

The last header block must contain the END keyword, which marks the logical end of the header.

It means that the fits file I use agrees with the fits file standard.

I have just checked PR #5405; it seems that when reading the FITS file from disk, the code breaks at the first END.
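
To illustrate the layout described above, here is a minimal sketch of walking every header in a FITS byte string instead of stopping at the first END. It is not Pillow's code: `read_headers` and `parse_value` are hypothetical helpers, value parsing is heavily simplified, and the data-unit size uses the standard's formula for extensions (|BITPIX| / 8 × GCOUNT × (PCOUNT + product of NAXISn)).

```python
BLOCK = 2880  # FITS files are organised in 2880-byte blocks
CARD = 80     # each header record ("card") is 80 ASCII characters


def parse_value(field: str):
    """Convert a card's value field to str, bool, or int (simplified)."""
    field = field.strip()
    if field.startswith("'"):
        # Quoted string: keep everything up to the closing quote.
        return field[1:field.index("'", 1)].rstrip()
    field = field.split("/", 1)[0].strip()  # drop the trailing comment
    if field in ("T", "F"):
        return field == "T"
    try:
        return int(field)
    except ValueError:
        return field


def read_headers(data: bytes) -> list[dict]:
    """Return one dict per header found in the file, not just the first."""
    headers = []
    offset = 0
    while offset < len(data):
        header = {}
        while True:
            if offset + CARD > len(data):
                raise ValueError("header is missing its END card")
            card = data[offset:offset + CARD].decode("ascii")
            offset += CARD
            keyword = card[:8].strip()
            if keyword == "END":
                break
            if card[8:10] == "= ":
                header[keyword] = parse_value(card[10:])
        headers.append(header)
        offset = -(-offset // BLOCK) * BLOCK  # header is padded to a block

        # Skip this header's data unit (empty when NAXIS is 0).
        pixels = 0
        if header.get("NAXIS", 0):
            pixels = 1
            for axis in range(1, header["NAXIS"] + 1):
                pixels *= header[f"NAXIS{axis}"]
        nbytes = (abs(header.get("BITPIX", 8)) // 8
                  * header.get("GCOUNT", 1)
                  * (header.get("PCOUNT", 0) + pixels))
        offset += -(-nbytes // BLOCK) * BLOCK
    return headers
```

With a file shaped like mine, this would return two dictionaries: the first with NAXIS equal to 0, the second with XTENSION equal to 'BINTABLE'.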

What did you expect to happen?

Not receiving a "ValueError: No image data" while using Image.open. I expected the FITS image data to load.

What actually happened?

Image.open raises: ValueError: No image data

What are your OS, Python and Pillow versions?

JulianNotFound commented 10 months ago

I was trying to fix the bug in the source code, and I found a new bug.

My fits file is a compressed fits image (XTENSION= 'BINTABLE', TTYPE1 = 'COMPRESSED_DATA', and has keys like ZIMAGE, ZCMPTYPE, ...), but pillow seems to ignore this situation.

The shape of my original image is 4096x4096, but Pillow only reads the compressed image, with a shape of 4096x8.

radarhere commented 9 months ago

I would describe the situation here by saying that FitsImagePlugin doesn't currently support Binary Table extensions.

Pillow is reading the primary array header, seeing that there are 0 data axes, and then raising an error, because, well, we can't do much with a zero-dimensional image.

When you tried to read the second header, Pillow then found NAXIS1 = 8 in your file and concluded that the image is 8px wide. You are instead interested in what your file refers to as the original image, which has ZNAXIS1 = 4096. It seems to me like there are two images in your file, an 8-bit compressed one and a 16-bit original image.
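
As a rough sketch of the distinction between the two sets of keywords (not Pillow's implementation; `image_size` is a hypothetical helper, and the header is assumed to be already parsed into a dict), the original dimensions could be preferred whenever ZIMAGE is set:

```python
def image_size(header: dict) -> tuple[int, int]:
    """Return (width, height), preferring the original image's Z-keywords."""
    if header.get("ZIMAGE"):
        # Compressed image stored in a binary table: NAXIS1/NAXIS2 describe
        # the table, while ZNAXIS1/ZNAXIS2 describe the original image.
        return header["ZNAXIS1"], header["ZNAXIS2"]
    return header["NAXIS1"], header["NAXIS2"]
```

Reading NAXIS1 and NAXIS2 alone describes the table that holds the compressed bytes, which is where the 8px width comes from.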

How about this suggestion? What if Pillow reads your image, initially says that it is 0px wide by 0px high, and then, the user calls get_child_images(), which returns an array of the compressed image and the uncompressed image?

JulianNotFound commented 9 months ago

How about this suggestion? What if Pillow reads your image, initially says that it is 0px wide by 0px high, and then, the user calls get_child_images(), which returns an array of the compressed image and the uncompressed image?

In my opinion, the compressed image and the uncompressed image are the same image. The creator of the fits file only saved the compressed image and added those essential origin image parameters to the header.

Besides, I think the 0x0 image and the following compressed image are on the same level rather than in a parent-child relationship. I think skipping the 0-dimensional image and returning the following image would be better. If a FITS file does have two images (where the first is not 0-dimensional), Image.open could return a list with two Image objects. (Actually, I don't think that will happen.)

Here are the solutions from other Python Libraries. Astropy will return a list with two elements and Sunpy will return a single object.

radarhere commented 9 months ago

Looking at the spec, I found

The following describes the process for compressing n−dimensional FITS images and storing the resulting byte stream in a variable-length column in a FITS binary table, and for preserving the image header keywords in the table header. The general principle is to first divide the n−dimensional image into a rectangular grid of subimages or “tiles.” Each tile is then compressed as a block of data, and the resulting compressed byte stream is stored in a row of a variable-length column in a FITS binary table (see Sect. 7.3). By dividing the image into tiles it is possible to extract and decompress subsections of the image without having to decompress the whole image.

So, it seems like what I was reading in the header was not a second image, but instead an option to load the image piece by piece. In essence, you're right, there's only one image.
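
To make the tiling idea from the spec concrete, here is a small sketch that enumerates the tiles of a 2-D image as pixel boxes. `tile_boxes` is a hypothetical helper, and the tile shape is an input here rather than something read from the file.

```python
from itertools import product


def tile_boxes(image_shape, tile_shape):
    """Yield (left, upper, right, lower) boxes covering a 2-D image."""
    width, height = image_shape
    tile_w, tile_h = tile_shape
    rows = range(0, height, tile_h)
    cols = range(0, width, tile_w)
    for upper, left in product(rows, cols):
        # Tiles on the right and bottom edges may be smaller than the rest.
        yield (left, upper,
               min(left + tile_w, width), min(upper + tile_h, height))
```

For example, assuming one-row tiles (an assumption, not confirmed for this file), `list(tile_boxes((4096, 4096), (4096, 1)))` has 4096 entries, one per row of the original image, and each box could be decompressed independently.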

Two questions: 1) Out of interest, if you're aware of other Python libraries that can decode this image successfully, why are you pursuing Pillow? 2) This might seem strange, but do you have a version of this image in another format that you could attach here, for comparison purposes?

JulianNotFound commented 9 months ago
  1. Out of interest, if you're aware of other Python libraries that can decode this image successfully, why are you pursuing Pillow?

I am working on machine-learning image annotation. Most annotation tools do not support the FITS format, and it is very difficult to load a custom image format in them. I tried CVAT, which loads images with Pillow. That's why I want to use Pillow to load FITS files.

  2. This might seem strange, but do you have a version of this image in another format that you could attach here, for comparison purposes?

The original image is not very clear to view, so I often apply histogram equalization before viewing it.
This is the original image: https://github.com/JulianNotFound/fits/blob/main/origin.png
This is the histogram-equalized image: https://github.com/JulianNotFound/fits/blob/main/histogram_equalization.png
Here's the histogram equalization code that I use (the input is a 2-dimensional NumPy array):

import numpy as np


def image_histogram_equalization(image, number_bins=256):
    # get the normalized image histogram and its bin edges
    image_histogram, bins = np.histogram(image.flatten(), number_bins, density=True)
    cdf = image_histogram.cumsum()  # cumulative distribution function
    cdf = 255 * cdf / cdf[-1]  # normalize to the 0-255 range

    # use linear interpolation of cdf to find new pixel values
    image_equalized = np.interp(image.flatten(), bins[:-1], cdf)

    # restore the original shape and convert to 8-bit
    return image_equalized.reshape(image.shape).astype(np.uint8)

radarhere commented 8 months ago

Your image has RICE_1 compression. I've been having trouble figuring that out, so in the meantime, I've created #7894 to add support for something other than uncompressed data at least - GZIP_1 compression.
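
For readers unfamiliar with GZIP_1, here is a minimal sketch of decoding a single tile. It is not the code from #7894. It assumes the tile's bytes have already been extracted from the binary table, that the payload is a gzip stream, and that the decompressed pixels are big-endian integers. `decode_gzip1_tile` is a hypothetical helper.

```python
import gzip
import struct


def decode_gzip1_tile(payload: bytes, bitpix: int = 16) -> tuple:
    """Decompress one tile and unpack it as big-endian integers."""
    # struct format codes per pixel size; 8-bit FITS data is unsigned.
    codes = {8: "B", 16: "h", 32: "i"}
    raw = gzip.decompress(payload)
    count = len(raw) // (bitpix // 8)
    return struct.unpack(f">{count}{codes[bitpix]}", raw)
```

RICE_1 would need its own decoder in place of `gzip.decompress`; the rest of the tile handling would presumably stay the same.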