libvips / pyvips

python binding for libvips using cffi
MIT License

tiff2vips read error #516

Closed lyn1874 closed 1 week ago

lyn1874 commented 1 week ago

Hi, many thanks for the awesome project!

I'm working with some large .tif files ranging from 5GB to 16GB. Each file is in the format [3, height, width] with uint8 values. My goal is to convert these .tif files to .svs so I can read them with OpenSlide for a pretrained machine learning model. Here’s the command I’m using:

image = pyvips.Image.new_from_file("xxx.tif", access="sequential")
image.tiffsave("xxx.svs", tile=True, pyramid=True, compression="jpeg", Q=100, tile_width=800, tile_height=800, bigtiff=True)

For the smallest .tif image, this runs without any problems. However, for the larger ones, I get this error message:

pyvips.error.Error: unable to call tiffsave
  TIFFFillStrip: Read error at scanline 4294967295; got 2263871488 bytes, expected 4013289956

I assume this error is related to my RAM. Since I cannot add more RAM, is there anything else I can try to work around this issue? Thanks!

jcupitt commented 1 week ago

Hi @lyn1874,

It should be fine. I doubt memory is the issue, unless you have VERY little RAM (only a few GB).

If you're on Windows, there were some bugs with files over 4 GB. Updating to the latest version should fix this.

Does it work at the command-line? Try:

vips copy huge.tif x.tif[tile,pyramid,compression=jpeg,Q=100,tile-width=800,tile-height=800,bigtiff]

Could you run tiffinfo on your source file and post the output here? It's useful for diagnostics.

If you can get me a test file somehow I could try running it too.

jcupitt commented 1 week ago

... I meant to add, it won't write an SVS, it'll write a standard pyramidal TIFF.

SVS is a strange nonstandard format made by Aperio, and not easy to write correctly. Hopefully a standard pyramidal TIFF will work for you.

lyn1874 commented 1 week ago

Hi John, thanks for your fast reply!

I am on macOS, and I also tried the code on a Linux machine; both gave me a similar error.

Unfortunately I cannot upload an example image here due to privacy concerns. The tiffinfo output looks like this:

=== TIFF directory 0 ===
TIFF Directory at offset 0x10 (16)
  Image Width: 61966 Image Length: 64766
  Resolution: 1, 1 (unitless)
  Bits/Sample: 8
  Compression Scheme: None
  Photometric Interpretation: RGB color
  Samples/Pixel: 3
  Rows/Strip: 64766
  Planar Configuration: separate image planes
  ImageDescription: {"shape": [3, 64766, 61966]}
  Software: tifffile.py

I tried running the vips copy command, but I got a similar error:

(vips:73513): VIPS-WARNING **: 16:58:58.236: error in tile 0 x 0
xxx.tif: read error
system error: Invalid argument
TIFFFillStrip: Read error at scanline 4294967295; got 116391935 bytes, expected 4013289956
tiff2vips: read error

My vips version is vips-8.16.0, and pyvips on the remote Linux machine is 2.2.1.

Thanks for your help!

jcupitt commented 1 week ago

Read error at scanline 4294967295 sounds very suspicious: your file only has 64766 lines. It looks rather like 2 ** 32 - 1, which suggests an integer overflow somewhere.

You have:

Rows/Strip: 64766

That's a CRAZY value: your entire image has been saved as a single uncompressed strip. What software wrote this file? Could you get it to use a not-crazy value?

Edit: a not-crazy value would be something like 16.
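
The numbers in this diagnosis can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the width, height and Rows/Strip values from the tiffinfo output above and assuming one byte per sample per plane. It shows that the "expected" byte count in the error is exactly one full-height strip, that such a strip no longer fits in a signed 32-bit integer, and how small a 16-row strip is by comparison. It does not confirm where the overflow actually happens.

```python
# Values copied from the tiffinfo output and error message in this thread.
width = 61966
height = 64766
rows_per_strip = 64766  # the whole image height in one strip

# With 8-bit samples stored as separate planes, one strip of one plane
# holds width * rows_per_strip bytes.
strip_bytes = width * rows_per_strip
print(strip_bytes)  # 4013289956, the "expected" count in the error

# The suspicious scanline number is the largest unsigned 32-bit value,
# which is also what -1 looks like when reinterpreted as unsigned.
UINT32_MAX = 2**32 - 1
INT32_MAX = 2**31 - 1
print(UINT32_MAX)                # 4294967295
print((-1) % 2**32)              # 4294967295
print(strip_bytes > INT32_MAX)   # True: too big for a signed 32-bit int

# For comparison, a 16-row strip of the same image is under 1 MB.
small_strip_bytes = width * 16
print(small_strip_bytes)  # 991456
```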

jcupitt commented 1 week ago

Uncompressed strip TIFFs with all the data in a single strip are (in effect) memory dumps and you can read them without libtiff if you can figure out where the data starts.

So a solution might be to find the value of the first pixel with, e.g.:

$ vips getpoint k2.tif 0 0 
11 1 0 

So the first pixel in this file has RGB 11, 1, 0. You can find its location in the file, then load the image with rawload. Suppose that pixel is at byte offset 311 into the file, then:

vips rawload k2.tif x.tif[tile,pyramid,compression=jpeg,Q=100,tile-width=800,tile-height=800,bigtiff] 61966 64766 3 --offset 311 --interpretation sRGB

Should work.
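
A rough pyvips equivalent of the rawload approach is sketched below. Two things in it are assumptions rather than facts from this thread. First, the offset (311) is the "suppose" value from the command above, not a measured one; `tiffinfo -s` prints strip offsets and may help find the real one. Second, the tiffinfo output earlier in the thread reports separate image planes, so for that file the sketch reads the data as one tall single-band image and rejoins the planes, which only works if the three planes are stored back to back. The helper names `plane_regions`, `load_planar_raw` and `convert_raw_to_pyramid` are made up for illustration.

```python
def plane_regions(width, height, bands):
    """Return (left, top, width, height) crop boxes, one per plane,
    for planes stacked vertically in a single-band image."""
    return [(0, i * height, width, height) for i in range(bands)]


def load_planar_raw(path, width, height, bands, offset):
    """Read uncompressed 8-bit planar data straight from the file.

    Assumes the planes are contiguous, starting at `offset` bytes.
    """
    import pyvips  # imported here so the helper above has no dependencies

    # Treat the file as one band that is `bands` times taller than the image.
    stacked = pyvips.Image.rawload(path, width, height * bands, 1, offset=offset)
    planes = [stacked.crop(*box) for box in plane_regions(width, height, bands)]
    # Join the planes into one multi-band image and tag it as sRGB.
    return planes[0].bandjoin(planes[1:]).copy(interpretation="srgb")


def convert_raw_to_pyramid(src, dst, width, height, bands=3, offset=311):
    """Write a tiled pyramidal TIFF with the options used in this thread."""
    image = load_planar_raw(src, width, height, bands, offset)
    image.tiffsave(dst, tile=True, pyramid=True, compression="jpeg", Q=100,
                   tile_width=800, tile_height=800, bigtiff=True)
```

A call such as `convert_raw_to_pyramid("k2.tif", "x.tif", 61966, 64766)` would mirror the command above. For a file with interleaved (chunky) pixels, a direct `pyvips.Image.rawload(path, width, height, 3, offset=offset)` would be the closer match.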

lyn1874 commented 1 week ago

Hi, thanks a lot for this information. I changed the rows/strip to 512 and now I don't have the error message anymore. Again, many thanks!

jcupitt commented 1 week ago

Ah, great! Saving as a tiled TIFF is even better, if you can do that.
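
The thread does not say how the file was rewritten. Since the Software tag in the tiffinfo output is tifffile.py, one plausible route is tifffile's `imwrite`, sketched below under that assumption. It expects `data` to be a uint8 NumPy array of shape [3, height, width], as described in the first post. The function names are hypothetical, and the 512 values are the strip height reported above plus an arbitrary tile size (tifffile requires tile dimensions to be multiples of 16).

```python
def strip_size_bytes(width, rows_per_strip, bytes_per_sample=1):
    """Bytes in one strip of one plane for a given strip height."""
    return width * rows_per_strip * bytes_per_sample


def write_small_strips(data, path, rows_per_strip=512):
    """Write a planar RGB BigTIFF with a modest number of rows per strip."""
    import tifffile  # assumed writer, based on the Software tag

    tifffile.imwrite(path, data, bigtiff=True, photometric="rgb",
                     planarconfig="separate", rowsperstrip=rows_per_strip)


def write_tiled(data, path, tile=(512, 512)):
    """Write the same data as a tiled BigTIFF instead of strips."""
    import tifffile

    tifffile.imwrite(path, data, bigtiff=True, photometric="rgb",
                     planarconfig="separate", tile=tile)


# For the 61966-pixel-wide image in this thread, a 512-row strip is
# about 32 MB rather than about 4 GB.
print(strip_size_bytes(61966, 512))    # 31726592
print(strip_size_bytes(61966, 64766))  # 4013289956
```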