Closed kaczmarj closed 1 year ago
Hi @kaczmarj
Happy to hear that tiffslide is useful to you!
Your benchmark is not testing a really useful scenario. When you run timeit with the same region, you hit openslide's and tiffslide's internal cache after the first call and in this scenario, you're effectively measuring (on the tiffslide side) how long PIL takes to convert a numpy array.
Benchmarking this stuff is not really simple, since you have to be aware of internal caches of your tools, and also of other non-obvious caches, like your operating system caching disk access, etc.
As mentioned in the readme, I recommend running the benchmark below, which tries to test accessing multiple different tiles on files, to simulate a more realistic use case.
OPENSLIDE_TESTDATA_DIR=/path/to/testdata/ python docs/generate_benchmark_plots.py
you can easily modify the files used to run the benchmark by changing: https://github.com/bayer-science-for-a-better-life/tiffslide/blob/63c86e9d4f168072bb75784e720d0d0acdacee0f/tiffslide/tests/test_benchmark.py#L15-L21
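A minimal sketch of the idea behind that benchmark, for timing reads outside the repo: pick many different regions so that repeated calls do not simply hit the tile cache. The helper names `random_locations` and `time_reads` are made up for illustration and are not part of tiffslide or openslide; `read_region` is assumed to be any callable with the `(location, level, size)` signature that both libraries share.

```python
import random
import time


def random_locations(dimensions, size, n, seed=0):
    """Return n random top-left corners such that each region fits in the slide."""
    rng = random.Random(seed)
    width, height = dimensions
    region_w, region_h = size
    return [
        (rng.randrange(0, width - region_w + 1), rng.randrange(0, height - region_h + 1))
        for _ in range(n)
    ]


def time_reads(read_region, locations, level, size):
    """Time one read per location and return the per-read durations in seconds."""
    durations = []
    for location in locations:
        start = time.perf_counter()
        read_region(location, level, size)
        durations.append(time.perf_counter() - start)
    return durations
```

With a slide object from either library, something like `time_reads(slide.read_region, random_locations(slide.dimensions, (512, 512), 100), 0, (512, 512))` gives one duration per distinct region. The operating system's disk cache can still warm up between runs, so this only removes the library-level cache from the picture.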
I'd be interested to see your results on the tcga files!
Cheers, Andreas :smiley:
let me add the TCGA file to the benchmark and test. thanks for the quick reply @ap-- !
hi @ap-- I added an SVS file from TCGA to the pytests and generated the plots. i am seeing a 4x increase in runtime for tiffslide vs openslide. it's interesting that this does not happen for CMU-2.svs... do you have any thoughts on why this could be? i can test other SVS slides from TCGA as well if you think that would be useful.
my only hypothesis at this point is that this is related to the image size. the tcga svs is 1.6 gb whereas the CMU SVS is 542 mb.
in test_benchmark.py, i set the FILES dictionary to
FILES = {
    "svs": "Aperio/CMU-2.svs",
    "generic": "Generic-TIFF/CMU-1.tiff",
    "tcga-svs": "TCGA-SVS/TCGA-3C-AALI-01Z-00-DX1.F6E9A5DF-D8FB-45CF-B4BD-C6B76294C291.svs",
}
i tested two different tcga slides of different sizes but it seems that openslide is much faster than tifffile for both of these images. my hypothesis of image size being related to the speed does not seem to be correct.
by the way, i am on a debian 12 linux system with python 3.10.12 and glibc version 2.36.
$ uname -a
Linux dash 6.1.0-9-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.27-1 (2023-05-08) x86_64 GNU/Linux
Is there a difference in the compression used by these files?
yes there is a difference in compression. i used tiffinfo (from libtiff) to get this info. CMU-2.svs uses JPEG compression whereas the TCGA svs file uses compression scheme 33005 (which apparently is a specific type of JPEG 2000). OpenSlide has some notes about this compression scheme (from https://openslide.org/formats/aperio/):
JPEG 2000 (compression types 33003 or 33005)
Some Aperio files use compression type 33003 or 33005. Images using this compression need to be decoded as a JPEG 2000 codestream. For 33003: YCbCr format, possibly with a chroma subsampling of 4:2:2. For 33005: RGB format. Note that the TIFF file may not encode the colorspace or subsampling parameters in the PhotometricInterpretation field, nor the YCbCrSubsampling field, even though the TIFF standard seems to require this. The correct subsampling can be found in the JPEG 2000 codestream.
here are the tiff details for CMU-2.svs and TCGA-05-4395-01Z-00-DX1.20205276-ca16-46b2-914a-fe5e576a5cf9.svs. please click on the arrows to expand the output.
in the TCGA SVS, TIFF directory 1 uses JPEG compression. perhaps by forcing a read from directory 1 we can test whether difference in compression is the culprit. if we read from directory 1 and tiffslide is still slower than openslide, there could be something in addition to compression differences. but if the speed matches/exceeds openslide, then the compression is the cause.
but directory 1 of the TCGA SVS only has size 1024x568 WxH. perhaps that's the thumbnail. it does not come up as an image level in openslide or tiffslide.
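For checking the compression of each TIFF directory without extra tools, here is a small sketch that walks the directories of a classic TIFF and reads tag 259 (Compression). It is an illustration only: the function name is hypothetical, it assumes a classic TIFF (magic number 42, not BigTIFF), and the names for 33003 and 33005 are taken from the OpenSlide notes quoted above.

```python
import struct

# Tag 259 is Compression in the TIFF specification.
COMPRESSION_TAG = 259
# 33003 and 33005 are the Aperio JPEG 2000 schemes described in the OpenSlide notes.
KNOWN = {
    1: "uncompressed",
    5: "LZW",
    7: "JPEG",
    33003: "Aperio JPEG 2000 YCbCr",
    33005: "Aperio JPEG 2000 RGB",
}


def compression_per_directory(data):
    """Return the Compression value of every directory in a classic TIFF buffer."""
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file")
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled in this sketch")
    result = []
    while offset:
        (count,) = struct.unpack_from(endian + "H", data, offset)
        compression = 1  # TIFF default when the tag is absent
        for i in range(count):
            entry = offset + 2 + 12 * i
            tag, _type, _n = struct.unpack_from(endian + "HHI", data, entry)
            if tag == COMPRESSION_TAG:
                # a single SHORT is stored left-justified in the 4-byte value field
                (compression,) = struct.unpack_from(endian + "H", data, entry + 8)
        result.append(compression)
        # the offset of the next directory follows the last entry; 0 ends the chain
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)
    return result
```

For a large slide it is probably better to pass a read-only `mmap.mmap` of the file than to read the whole file into memory. `KNOWN.get(value, value)` turns each result into a readable name where one is listed.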
Hmm, my tests indicate both images seem to store uncompressed tiles...
# pip install pado
# pip install aiohttp requests s3fs
import json
from pprint import pprint
from pado.images.ids import ImageId
from pado.images.providers import ImageProvider
from pado.io.files import urlpathlike_to_fsspec
from tiffslide import TiffSlide
import matplotlib.pyplot as plt
ip = ImageProvider.from_parquet(
    "zip:///tcga.image.parquet::https://github.com/ap--/pado-tcga/releases/download/v0.0.1/pado-tcga-dataset.zip"
)
image_ids = [
    ImageId(
        '2aa283f3-732c-4879-8d37-1fec3ccf5bdc',
        'TCGA-05-4395-01Z-00-DX1.20205276-ca16-46b2-914a-fe5e576a5cf9.svs',
        site='tcga',
    ),
    ImageId(
        'd46167af-6c29-49c7-95cf-3a801181aca4',
        'TCGA-3C-AALI-01Z-00-DX1.F6E9A5DF-D8FB-45CF-B4BD-C6B76294C291.svs',
        site='tcga',
    ),
]
for iid in image_ids:
    img = ip[iid]
    of = urlpathlike_to_fsspec(img.urlpath)
    # check via tiffslide
    ts = TiffSlide(of)
    print(iid)
    pprint(json.loads(ts.zarr_group.store["0/.zarray"]))
    fig = plt.figure()
    w, h = ts.dimensions
    plt.imshow(ts.read_region((w//2, h//2), 0, (1000, 1000), as_array=True))
    plt.show()
output:
ImageId('2aa283f3-732c-4879-8d37-1fec3ccf5bdc', 'TCGA-05-4395-01Z-00-DX1.20205276-ca16-46b2-914a-fe5e576a5cf9.svs', site='tcga')
{'chunks': [256, 256, 3],
'compressor': None,
'dtype': '|u1',
'fill_value': 0,
'filters': None,
'order': 'C',
'shape': [26880, 48384, 3],
'zarr_format': 2}
ImageId('d46167af-6c29-49c7-95cf-3a801181aca4', 'TCGA-3C-AALI-01Z-00-DX1.F6E9A5DF-D8FB-45CF-B4BD-C6B76294C291.svs', site='tcga')
{'chunks': [256, 256, 3],
'compressor': None,
'dtype': '|u1',
'fill_value': 0,
'filters': None,
'order': 'C',
'shape': [74432, 101184, 3],
'zarr_format': 2}
If that turns out to be true, it would mean that there's just too much python overhead in reading uncompressed tiles from disk via zarr. We'd need some profiling to be sure about that and a potential solution would be to try if we can just shortcut for local uncompressed files. I have a test implementation of a memory mapped zarr store for local files lying around somewhere. I'll try to find it. Will report back in the coming days.
Cheers, Andreas :smiley:
thanks @ap-- that's very helpful. i also see that tiffslide reports no compression:
code:
import json, tiffslide
tslide = tiffslide.TiffSlide("TCGA-05-4395-01Z-00-DX1.20205276-ca16-46b2-914a-fe5e576a5cf9.svs")
json.loads(tslide.zarr_group.store["0/.zarray"])
output:
{'chunks': [256, 256, 3],
'compressor': None,
'dtype': '|u1',
'fill_value': 0,
'filters': None,
'order': 'C',
'shape': [26880, 48384, 3],
'zarr_format': 2}
but exiftool also shows that JPEG2000 compression is used.
$ git clone https://github.com/exiftool/exiftool.git
$ cd exiftool
$ ./exiftool ../TCGA-05-4395-01Z-00-DX1.20205276-ca16-46b2-914a-fe5e576a5cf9.svs
ExifTool Version Number : 12.64
File Name : TCGA-05-4395-01Z-00-DX1.20205276-ca16-46b2-914a-fe5e576a5cf9.svs
Directory : ..
File Size : 191 MB
File Modification Date/Time : 2023:07:08 09:34:00-04:00
File Access Date/Time : 2023:07:08 09:37:25-04:00
File Inode Change Date/Time : 2023:07:08 09:37:08-04:00
File Permissions : -rw-r--r--
File Type : TIFF
File Type Extension : tif
MIME Type : image/tiff
Exif Byte Order : Little-endian (Intel, II)
Image Width : 48384
Image Height : 26880
Bits Per Sample : 8 8 8
Compression : Aperio JPEG 2000 RGB
Photometric Interpretation : RGB
Image Description : Aperio Image Library v11.0.37..48384x26880 (256x256) J2K/KDU Q=70;Mirax Digital Slide|AppMag = 20|MPP = 0.23250
Samples Per Pixel : 3
Planar Configuration : Chunky
Strip Offsets : (Binary data 359 bytes, use -b option to extract)
Rows Per Strip : 16
Strip Byte Counts : (Binary data 173 bytes, use -b option to extract)
JPEG Tables : (Binary data 289 bytes, use -b option to extract)
Y Cb Cr Sub Sampling : YCbCr4:2:0 (2 2)
Subfile Type : Full-resolution image
Tile Width : 256
Tile Length : 256
Tile Offsets : (Binary data 839 bytes, use -b option to extract)
Tile Byte Counts : (Binary data 464 bytes, use -b option to extract)
Image Depth : 1
Page Count : 4
Image Size : 48384x26880
Megapixels : 1300.6
> i also see that tiffslide reports no compression
That's because tifffile.ZarrTiffStore is just a thin wrapper around a tifffile.TiffFile instance. The store transparently handles all the file access, decompression, predictors, unpacking, padding, etc. Zarr/numcodecs would not be able to handle all the cases found in TIFF.
> it seems that openslide is much faster than tifffile
On my aging Windows system, the difference is much less:
I suspect the difference could be due to differences in JPEG2000 decoders. For example, imagecodecs does not enable OpenJPEG multi-threading by default. I'll check if that's significant...
I am surprised that tiffslide/tifffile/zarr perform competitively. There are many, many layers of pure Python code...
> imagecodecs does not enable OpenJPEG multi-threading by default. I'll check if that's significant...
It turns out that enabling multi-threading makes things significantly worse :(
Maybe some basic profiling could help discern whether the time spent on other things is dominant or whether this is really a case of differences between how imagecodecs and openslide wrap OpenJPEG to decode JP2K.
> I am surprised that tiffslide/tifffile/zarr perform competitively. There are many, many layers of pure Python code...
i am also surprised and impressed that this implementation performs competitively!
i realize my words might have unintentionally come across as negative or offensive towards tiffslide/tifffile and i want to be clear that i do not imply any negativity here. i hold tremendous respect for tiffslide and tifffile (and all of your work @cgohlke !).
> It turns out that enabling multi-threading makes things significantly worse :(
that is unfortunate 😢
> Maybe some basic profiling could help
i ran python's cProfile on the read_region method in tiffslide and openslide. this doesn't capture the C bits in openslide unfortunately (and i don't know how to do that). when profiling TiffSlide.read_region, most of the time was spent in the function imagecodecs._jpeg2k.jpeg2k_decode. the results are below. i truncated the tiffslide profiling results to ~25 function calls. i also replaced the path to my python installation with 'path/to' to make the lines shorter.
code:
import cProfile, pstats, tiffslide
tslide = tiffslide.TiffSlide("TCGA-05-4395-01Z-00-DX1.20205276-ca16-46b2-914a-fe5e576a5cf9.svs")
with cProfile.Profile() as pr:
    tslide.read_region(location=(14_000, 12_000), level=0, size=(512, 512))
stats = pstats.Stats(pr).sort_stats("tottime")
stats.print_stats()
output:
6732 function calls (6304 primitive calls) in 0.044 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
18 0.039 0.002 0.039 0.002 {imagecodecs._jpeg2k.jpeg2k_decode}
152/12 0.001 0.000 0.001 0.000 {built-in method _abc._abc_subclasscheck}
4 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/tifffile/tifffile.py:11745(__init__)
2 0.000 0.000 0.000 0.000 {built-in method _imp.create_dynamic}
2/1 0.000 0.000 0.001 0.001 {built-in method _imp.exec_dynamic}
18 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/tifffile/tifffile.py:12944(_indices)
9 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/zarr/core.py:1862(_process_chunk)
1 0.000 0.000 0.000 0.000 {built-in method PIL._imaging.fill}
3 0.000 0.000 0.001 0.000 path/to/python3.11/site-packages/tifffile/tifffile.py:7770(__init__)
43 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/tifffile/tifffile.py:10631(fromfile)
19 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/tifffile/tifffile.py:12897(_parse_key)
18 0.000 0.000 0.040 0.002 path/to/python3.11/site-packages/tifffile/tifffile.py:12836(_getitem)
45 0.000 0.000 0.000 0.000 {method 'read' of '_io.BufferedReader' objects}
301/230 0.000 0.000 0.000 0.000 path/to/python3.11/json/encoder.py:334(_iterencode_dict)
24/6 0.000 0.000 0.004 0.001 path/to/python3.11/functools.py:981(__get__)
1 0.000 0.000 0.000 0.000 {method 'decode' of 'ImagingDecoder' objects}
1 0.000 0.000 0.040 0.040 path/to/python3.11/site-packages/zarr/core.py:1257(_get_selection)
748 0.000 0.000 0.001 0.000 {built-in method builtins.isinstance}
149 0.000 0.000 0.000 0.000 {built-in method _struct.unpack}
8 0.000 0.000 0.000 0.000 path/to/python3.11/enum.py:241(__set_name__)
43 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/tifffile/tifffile.py:10793(_process_value)
18 0.000 0.000 0.039 0.002 path/to/python3.11/site-packages/tifffile/tifffile.py:8574(decode_image)
1 0.000 0.000 0.001 0.001 path/to/python3.11/site-packages/tifffile/tifffile.py:12332(__init__)
135/127 0.000 0.000 0.000 0.000 {built-in method builtins.getattr}
1 0.000 0.000 0.001 0.001 path/to/python3.11/site-packages/tifffile/tifffile.py:7288(_load)
230 0.000 0.000 0.000 0.000 path/to/python3.11/json/encoder.py:414(_iterencode)
5 0.000 0.000 0.000 0.000 path/to/python3.11/typing.py:1896(_get_protocol_attrs)
[truncated]
code:
import cProfile, pstats, openslide
oslide = openslide.OpenSlide("TCGA-05-4395-01Z-00-DX1.20205276-ca16-46b2-914a-fe5e576a5cf9.svs")
with cProfile.Profile() as pr:
    oslide.read_region(location=(14_000, 12_000), level=0, size=(512, 512))
stats = pstats.Stats(pr).sort_stats("tottime")
stats.print_stats()
output:
30 function calls in 0.026 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.026 0.026 0.026 0.026 path/to/python3.11/site-packages/openslide/lowlevel.py:300(read_region)
1 0.000 0.000 0.000 0.000 {built-in method openslide._convert.argb2rgba}
1 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/openslide/lowlevel.py:186(_load_image)
1 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/openslide/lowlevel.py:222(_check_error)
1 0.000 0.000 0.000 0.000 {built-in method PIL._imaging.fill}
2 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/PIL/Image.py:505(_new)
1 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/PIL/Image.py:2955(frombuffer)
1 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/PIL/Image.py:2878(new)
2 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/PIL/Image.py:2857(_check_size)
2 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/openslide/lowlevel.py:129(from_param)
1 0.000 0.000 0.000 0.000 {built-in method PIL._imaging.map_buffer}
3 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/PIL/Image.py:481(__init__)
1 0.000 0.000 0.026 0.026 path/to/python3.11/site-packages/openslide/__init__.py:226(read_region)
1 0.000 0.000 0.000 0.000 path/to/python3.11/cProfile.py:118(__exit__)
2 0.000 0.000 0.000 0.000 path/to/python3.11/site-packages/openslide/lowlevel.py:214(_check_string)
3 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
3 0.000 0.000 0.000 0.000 {built-in method builtins.len}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2 0.000 0.000 0.000 0.000 {method 'copy' of 'dict' objects}
> i realize my words might have unintentionally come across as negative or offensive towards tiffslide/tifffile
Oh no. I did not understand it like that. I am interested in learning about such issues.
> most of the time was spent in the function imagecodecs._jpeg2k.jpeg2k_decode
That's good to know. The tiles are relatively small (256x256) for JPEG 2000. Compared to an implementation in all C, such as openslide, for decoding a single tile there might be overheads from 1. calling the C function from Python, 2. creating a new instance of the OpenJPEG decoder in every call, 3. releasing the GIL, and 4. creating and copying image data into a numpy array. I'll try to enable Cython profiling https://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html and see...
> there might be overheads from 1. calling the C function from Python, 2. creating a new instance of the OpenJPEG decoder in every call, 3. releasing the GIL, and 4. creating and copying image data into a numpy array.
None of these seem significant in this case. Almost all the time is spent in OpenJPEG's opj_decode function. I rebuilt OpenJPEG with AVX2 extensions, but that made no difference on my system either :(
imagecodecs._jpeg2k.jpeg2k_decode is run twice as many times as openslide's jp2k decoder and this could potentially explain the longer runtime.
when i profiled tiffslide.TiffSlide.read_region, i noticed that imagecodecs._jpeg2k.jpeg2k_decode was being called multiple times. this makes sense as it's decoding multiple tiles. i sought to measure the number of times openslide's jpeg2k decoder was run. to do this, i cloned openslide and added a print statement to line 59 of openslide-decode-jp2k.c. it seems that that function is run with every call to the openjpeg decoder.
git clone https://github.com/openslide/openslide
cd openslide
git checkout v3.4.1
# add print statement to line 59
sed -i '59i printf("Running unpack_argb\\n");' src/openslide-decode-jp2k.c
# build openslide
autoreconf -i
./configure
make
after building openslide, i copied the resulting library libopenslide.so.0.4.1 into my conda environment containing tiffslide and openslide (replacing the original openslide downloaded from conda-forge).
import openslide
oslide = openslide.OpenSlide("TCGA-3C-AALI-01Z-00-DX1.F6E9A5DF-D8FB-45CF-B4BD-C6B76294C291.svs")
oslide.read_region((0, 0), 0, (128, 128))
# prints:
# Running unpack_argb
oslide.read_region((14_000, 12_000), 0, (512, 512))
# prints:
# Running unpack_argb
# Running unpack_argb
# Running unpack_argb
# Running unpack_argb
# Running unpack_argb
# Running unpack_argb
# Running unpack_argb
# Running unpack_argb
# Running unpack_argb
interestingly, if i profile TiffSlide.read_region to count the number of times imagecodecs._jpeg2k.jpeg2k_decode is called, then it is 2 in the first case and 18 in the second case. openslide called the openjpeg decoder 1 time and 9 times for the same regions.
import cProfile, pstats, tiffslide
tslide = tiffslide.TiffSlide("TCGA-3C-AALI-01Z-00-DX1.F6E9A5DF-D8FB-45CF-B4BD-C6B76294C291.svs")
with cProfile.Profile() as pr:
    tslide.read_region(location=(0, 0), level=0, size=(128, 128))
stats = pstats.Stats(pr).sort_stats("tottime")
stats.print_stats()
# ncalls tottime percall cumtime percall filename:lineno(function)
# 2 0.005 0.002 0.005 0.002 {imagecodecs._jpeg2k.jpeg2k_decode}
# [truncated]
with cProfile.Profile() as pr:
    tslide.read_region(location=(14_000, 12_000), level=0, size=(512, 512))
stats = pstats.Stats(pr).sort_stats("tottime")
stats.print_stats()
# ncalls tottime percall cumtime percall filename:lineno(function)
# 18 0.041 0.002 0.041 0.002 {imagecodecs._jpeg2k.jpeg2k_decode}
# [truncated]
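Rather than reading the call count off the printed table, it can be pulled out of the profile programmatically. This is a small sketch using only the standard library; `count_calls` is a made-up helper name, and it relies on the `stats` dictionary that `pstats.Stats` exposes, whose keys are `(filename, line number, function name)` and whose values begin with the primitive and total call counts.

```python
import cProfile
import pstats


def count_calls(profile, name):
    """Sum the total call counts of all profiled functions whose name contains `name`."""
    stats = pstats.Stats(profile).stats
    return sum(
        total_calls
        for (_file, _line, func), (_primitive, total_calls, *_rest) in stats.items()
        if name in func
    )
```

After a `with cProfile.Profile() as pr:` block like the ones above, `count_calls(pr, "jpeg2k_decode")` should give the same number that appears in the `ncalls` column for the decoder.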
Good catch. This code requests the following keys from the Zarr store:
from tifffile import imread
im = imread(
    'TCGA-3C-AALI-01Z-00-DX1.F6E9A5DF-D8FB-45CF-B4BD-C6B76294C291.svs',
    selection=(slice(14_000, 14_512), slice(12_000, 12_512)),
)
0/54.46.0
0/54.46.0
0/54.47.0
0/54.47.0
0/54.48.0
0/54.48.0
0/55.46.0
0/55.46.0
0/55.47.0
0/55.47.0
0/55.48.0
0/55.48.0
0/56.46.0
0/56.46.0
0/56.47.0
0/56.47.0
0/56.48.0
0/56.48.0
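The keys listed above follow from simple tile arithmetic: a 512x512 region starting at (14000, 12000) on a 256x256 tile grid touches a 3x3 block of tiles, so 9 distinct keys are expected and each one shows up twice. A minimal sketch of that calculation, with a hypothetical helper name:

```python
def tiles_covering(start, size, tile=256):
    """Return the (row, col) tile indices touched by a region.

    start is the (row, col) of the top-left pixel, size is (rows, cols).
    """
    row0, col0 = start
    rows, cols = size
    first_row, last_row = row0 // tile, (row0 + rows - 1) // tile
    first_col, last_col = col0 // tile, (col0 + cols - 1) // tile
    return [
        (r, c)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]
```

`tiles_covering((14_000, 12_000), (512, 512))` runs from `(54, 46)` to `(56, 48)`, matching the `0/54.46.0` through `0/56.48.0` keys, while a 128x128 region at the origin touches a single tile.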
The issue is that Zarr's KVStore, which is used to wrap ZarrTiffStore, does not have a __contains__ method such that key in store is routed through __getitem__, which triggers decoding...
I think it's a bug in Zarr that is easy to fix.
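The mechanism can be reproduced with the standard library alone. The classes below are hypothetical stand-ins, not Zarr's or tifffile's code: `DecodingStore` plays the role of a store whose `__getitem__` is expensive, `NaiveWrapper` is a mapping wrapper without its own `__contains__`, and `FixedWrapper` delegates the membership test. The sketch assumes the caller tests `key in store` before reading, which is consistent with every key appearing twice above. `collections.abc.Mapping` supplies a default `__contains__` that calls `__getitem__` and catches `KeyError`, which is why the naive wrapper decodes twice.

```python
from collections.abc import Mapping


class DecodingStore(Mapping):
    """Stand-in for a store where every __getitem__ decodes a tile."""

    def __init__(self, keys):
        self._keys = set(keys)
        self.decode_calls = 0

    def __getitem__(self, key):
        if key not in self._keys:
            raise KeyError(key)
        self.decode_calls += 1  # pretend this is the expensive decode
        return b"decoded:" + key.encode()

    def __contains__(self, key):
        return key in self._keys  # cheap membership test, no decoding

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)


class NaiveWrapper(Mapping):
    """Wrapper without __contains__: `key in wrapper` falls back to __getitem__."""

    def __init__(self, inner):
        self._inner = inner

    def __getitem__(self, key):
        return self._inner[key]

    def __iter__(self):
        return iter(self._inner)

    def __len__(self):
        return len(self._inner)


class FixedWrapper(NaiveWrapper):
    """Same wrapper, but the membership test is delegated to the inner store."""

    def __contains__(self, key):
        return key in self._inner


def read_chunk(store, key):
    """Assumed access pattern: test for the key, then read it."""
    if key in store:
        return store[key]
    return None
```

Reading one key through `NaiveWrapper` increments `decode_calls` twice, while `FixedWrapper` increments it once, which mirrors the 18 versus 9 decoder calls seen in the profiles.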
With the fix I get this:
wow that's fantastic! thanks @cgohlke. should i open an issue in the zarr-python github repo?
> should i open an issue in the zarr-python github repo?
I'm on it.
Ha! That's great :smiley: I guess I'll have to update the benchmarks in the readme once a new version of zarr is released :smile:
Thanks everyone!
> i am seeing a 4x in runtime for tiffslide vs openslide
The Zarr issue accounts for a ~2x difference. Where does the other 2x come from? I don't see that on Windows, where the OS cache is not reset. Could also be a difference in how OpenJPEG is compiled. What versions of tifffile and imagecodecs were used and how were they installed?
> Where does the other 2x come from?
i was probably mistaken earlier when i said 4x, though i still do see that tiffslide is a bit slower than openslide when installed via pip. when installed via conda, tiffslide is faster!
> What versions of tifffile and imagecodecs were used and how were they installed?
i tested installations via pip and via conda/mamba. i include the versions of the packages in each environment below (click on the arrows to show the versions). in both cases, tifffile==2023.7.4 but in the pip environment, imagecodecs==2023.7.4 whereas in conda imagecodecs==2023.1.23 is used (i could not install a newer version). i will re-run this using the same versions in all environments and will update.
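A quick way to record the package versions of an environment when comparing setups like these is `importlib.metadata` from the standard library. This is only a sketch; the function name and the default package list are assumptions chosen to match the packages discussed in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(
    packages=("tiffslide", "tifffile", "imagecodecs", "zarr", "openslide-python"),
):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found
```

Printing `installed_versions()` in each environment gives a comparable listing. It reports Python package versions only and says nothing about how bundled native libraries were built.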
i patched zarr.KVStore in tiffslide's __init__.py file as follows:
from zarr.storage import KVStore
def _zarr_kvstore___contains__(self, key):
    return key in self._mutable_mapping
KVStore.__contains__ = _zarr_kvstore___contains__
i also used test data from openslide test data and TCGA:
images/
├── Aperio
│   └── CMU-2.svs
└── TCGA-SVS
    └── TCGA-3C-AALI-01Z-00-DX1.F6E9A5DF-D8FB-45CF-B4BD-C6B76294C291.svs
code:
sudo apt install libopenslide0 # installs libopenslide0/stable,now 3.4.1+dfsg-6+b1 amd64
git clone https://github.com/bayer-science-for-a-better-life/tiffslide
cd tiffslide
~/mambaforge/bin/python3.10 -m venv venv
source ./venv/bin/activate
python -m pip install -U pip setuptools wheel
python -m pip install -e .[dev] matplotlib pandas openslide-python pytest-benchmark
OPENSLIDE_TESTDATA_DIR=images/ python docs/generate_benchmark_plots.py
on my debian bookworm machine, libopenslide is linked to libopenjp2.so.7 (pulled as a dependency from https://packages.debian.org/bookworm/libopenjp2-7).
results:
code:
git clone https://github.com/bayer-science-for-a-better-life/tiffslide
cd tiffslide
mamba env create -f environment.devenv.yml # from tiffslide's repo
mamba activate tiffslide
mamba install openslide openslide-python matplotlib pandas
OPENSLIDE_TESTDATA_DIR=images/ python docs/generate_benchmark_plots.py
results:
the difference comes down to imagecodecs from conda-forge and imagecodecs from pypi. using the one from pypi, tiffslide is slower than openslide on the TCGA SVS file i am debugging with.
in my previous test, the conda/mamba environment had the best speeds for tiffslide. in that conda environment, i pip installed imagecodecs==2023.1.23 and then tiffslide became almost 2x slower (~2.5 ms to ~4.9 ms).
aha! the culprit is the different libopenjp2.so.2.5.0 that is pulled in when using pip versus conda. to test this, i first installed all tiffslide dependencies with mamba/conda (with imagecodecs==2023.1.23). using that, tiffslide was faster than openslide for tcga-svs. then i pip installed imagecodecs==2023.1.23, and tiffslide became slower than openslide for tcga-svs. finally, i copied the file libopenjp2.so.2.5.0 that was downloaded from conda-forge into the directory ~/mambaforge/envs/tiffslide/lib/python3.11/site-packages/imagecodecs.libs/ and i essentially overwrote the previous version which was named libopenjp2-fc287c52.so.2.5.0. using the openjpeg from conda-forge, tiffslide was faster than openslide.
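To confirm which copy of a shared library a Python process actually loaded, one option on Linux is to look at the process's memory map. The sketch below only parses the text of `/proc/<pid>/maps`; the function name is hypothetical, and it assumes the usual six-column layout of that file with the path in the last column.

```python
def mapped_libraries(maps_text, needle):
    """Return the sorted, de-duplicated file paths in maps text that contain `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # the sixth field, when present, is the path of the mapped file
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)
```

After a `read_region` call has triggered a JPEG 2000 decode, passing the contents of `/proc/self/maps` together with `"openjp2"` should list the library file in use, for example one under `imagecodecs.libs/` for a pip wheel or under the environment's `lib/` directory for conda.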
i am not sure how openjpeg is pulled into the imagecodecs wheel during a build, but i presume openjpeg is being built differently than the conda-forge version. though looking at https://github.com/conda-forge/openjpeg-feedstock/blob/main/recipe/build.sh, there don't seem to be any special build options enabled for the conda-forge version.
building openjpeg with -DCMAKE_BUILD_TYPE=Release solves the problem. i will submit a pull request to https://github.com/Czaki/imagecodecs_build to add this option.
the change should be made in these lines: https://github.com/Czaki/imagecodecs_build/blob/c7abf4b7c91746c30a754e5d3367f6347262e049/build_utils/build_libraries.sh#L361-L364
when openjpeg is not compiled in release mode, it looks like -ffast-math is not enabled (see here):
# Do not use ffast-math for all build, it would produce incorrect results, only set for release:
set(OPENJPEG_LIBRARY_COMPILE_OPTIONS ${OPENJPEG_LIBRARY_COMPILE_OPTIONS} "$<$<CONFIG:Release>:-ffast-math>")
set(OPENJP2_COMPILE_OPTIONS ${OPENJP2_COMPILE_OPTIONS} "$<$<CONFIG:Release>:-ffast-math>" -Wall -Wextra -Wconversion -Wunused-parameter -Wdeclaration-after-statement -Werror=declaration-after-statement)
enabling -DCMAKE_BUILD_TYPE=Release in the openjpeg build causes imagecodecs tests to fail... :(
Never mind the failures. That repository is out of sync. I build the libraries locally in Docker these days and then build&test the wheels on Azure/GHA...
New version of tiffslide with a fix is on its way to pypi, and then later today to conda.
Thanks again everyone for the fun debugging session :smiley:
> building openjpeg with -DCMAKE_BUILD_TYPE=Release solves the problem.
Thank you for finding this. Would you mind trying again with imagecodecs 2023.7.10?
> Would you mind trying again with imagecodecs 2023.7.10?
it works! here are the benchmark results on my machine with the most recent tiffslide (8bea5a4c8e1429071ade6d4c40169ce153786d19), tifffile==2023.7.10, and imagecodecs==2023.7.10.
what a triumph!!!
tiffslide==2.2.0 has the fix. (I just added two more commits to update the benchmark stuff)
> what a triumph!!!
Thanks again for reporting and investigating ❤️
hello, thanks for developing this fantastic package! i am working on porting one of my projects from openslide to tiffslide (very easy thanks to mirrored API :smile:). however i found that tiffslide is much slower than openslide when reading patches from an SVS file in The Cancer Genome Atlas (TCGA).
i created a jupyter notebook to benchmark this here https://gist.github.com/kaczmarj/41c351be6f52aa6a553cc12ba98a9103. this notebook runs a simple benchmarking function on a TCGA BRCA slide and a TIFF and SVS file from openslide test data.
using the slide TCGA-3C-AALI-01Z-00-DX1.F6E9A5DF-D8FB-45CF-B4BD-C6B76294C291.svs (from https://portal.gdc.cancer.gov/files/d46167af-6c29-49c7-95cf-3a801181aca4), i got the following results. tiffslide takes >10x longer to read patches than openslide.
i did not see the same behavior when evaluating CMU-1.tiff and CMU-1.svs from openslide test data, so i don't suspect disk caching to be the culprit.