quintusdias / glymur

Python interface to OpenJPEG library for reading and writing JPEG 2000 images.
MIT License

Improve performance of creating `Jp2k` objects #639

Open dstansby opened 7 months ago

dstansby commented 7 months ago

I often need to create more than 1000 `Jp2k` objects, which are then distributed across multiple cores where the image data is actually read in. This is relatively fast on my local hard drive, but roughly 10 times slower on an external drive. Quick profiling of creating 1275 `Jp2k` objects from files stored on an external SSD suggests that performance could be improved by making sure the `Jp2k` constructor opens the file only once; it currently opens the file twice. These are the total times, in seconds, spent on each line of code:

├─ 4.089 Jp2k.__init__  ../../glymur/glymur/jp2k.py:189
│  ├─ 4.030 Jp2k.parse  ../../glymur/glymur/jp2k.py:647
│  │  ├─ 2.098 PosixPath.open  pathlib.py:1036
│  │  │  ├─ 2.097 open  <built-in>
│  │  │  └─ 0.001 [self]  pathlib.py
│  │  ├─ 1.874 Jp2k._validate  ../../glymur/glymur/jp2k.py:697
│  │  │  ├─ 1.866 Jp2k.codestream  ../../glymur/glymur/jp2k.py:581
│  │  │  │  ├─ 1.862 Jp2k.get_codestream  ../../glymur/glymur/jp2k.py:1824
│  │  │  │  │  ├─ 1.694 PosixPath.open  pathlib.py:1036
<rest of trace truncated>
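To illustrate the idea, here is a minimal sketch of the two patterns. It is not glymur's code: `CountingPath`, `read_signature`, `validate`, `construct_two_opens` and `construct_one_open` are hypothetical names made up for this example, and the only thing checked is the 12-byte JP2 signature box. It assumes the parse and validation steps can share one file object.

```python
import tempfile
from pathlib import Path

# The JP2 signature box: the first 12 bytes of a JP2 file.
JP2_SIGNATURE = b"\x00\x00\x00\x0cjP  \r\n\x87\n"


class CountingPath:
    """Hypothetical helper: wraps a path and counts how often it is opened."""

    def __init__(self, path):
        self.path = Path(path)
        self.open_count = 0

    def open(self):
        self.open_count += 1
        return self.path.open("rb")


def read_signature(f):
    """Stand-in for parsing: read the signature from an already open file."""
    f.seek(0)
    return f.read(len(JP2_SIGNATURE))


def validate(f):
    """Stand-in for validation: reuse the same open file object."""
    if read_signature(f) != JP2_SIGNATURE:
        raise ValueError("not a JP2 file")


def construct_two_opens(source):
    """Pattern suggested by the trace: each step opens the file itself."""
    with source.open() as f:
        signature = read_signature(f)
    with source.open() as f:
        validate(f)
    return signature


def construct_one_open(source):
    """Proposed pattern: open once and pass the handle to both steps."""
    with source.open() as f:
        signature = read_signature(f)
        validate(f)
    return signature


def demo():
    """Return the number of opens for (two-open pattern, one-open pattern)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.jp2"
        path.write_bytes(JP2_SIGNATURE + b"\x00" * 16)
        twice, once = CountingPath(path), CountingPath(path)
        construct_two_opens(twice)
        construct_one_open(once)
        return twice.open_count, once.open_count
```

Calling `demo()` returns the open counts for the two patterns. On a drive where each `open` call is expensive, as in the trace above, halving the number of opens per object should roughly halve that part of the construction time, although the real saving would have to be measured.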