openpreserve / jpylyzer

JP2 (JPEG 2000 Part 1) validator and properties extractor. Jpylyzer was specifically created to check that a JP2 file really conforms to the format's specifications. Additionally jpylyzer is able to extract technical characteristics.
http://jpylyzer.openpreservation.org/

Use mmap instead of reading the image file #73

Closed stweil closed 8 years ago

stweil commented 8 years ago

This patch tries to address issue #32. It is only a proof of concept, not meant to be pulled. Comments welcome.

Regards, Stefan

This is experimental code which will only work on Linux because Windows uses different parameters for mmap.

Instead of reading the whole image file into memory, mmap is used to map the file into memory. Only those parts of the file which are accessed will be read from disk.

In theory, this should allow very large files on 64-bit operating systems. 32-bit operating systems are still limited by the mappable address space, which is less than 4 GiB and typically even less than 2 GiB.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
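To illustrate the idea described in the commit message, here is a minimal sketch of reading a few bytes through a memory map instead of loading the whole file. It is not the code from the patch: the helper name `read_slice` and the throwaway demo file are made up for illustration, and it uses the `access=mmap.ACCESS_READ` keyword rather than the Linux-specific parameters the patch mentions.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Return `length` bytes at `offset` without reading the whole file.

    Hypothetical helper for illustration only; not part of jpylyzer.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS loads pages only when touched.
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            # Slicing copies just the requested bytes out of the mapping.
            return mapped[offset:offset + length]
        finally:
            mapped.close()


# Demo on a throwaway file: a 12-byte JP2 signature box followed by padding.
fd, demo_path = tempfile.mkstemp(suffix=".jp2")
os.write(fd, b"\x00\x00\x00\x0cjP  \r\n\x87\n" + b"\x00" * 4096)
os.close(fd)

box_type = read_slice(demo_path, 4, 4)  # the box type field of the signature box
os.remove(demo_path)
```

Only the pages backing the requested slice need to be read from disk, which is the property that should help with very large images.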

bitsgalore commented 8 years ago

This looks really useful (and something I never got round to investigating myself). Again, I will get back to you once I have a working machine for some proper testing.

Meanwhile I had a quick look at the mmap docs here:

https://docs.python.org/2/library/mmap.html

At first glance, making this work on both Linux and Windows looks pretty doable (though I think there are some subtle differences between Py 2.7.x and Py 3.x - anyway it should be possible to handle that as well).

bitsgalore commented 8 years ago

So I checked out your patch to a separate branch, and then made the call to mmap platform-dependent (or so I hope, as I don't have access to my Windows machine right now). In any case this seems to work under Linux with both Py 2.7 and Py 3.
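A platform-dependent call of this kind might look roughly like the sketch below. This is an assumption about the general shape, not the actual code on the branch; the function name `map_readonly` is hypothetical. The Unix branch passes `flags` and `prot` positionally, while the Windows branch uses the `access` keyword, since the third positional argument means something different there (`tagname`).

```python
import mmap
import sys
import tempfile


def map_readonly(fileobj):
    """Map an open binary file read-only (hypothetical helper, not jpylyzer code)."""
    if sys.platform.startswith("win"):
        # Windows signature: mmap(fileno, length, tagname, access)
        return mmap.mmap(fileobj.fileno(), 0, access=mmap.ACCESS_READ)
    # Unix signature: mmap(fileno, length, flags, prot)
    return mmap.mmap(fileobj.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)


# Small self-check on a temporary file.
with tempfile.TemporaryFile() as tmp:
    tmp.write(b"jpylyzer")
    tmp.flush()
    mapped = map_readonly(tmp)
    first = mapped[0:3]
    mapped.close()
```

The `access=mmap.ACCESS_READ` form is accepted by both the Unix and Windows variants of `mmap.mmap`, so a single call without the branch may be enough; the explicit branch is shown only to mirror the platform-dependent approach described above.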

To do later:

Will do both when I'm back in the office with access to my Windows machine. Once that all works, I'll roll it into a shiny new 1.15 release.