mhe / pynrrd

Simple pure-python module for reading and writing nrrd files.
https://pynrrd.readthedocs.io/
MIT License
116 stars 51 forks source link

The nrrd.read() function returns a read-only numpy array if the input file is compressed #99

Closed lguyot closed 5 years ago

lguyot commented 5 years ago

Imagine that you need to load into memory a very large image file which has for instance a gzip encoding.

Using the following snippet

import nrrd
img, options = nrrd.read('very_large_image_file.nrrd') # compressed with gzip encoding
print(img.flags)

you would get the following output

 C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

This means that the numpy array named img is read-only and hence a deep copy is required for any write operation on this array. Hence we need to double the allocated memory every time we want to write on a numpy array loaded from a compressed nrrd file.

If the file is not compressed, e.g., its encoding is raw, the output is:

  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In this case, the returned numpy array is writeable, which is not consistent with the previous behavior.

The issue may come from line 446 in reader.py:

data = np.frombuffer(decompressed_data[byte_skip:], dtype)

If data is generated from a non-writeable buffer, np.frombuffer returns a read-only numpy array.

addisonElliott commented 5 years ago

Hello,

Thanks for the issue. I fixed something similar to this by switching from np.frombuffer to np.fromstring. You can see the PR/commit here: https://github.com/mhe/pynrrd/commit/6b158c62aac90fe7888781094b5222f3a217ec80

It will be a bit before I have some time to work on this myself, as always, PRs are welcome. Follow the same setup as my PR, add some unit tests that demonstrate the problem is fixed.

lguyot commented 5 years ago

Hello,

Many thanks for your prompt reply. You actually fixed the issue with 6b158c6 but you reverted this change when improving the performances of nrrd.read() in 16be776.

If np.fromstring() is not dramatically slower than np.frombuffer(), we would only need to reinstate this single line, namely 446, of reader.py as it was in 6b158c6.

addisonElliott commented 5 years ago

Hm, well the odd thing is that the unit tests should have failed after reverting that change.

lguyot commented 5 years ago

Indeed, they are all passing. I assumed mistakenly that I was using the latest version of pynrrd. But this is not the case: I am using pynrrd==0.2.5 where the issue can be reproduced.

This is not the case with version 0.4.0.

You can safely close this issue. I apologize for the trouble.

lguyot commented 5 years ago

The problem has been definitively fixed since version 0.3.4. My report is threfore not relevent. No action need to be taken.