Closed: P4ulW closed this issue 11 months ago.
Thanks for the report. When you say that this only occurs on Windows, does that mean you have specifically tested this on Linux or Macs and confirmed that memory is never allocated when printing the tree? If so, I will have some difficulty reproducing this since I don't have access to a Windows machine, but I could ask colleagues to check.
I have specifically tested this on Linux and I cannot reproduce the behavior there (Fedora, Manjaro). While investigating this issue, I also noticed that passing an array initialized with the `np.zeros` function shows differences in memory allocation between Windows and Linux. Maybe this is an issue caused by `numpy` or `hdf5` working differently between the operating systems?
Any news regarding this issue? We would like to integrate the NeXus standard into our data pipeline, but this issue currently inhibits that severely.
Thanks for the prompt. It's tricky for me to look into this since I don't use a Windows machine. Would it be possible to produce a script to reproduce this, with calls to check the system memory allocation? I can then ask a colleague to try it on their machine.
This small script replicates the issue on my machine:
```python
from nexusformat.nexus import *
import numpy as np
import os
from memory_profiler import profile


def clean_up():
    try:
        os.remove("test.nxs")
    except OSError:
        pass


@profile
def create_empty_nxsfile(nw, nx, ny, nz):
    clean_up()
    shape = (nw, nx, ny, nz)
    signal = NXfield(shape=shape, dtype=np.int32, name="signal")
    signal.save("test.nxs")
    nxf = nxload("test.nxs")
    nxf.tree


if __name__ == "__main__":
    create_empty_nxsfile(10, 100, 1000, 1000)
    create_empty_nxsfile(10, 300, 1000, 1000)
    create_empty_nxsfile(10, 500, 1000, 1000)
```
The issue should occur on the second call of `create_empty_nxsfile`.
A colleague was able to reproduce the issue with the array with shape (10,300,1000,1000) and not the others. The only thing I noticed is that the chunk size, which is selected automatically by HDF5 by default, is a little anomalous. It's (1,7,63,63) when the second dimension is 100, (1,19,63,63) for 300, and (1,16,63,63) for 500. I have no idea why the chunk size is slightly larger for the smaller array, and the difference looks too small to have any effect, but it made me wonder if it was worth setting the chunk size manually just in case this is some arcane anomaly in the HDF5 code.
You can set the chunk size manually, e.g., `signal = NXfield(shape=shape, dtype=np.int32, name="signal", chunks=(10,50,50,50))`. I would be interested to know if this makes any difference.
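As a quick way to confirm which chunk shape HDF5 selected, h5py exposes it on the dataset object without reading any values. A minimal helper sketch (the `chunk_info` name is mine, and the `entry/signal` path assumes the layout produced by the script above):

```python
import h5py


def chunk_info(path, dataset="entry/signal"):
    """Return (shape, chunks) for a dataset without reading its values."""
    with h5py.File(path, "r") as f:
        dset = f[dataset]
        # dset.chunks is None for contiguous datasets, otherwise a tuple
        return dset.shape, dset.chunks
```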
The other thing to check is whether this increased memory occurs if you open the file directly in h5py. Obviously, you don't want to load the whole array, but I presume the memory increase should occur if you just open the dataset for inspection, e.g.,
```python
import h5py

with h5py.File('test.nxs', 'r') as f:
    print(f['entry/signal'].shape)
```
Another thought occurred to me. The nexusformat package will normally not attempt to load an array if its size is larger than 1000 elements. If it's smaller, it is read into a private `NXfield` attribute, `_value`. Just to confirm this, it would be worth checking what `_value` is after loading the data. It should be `None`.
```python
>>> nxf = nxload("test.nxs")
>>> print(nxf['entry/signal']._value)
None
```
If it's not `None`, please let me know.
Thank you for the investigations. Opening the file with h5py directly did not allocate any significant memory. Checking the `_value` property revealed that it is `None` for the shapes `(10, 100, 1000, 1000)` and `(10, 500, 1000, 1000)`, but the fully initialized array for the anomalous case `(10, 300, 1000, 1000)`.
So that seems to identify the problem, although I have no idea why a shape of `(10, 300, 1000, 1000)` should trigger the array to be read into memory, and then only on Windows. This will be tricky for me to debug, but if I think of things that you could test, I will let you know.
A quick thing to check would be to operate on the `NXFile` object, created when you open the file. The following returns a tuple whose first value should be `None` unless it chose to read the data into memory.
```python
>>> from nexusformat.nexus import *
>>> f = NXFile('test.nxs')
>>> f.nxpath = '/entry/signal'
>>> f.readvalues()
(None, (10, 300, 1000, 1000), dtype('int32'), {})
```
The value should be `None` unless `np.prod(shape)` is less than 1000. That is literally the only check triggering a read. If a shape of `(10, 300, 1000, 1000)` triggers the large memory allocation, let me know what `np.prod((10,300,1000,1000))` gives (and check its value against the other shapes).
Tested this for different sizes, all with the same result: the first value is always `None`. `np.prod` also gives the expected values.
I will have to see if I can get access to a debugger on Windows, because I can't see off-hand how a change of shape can have such a major effect. At least, we've made some progress.
I'm afraid that I will be away for 10 days, so I can't make progress on this issue for a little while. I don't know if you had a chance to check whether changing the chunk size made a difference. Another possibility is to create the NXfield with a smaller dimension, but with `maxshape` set to a larger value:

```python
NXfield(shape=(10,100,1000,1000), dtype=np.float32, maxshape=(10,300,1000,1000))
```

You can then resize the array when you have values to add beyond the original allocation (https://nexpy.github.io/nexpy/treeapi.html#nexusformat.nexus.tree.NXfield.resize).
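For reference, the `maxshape`/resize pattern maps directly onto HDF5 resizable datasets, which h5py exposes as well. A small sketch with deliberately shrunken shapes (the helper names `create_resizable` and `grow` are mine, not part of nexusformat):

```python
import h5py
import numpy as np


def create_resizable(path):
    # Allocate a small dataset but reserve room to grow the second axis.
    # Specifying maxshape makes h5py chunk the dataset automatically.
    with h5py.File(path, "w") as f:
        f.create_dataset("entry/signal", shape=(10, 10, 10),
                         maxshape=(10, 30, 10), dtype=np.int32)


def grow(path, new_shape):
    # Resize in place up to maxshape; no full copy of the data is made.
    with h5py.File(path, "a") as f:
        f["entry/signal"].resize(new_shape)
        return f["entry/signal"].shape
```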
Setting the chunk size manually did not alleviate the issue. Also, using `maxshape` and resizing later unfortunately only postpones the issue until the next time the file is opened.
To clarify my last comment: resizing works fine, but since the RAM allocation occurs when `tree` is called, the issue still persists in previously resized arrays upon opening.
Is there any news regarding the issue?
I did some digging myself. What I found is the following: in `_read_data` there is the `if np.prod(field.shape) < 1000` check, which determines whether the data is loaded. I added a simple print statement to see what happens:

```python
field = self.get(self.nxpath)
# Read in the data if it's not too large
print("field length: ", np.prod(field.shape))
if np.prod(field.shape) < 1000:  # i.e., less than 1k dims
    try:
        value = self.readvalue(self.nxpath)
    except Exception:
        value = None
```
Under Windows, for specific shapes, this is the value of the field length:

| shape | output | should be |
|---|---|---|
| (10, 100, 1000, 1000) | $10^9$ | $1\cdot10^9$ |
| (10, 300, 1000, 1000) | $-1294967296$ | $3\cdot10^9$ |
| (10, 500, 1000, 1000) | $705032704$ | $5\cdot10^9$ |
So there is some overflow occurring with `np.prod` on Windows for some reason, which is quite worrisome.
This should solve it: `np.prod(np.array(field.shape).astype(np.int64))`. For some reason, integer NumPy arrays default to 32 bits on Windows ...
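The wraparound can be reproduced on any platform by forcing the accumulator dtype, which sketches what happens when the platform's default NumPy integer is 32-bit:

```python
import numpy as np

shape = (10, 300, 1000, 1000)

# A 32-bit accumulator simulates the Windows default and silently wraps;
# a 64-bit accumulator gives the true element count.
wrapped = int(np.prod(shape, dtype=np.int32))
correct = int(np.prod(shape, dtype=np.int64))

print(wrapped, correct)  # -1294967296 3000000000
```

The wrapped value matches the table above: 3,000,000,000 exceeds the int32 maximum of 2,147,483,647 and wraps around modulo 2^32.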
Sorry I have not been able to work on this issue, since I still don't have access to a Windows machine for debugging. However, I think you have identified the problem correctly. The solution is not, I believe, to cast the shape array as `np.int64`, but to ensure that `np.prod` returns an `np.int64` value, i.e., `np.prod(shape, dtype=np.int64)`. I have submitted and merged a PR to the nexusformat package, which modifies the `_getsize` function to return a 64-bit integer. If you are able to test the latest development version in the main branch, we'll see if it's sufficient to resolve this issue. If it works, I should be able to release a new version quickly.
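The merged fix boils down to computing the element count with a 64-bit accumulator. A minimal sketch of that idea (not the actual nexusformat `_getsize` implementation):

```python
import numpy as np


def getsize(shape):
    # Element count computed with a 64-bit accumulator, so the product
    # cannot wrap on platforms whose default NumPy integer is 32-bit.
    return int(np.prod(shape, dtype=np.int64))
```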
I tested the recent fix #206, and the issue appears to be fixed as far as I can tell. Thank you for the fast implementation.
Great. I'll try to get it released in the next couple of days, once I've double-checked that there are no other side-effects.
When creating an empty, large `NXfield` object and saving the NeXus file, only little memory is allocated. Loading the file with `nxload` does in some cases consume a lot of memory, though, and it is not clear why. On my machine, the following example shows opaque behavior: depending on `shape`, calling `nxf.tree` sometimes uses memory equivalent to the `NXfield` size and sometimes not.

- `shape = (10, 300, 1000, 1000)` causes the `tree` call to allocate memory
- `shape = (10, 500, 1000, 1000)` causes no memory allocation
- `shape = (10, 100, 1000, 1000)` also causes no memory allocation

So far, this weird behavior has only occurred on Windows (10 & 11). nexusformat version: `nexusformat==1.0.2`