Closed: D-Callanan closed this issue 5 years ago.
@e-koch @astrofrog do either of you have any idea how this could be happening? There's a memory allocation error in mmap. I thought the whole point of memmap was to get around that?
@D-Callanan could you do df -h just to verify that there is adequate hard drive space? I wonder also if this could be a problem with where tempfile is putting the temporary file; you can specify the target directory (and make sure it's on a drive with space) using memmap_dir:
https://github.com/radio-astro-tools/spectral-cube/blob/master/spectral_cube/spectral_cube.py#L2639
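Independent of spectral-cube, the directory that tempfile uses can also be steered from the standard library; a minimal sketch (the scratch path below is just a placeholder for a directory on a disk with plenty of free space):

import os
import tempfile

import numpy as np

scratch = '/path/to/big/disk/tmp'  # placeholder: pick a directory with free space

# Point a single temporary file at that directory explicitly...
ntf = tempfile.NamedTemporaryFile(dir=scratch)
mm = np.memmap(ntf.name, mode='w+', shape=(10**6,), dtype=float)

# ...or redirect all temporary files process-wide via TMPDIR.
os.environ['TMPDIR'] = scratch
tempfile.tempdir = None  # force tempfile to re-read the environment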
@D-Callanan it would also be useful to figure out what the limitations are on your machine. Something like this:
import numpy as np
import tempfile

# Memory-map progressively larger arrays (10^3 to 10^13 elements)
# and report how far we get before the allocation fails.
for size in np.logspace(3, 13, 11):
    ntf = tempfile.NamedTemporaryFile()
    mm = np.memmap(ntf.name, mode='w+', shape=(int(size),), dtype=float)
    print("Size 10^{0} succeeded".format(np.log10(size)))
and see when it fails. When I run this test, I get:
Size 10^3.0 succeeded
Size 10^4.0 succeeded
Size 10^5.0 succeeded
Size 10^6.0 succeeded
Size 10^7.0 succeeded
Size 10^8.0 succeeded
Size 10^9.0 succeeded
Size 10^10.0 succeeded
Size 10^11.0 succeeded
Size 10^12.0 succeeded
Traceback (most recent call last):
File "<ipython-input-14-17800c68f2c7>", line 5, in <module>
mm = np.memmap(ntf.name, mode='w+', shape=(int(size),), dtype=float)
File "/users/aginsbur/anaconda/envs/python3.6/lib/python3.6/site-packages/numpy/core/memmap.py", line 250, in __new__
fid.seek(bytes - 1, 0)
OSError: [Errno 22] Invalid argument
@D-Callanan - You could also run ulimit and see what it is set to. A similar problem is mentioned here: https://github.com/xgcm/xgcm/issues/40.
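If the ulimit builtin isn't convenient, the same limits can be read from Python through the standard-library resource module; a small sketch (Unix only):

import resource

# Each getrlimit call returns a (soft, hard) pair;
# resource.RLIM_INFINITY (usually -1) means no limit is set.
limits = [('RLIMIT_AS (virtual memory)', resource.RLIMIT_AS),
          ('RLIMIT_DATA (data segment)', resource.RLIMIT_DATA),
          ('RLIMIT_FSIZE (file size)', resource.RLIMIT_FSIZE)]
for name, which in limits:
    soft, hard = resource.getrlimit(which)
    print("{0}: soft={1}, hard={2}".format(name, soft, hard))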
These issues might be related: https://github.com/astropy/astropy/issues/1380 https://github.com/astropy/astropy/pull/7926
Also, @D-Callanan, could you check whether you're on a 32-bit or 64-bit system?
$ python -c "import sys; print(sys.maxsize)"
9223372036854775807
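A couple of equivalent checks, in case they're easier (all standard library):

import platform
import struct
import sys

print(sys.maxsize > 2**32)       # True for a 64-bit Python build
print(struct.calcsize('P') * 8)  # pointer width in bits: 64 on a 64-bit build
print(platform.machine())        # e.g. 'x86_64' for a 64-bit kernel/CPU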
Thanks for the quick responses!
Running df -h says there are 8.0 TB available, so I don't think that's the issue. I've also manually set the memmap_dir to that directory with no luck.
The output of the memory test @keflavich suggested is
Size 10^3.0 succeeded
Size 10^4.0 succeeded
Size 10^5.0 succeeded
Size 10^6.0 succeeded
Size 10^7.0 succeeded
Size 10^8.0 succeeded
Size 10^9.0 succeeded
Traceback (most recent call last):
File "mem_test.py", line 6, in <module>
mm = np.memmap(ntf.name, mode='w+', shape=(int(size),), dtype=float)
File "/home/dcallana/.local/lib/python2.7/site-packages/numpy/core/memmap.py", line 264, in __new__
mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
mmap.error: [Errno 12] Cannot allocate memory
ulimit is apparently not a command on the machine I'm running my code on, so I'd assume a limit hasn't been set?
And finally, the result of python -c "import sys; print(sys.maxsize)" is 9223372036854775807.
@D-Callanan are you certain it's 8.0 TB on the drive you're trying to allocate the space on? It might be, I just want to be certain.
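For example, something like this (assuming Python 3, where shutil.disk_usage is available) reports the free space of the directory the temporary file actually lands in:

import shutil
import tempfile

# Substitute the memmap_dir path here if one has been set explicitly.
usage = shutil.disk_usage(tempfile.gettempdir())
print("{0:.1f} GB free of {1:.1f} GB".format(usage.free / 1e9, usage.total / 1e9))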
I've double-checked: of the 52 TB filesystem I'm using, 8 TB is free.
Could you print the output of cat /proc/meminfo? That's another test from the issue linked above.
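The relevant fields can also be pulled out programmatically; a small sketch that just reads the file (Linux only) and prints a few entries of interest:

# Print selected memory-related fields from /proc/meminfo.
fields = ('MemTotal', 'MemFree', 'SwapTotal', 'SwapFree',
          'CommitLimit', 'Committed_AS', 'VmallocTotal')

with open('/proc/meminfo') as fh:
    for line in fh:
        if line.split(':')[0] in fields:
            print(line.rstrip())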
AFAICT, we're using the correct memmap mode: np.memmap mode 'w+', which maps to mmap.ACCESS_WRITE, which means that no memory except the actual hard drive space should be allocated. In other words, spectral-cube is doing the right thing.
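For reference, here's a stripped-down sketch of what np.memmap does internally for mode 'w+', based on the two tracebacks above: seek to the requested size, write one byte to extend the file, then hand the descriptor to mmap with ACCESS_WRITE. Running this directly takes numpy and spectral-cube out of the picture:

import mmap
import tempfile

nbytes = 8 * 10**10  # ~80 GB, roughly where the test above started failing

ntf = tempfile.NamedTemporaryFile()
ntf.seek(nbytes - 1, 0)   # extend the (sparse) file to the requested size...
ntf.write(b'\0')          # ...by writing a single byte at the end
ntf.flush()

# This is the call that raised Errno 12 in the traceback.
mm = mmap.mmap(ntf.fileno(), nbytes, access=mmap.ACCESS_WRITE)
mm.close()
ntf.close()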
Why does @D-Callanan's allocation fail somewhere between 8 GB (10^9 floats, which succeeded) and 80 GB (10^10 floats, which failed)? I think this must be some sort of weird 32-bit OS limitation. @D-Callanan, I think it's time we contact the IT dept and ask them for help. There must be something 32-bit on their system that's blocking us, even though Python is 64-bit.
This is the output of cat /proc/meminfo:
MemTotal: 132038992 kB
MemFree: 21350768 kB
Buffers: 49992 kB
Cached: 34165592 kB
SwapCached: 234900 kB
Active: 90256516 kB
Inactive: 17838248 kB
Active(anon): 68363604 kB
Inactive(anon): 5547564 kB
Active(file): 21892912 kB
Inactive(file): 12290684 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 4194300 kB
SwapFree: 0 kB
Dirty: 80 kB
Writeback: 0 kB
AnonPages: 73644288 kB
Mapped: 63312 kB
Shmem: 31980 kB
Slab: 1303124 kB
SReclaimable: 1157060 kB
SUnreclaim: 146064 kB
KernelStack: 17984 kB
PageTables: 225040 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 70213796 kB
Committed_AS: 85599096 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 815116 kB
VmallocChunk: 34290372180 kB
HardwareCorrupted: 0 kB
AnonHugePages: 46608384 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 54272 kB
DirectMap2M: 1972224 kB
DirectMap1G: 132120576 kB
ok... you have plenty of vm.... wtf.
This issue has been partly resolved offline: there was a restriction imposed by the sysadmins limiting user allocated memory to <100 GB, including swap. It's unclear why that was blocking editing of ~8-20 GB files, but removing that restriction has apparently ameliorated the problem.
@D-Callanan can we close this now that the sysadmins found the issue?
@keflavich Yes, I believe so. Thank you for the help with this.
When I attempt to write a FITS file after convolving the cube to a consistent beam size, I run into Errno 12, with the traceback:
I'm fairly certain I am not running into an issue with storage quota, and I've run hpy to determine the amount of memory I'm using before the troublesome line of code, which gives me:
The code that runs into this issue is:
The file is 13GB in size.
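For context, here's an illustrative sketch of the kind of workflow described above (convolving to a common beam and then writing the result); the filenames and beam size are placeholders, and this is not the original script:

import astropy.units as u
from radio_beam import Beam
from spectral_cube import SpectralCube

cube = SpectralCube.read('input_cube.fits')  # placeholder filename

# Allow operations on cubes larger than memory (these go through memmap).
cube.allow_huge_operations = True

# Convolve every channel to one common beam (placeholder beam size).
common_beam = Beam(major=30 * u.arcsec, minor=30 * u.arcsec, pa=0 * u.deg)
convolved = cube.convolve_to(common_beam)

# Writing the convolved cube back out is where Errno 12 appeared.
convolved.write('output_cube.fits', overwrite=True)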