scikit-hep / uproot5

ROOT I/O in pure Python and NumPy.
https://uproot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Calling uproot.open many times uses up all available threads #103

Closed kratsg closed 4 years ago

kratsg commented 4 years ago

Not sure quite how to explain it, but it seems like the reliance on numpy's mmap is causing an inability to handle opening many files at once.

$ python
>>> import uproot
>>> uproot.version.version
'3.12.0'
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> exit()

versus

>>> import uproot4 as uproot
>>> uproot.version.version
'0.0.23'
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
Traceback (most recent call last):
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/source/file.py", line 110, in __init__
    self._file = numpy.memmap(self._file_path, dtype=self._dtype, mode="r")
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/numpy/core/memmap.py", line 264, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
OSError: [Errno 12] Cannot allocate memory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/reading.py", line 142, in open
    **options  # NOTE: a comma after **options breaks Python 2
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/reading.py", line 537, in __init__
    file_path, **self._options  # NOTE: a comma after **options breaks Python 2
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/source/file.py", line 117, in __init__
    file_path, **opts  # NOTE: a comma after **opts breaks Python 2
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/source/file.py", line 246, in __init__
    [FileResource(file_path) for x in uproot4._util.range(num_workers)]
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/source/futures.py", line 351, in __init__
    worker.start()
  File "/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/threading.py", line 846, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
>>> exit()

kratsg commented 4 years ago

Hrmm @nsmith- and the Coffea team saw this in CoffeaTeam/coffea#115 but I thought this was a uproot3 specific issue, rather than a uproot4 issue.. I guess not.

kratsg commented 4 years ago

There does seem to be a nicer API for this, using file_handler from https://uproot4.readthedocs.io/en/latest/uproot4.reading.open.html#uproot4.reading.open however:

>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.source.file.FileResource)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/reading.py", line 142, in open
    **options  # NOTE: a comma after **options breaks Python 2
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/reading.py", line 537, in __init__
    file_path, **self._options  # NOTE: a comma after **options breaks Python 2
TypeError: __init__() got an unexpected keyword argument 'file_handler'

this crashes because uproot.source.chunk.Source is inherited from object

https://github.com/scikit-hep/uproot4/blob/117037a62a2ea8e6bbc5250326974df57a2f7190/uproot4/reading.py#L536-L538

which doesn't allow keyword arguments

>>> uproot.source.chunk.Source("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.source.file.FileResource)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object() takes no parameters

jpivarski commented 4 years ago

Hrmm @nsmith- and the Coffea team saw this in CoffeaTeam/coffea#115 but I thought this was a uproot3 specific issue, rather than a uproot4 issue.. I guess not.

There is a similarity in that both memory per process and number of threads per process are limited resources that can be controlled by ulimit, so the reproducibility of this issue depends on whether your system's ulimit puts a cap on the number of threads.
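To see which caps apply to a given session, the limits can be read from inside Python. This is only a sketch using the standard-library resource module (Unix only); the helper name describe_limits is made up for illustration, and which limit is actually being hit on a particular cluster is an assumption to verify, not something this thread establishes.

```python
import resource


def describe_limits():
    """Return {limit name: (soft, hard)} for a few per-process limits."""
    limits = {}
    for name in ("RLIMIT_NPROC", "RLIMIT_AS", "RLIMIT_NOFILE"):
        which = getattr(resource, name, None)  # not every platform defines all of these
        if which is None:
            continue
        limits[name] = resource.getrlimit(which)
    return limits


def show(value):
    """Render a limit value, spelling out the 'no limit' marker."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


for name, (soft, hard) in describe_limits().items():
    # On Linux, RLIMIT_NPROC (ulimit -u) counts threads as well as processes,
    # and RLIMIT_AS (ulimit -v) caps virtual memory, which includes memory maps.
    print("{0}: soft={1} hard={2}".format(name, show(soft), show(hard)))
```
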

I'm taking Python at its word when it says

RuntimeError: can't start new thread

that this is a number of threads issue. Since opening files spawns threads, that's entirely plausible.

Uproot 4 Sources handle parallelization internally.

I believe that NFS doesn't support memory-mapping, and this would be a reason why it would fall back on MultithreadedFileSource. A way to check for this for sure would be

>>> a.file.source.fallback

to see if the MemmapSource (a.file.source) has a fallback or not. It's strange that Uproot 3 managed to open NFS files without complaining that it couldn't open the memory-map: I don't remember putting in fallback logic. (In Uproot 3, I think the equivalent of the above is a._context.source.)
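The general shape of "try a memory-map, fall back to ordinary reads" can be sketched with the standard library alone. The class below is a hypothetical stand-in, not Uproot's MemmapSource: the names MappedOrPlainReader and using_fallback are invented for this example, and an empty file is used only because it is an easy way to make mmap fail.

```python
import mmap
import os
import tempfile


class MappedOrPlainReader:
    """Hypothetical reader: try mmap first, fall back to seek/read."""

    def __init__(self, path):
        self._file = open(path, "rb")
        try:
            self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        except (OSError, ValueError):
            # OSError: the OS refused the mapping (for example ENOMEM);
            # ValueError: Python refuses to map an empty file.
            self._map = None

    @property
    def using_fallback(self):
        return self._map is None

    def read(self, start, stop):
        if self._map is not None:
            return self._map[start:stop]
        self._file.seek(start)
        return self._file.read(stop - start)

    def close(self):
        if self._map is not None:
            self._map.close()
        self._file.close()


# Small demonstration on temporary files.
with tempfile.NamedTemporaryFile(delete=False) as full:
    full.write(b"root file bytes")
with tempfile.NamedTemporaryFile(delete=False) as empty:
    pass

reader = MappedOrPlainReader(full.name)
mapped_bytes = reader.read(0, 4)
mapped_fallback = reader.using_fallback
reader.close()

empty_reader = MappedOrPlainReader(empty.name)
empty_fallback = empty_reader.using_fallback
empty_bytes = empty_reader.read(0, 4)
empty_reader.close()

os.remove(full.name)
os.remove(empty.name)
```
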

Anyway, the solution to the problem is to properly close the files. For these Uproot 4 Sources, "closing" means closing all file handles and shutting down threads: the file handles and threads are glued to each other in a ResourceThreadPoolExecutor. The context-management semantics work at every level. If you do

>>> with uproot4.open("/path/to/file.root:/path/to/tree") as tree:
...     do_something_with(tree)
...
>>> # the file and its threads are now gone

then the file and its threads are released as soon as the block exits. This matters because file handles and threads are both limited resources whose lifetimes need to be fenced in user code rather than left to the garbage collector, which only triggers when an unrelated and generally more plentiful resource (memory) runs out.
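As an illustration of what "file handles glued to threads" means for lifetimes, here is a minimal standard-library sketch. FileWithWorkers is a hypothetical class written for this example and is not Uproot's ResourceThreadPoolExecutor; it only shows that leaving the with block joins the worker threads and closes the handles deterministically.

```python
import os
import queue
import tempfile
import threading


class FileWithWorkers:
    """Hypothetical source: one file handle per worker thread."""

    def __init__(self, path, num_workers=1):
        self._tasks = queue.Queue()
        self._handles = [open(path, "rb") for _ in range(num_workers)]
        self._workers = [
            threading.Thread(target=self._run, args=(handle,), daemon=True)
            for handle in self._handles
        ]
        for worker in self._workers:
            worker.start()

    def _run(self, handle):
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel: shut this worker down
                break
            start, stop, result = task
            handle.seek(start)
            result.put(handle.read(stop - start))

    def read(self, start, stop):
        result = queue.Queue(maxsize=1)
        self._tasks.put((start, stop, result))
        return result.get()

    def close(self):
        for _ in self._workers:
            self._tasks.put(None)  # one sentinel per worker
        for worker in self._workers:
            worker.join()
        for handle in self._handles:
            handle.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")

before = threading.active_count()
with FileWithWorkers(tmp.name, num_workers=2) as source:
    during = threading.active_count()
    first_bytes = source.read(0, 5)
after = threading.active_count()  # workers are gone once the block exits
os.remove(tmp.name)
```

Without the with block (or an explicit close()), the two worker threads in this sketch would stay alive for as long as the object does, which is the situation the repeated uproot.open calls at the top of this issue create.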

As for passing in an explicit file_handler (or http_handler, xrootd_handler, etc.), that seems to work:

>>> f = uproot4.open("../uproot/tests/samples/simple.root", file_handler=uproot4.MemmapSource)
>>> f = uproot4.open("../uproot/tests/samples/simple.root", file_handler=uproot4.MultithreadedFileSource)

You were passing in a Resource class instead of a Source class. Once again, the error message that Python generated (because the Resource constructor doesn't take these arguments) doesn't say what the real problem is, which is that the user-supplied argument is of the wrong type. I'll have to add a type-guard to that. (I don't think I can go the route of MyPy-typing everything because so much of Uproot is dynamically generated, of necessity because we don't know what classes we'll find in each ROOT file.)
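A type-guard of the kind mentioned here could look roughly like the following. Everything in this sketch is hypothetical: the placeholder classes only mimic the Source/Resource split described above, and check_handler is an invented helper, not a function in Uproot.

```python
class Resource:
    """Placeholder for a low-level resource class (hypothetical)."""


class Source:
    """Placeholder for a high-level source class (hypothetical)."""


class FileResourceLike(Resource):
    pass


class MemmapSourceLike(Source):
    pass


def check_handler(option_name, handler, expected_base=Source):
    """Raise a readable TypeError if handler is not a subclass of expected_base."""
    if not (isinstance(handler, type) and issubclass(handler, expected_base)):
        raise TypeError(
            "{0} must be a subclass of {1}, not {2!r}".format(
                option_name, expected_base.__name__, handler
            )
        )
    return handler


accepted = check_handler("file_handler", MemmapSourceLike)

try:
    check_handler("file_handler", FileResourceLike)
except TypeError as err:
    message = str(err)  # names the option and the expected base class
```

The point of the guard is that the error names the offending option and the expected type, instead of surfacing later as an unrelated constructor failure.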

jpivarski commented 4 years ago

Oh, and this Source:

https://github.com/scikit-hep/uproot4/blob/117037a62a2ea8e6bbc5250326974df57a2f7190/uproot4/reading.py#L533-L538

is not the Source class:

https://github.com/scikit-hep/uproot4/blob/117037a62a2ea8e6bbc5250326974df57a2f7190/uproot4/source/chunk.py#L39-L48

It was failing because FileResource doesn't have those arguments.

https://github.com/scikit-hep/uproot4/blob/117037a62a2ea8e6bbc5250326974df57a2f7190/uproot4/source/file.py#L26-L39


The give-away, if you knew my convention, is that I never use unqualified names in a codebase. I would never write

from uproot4.source.chunk import Source

at the top of a file like uproot4/reading.py, even if it used Source all over the place. I'd always write uproot4.source.chunk.Source, even though it makes the lines of code wide (and imposing a line width puts this under strain).

That's not a rule I've seen written down anywhere, but I gradually adopted it over the years because it's been incredibly useful to be able to trace any object back to its definition through its name. It's something I especially wish the Numba codebase did, as I've had to figure that out to write Numba extensions (with features beyond the documented examples).

I should write that down as a rule in the CONTRIBUTING.md. I guess it's already a rule in Awkward's CONTRIBUTING.md.

kratsg commented 4 years ago

I believe that NFS doesn't support memory-mapping, and this would be a reason why it would fall back on MultithreadedFileSource. A way to check for this for sure would be

>>> a.file.source.fallback

to see if the MemmapSource (a.file.source) has a fallback or not. It's strange that Uproot 3 managed to open NFS files without complaining that it couldn't open the memory-map: I don't remember putting in fallback logic. (In Uproot 3, I think the equivalent of the above is a._context.source.)

I'm not seeing a fallback here.

>>> import uproot4 as uproot
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root")
>>> a.file.source.fallback
>>> a.file.source
<MemmapSource '...mc16a.root' at 0x7f8265715780>

Trying the fixed way of specifying the file_handler seems to work:

>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource)
>>> a.file.source
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f825a1cdd68>

Is there a way to... open files without requiring threads explicitly? Or is this a change in uproot4 for the better(?)?

jpivarski commented 4 years ago

The MemmapSource (default) does not require threads. I'm a little confused as to why switching from MemmapSource, with no background threads, to MultithreadedFileSource, with 1 background thread (because num_workers is 1 by default), removes the limit on the number of threads. The default num_fallback_workers is 10, so if MemmapSource is falling back, then the 10 vs 1 would explain it...

Uproot 3 had a multithreaded physical layer as well. Uproot 4 gives you the new option of passing a file-like object, which uses ObjectSource. That's one background thread, though. I think the MemmapSource is the only one that spawns zero threads.

kratsg commented 4 years ago

7 threads and then crash.

Python 3.6.5 (default, Jun 15 2019, 23:43:55) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import uproot4 as uproot
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root"); print(a.file.source)
<MemmapSource '...mc16a.root' at 0x7f3670f1a780>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root"); print(a.file.source)
<MemmapSource '...mc16a.root' with fallback at 0x7f3665723a58>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root"); print(a.file.source)
<MemmapSource '...mc16a.root' with fallback at 0x7f366572beb8>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root"); print(a.file.source)
<MemmapSource '...mc16a.root' with fallback at 0x7f3665746390>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root"); print(a.file.source)
<MemmapSource '...mc16a.root' with fallback at 0x7f3665751828>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root"); print(a.file.source)
<MemmapSource '...mc16a.root' with fallback at 0x7f366575bcc0>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root"); print(a.file.source)
<MemmapSource '...mc16a.root' with fallback at 0x7f3664ce9198>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root"); print(a.file.source)
Traceback (most recent call last):
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/source/file.py", line 110, in __init__
    self._file = numpy.memmap(self._file_path, dtype=self._dtype, mode="r")
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/numpy/core/memmap.py", line 264, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
OSError: [Errno 12] Cannot allocate memory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/reading.py", line 142, in open
    **options  # NOTE: a comma after **options breaks Python 2
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/reading.py", line 537, in __init__
    file_path, **self._options  # NOTE: a comma after **options breaks Python 2
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/source/file.py", line 117, in __init__
    file_path, **opts  # NOTE: a comma after **opts breaks Python 2
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/source/file.py", line 246, in __init__
    [FileResource(file_path) for x in uproot4._util.range(num_workers)]
  File "/gpfs/slac/atlas/fs1/u/gstark/collinearw/py3/lib/python3.6/site-packages/uproot4/source/futures.py", line 351, in __init__
    worker.start()
  File "/cvmfs/sft.cern.ch/lcg/releases/Python/3.6.5-f74f0/x86_64-centos7-gcc8-opt/lib/python3.6/threading.py", line 846, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
>>> exit()

versus MultiThreaded (no crash).

Python 3.6.5 (default, Jun 15 2019, 23:43:55) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import uproot4 as uproot
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c0dfbe780>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027c7c18>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027c9898>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027d9518>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027e3198>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027e3dd8>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027eda58>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027ed390>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027e37b8>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027d9320>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027f1438>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027f1f98>
>>> a = uproot.open("/nfs/slac/atlas/fs1/d/yuzhan/collinearw_files/June2020_Production/merged_files/Wj_AB212108_v2_mc16a.root", file_handler=uproot.MultithreadedFileSource); print(a.file.source)
<MultithreadedFileSource '...mc16a.root' (1 workers) at 0x7f2c027fbcf8>
>>> exit()

The conclusion, to me, seems to be that defaulting to MultithreadedFileSource is better for us on the SLAC computers we're using.

jpivarski commented 4 years ago

Aha! I missed earlier that the "can't start new thread" was a chained exception from "cannot allocate memory". In that case, it might not have anything to do with having a limited number of threads but the way memory-maps use memory. That can be OS dependent.
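For reference, the "During handling of the above exception, another exception occurred" line in the tracebacks earlier in this thread is Python's implicit exception chaining, and the original error can be recovered programmatically. The sketch below reproduces the same shape with a made-up function (open_with_fallback is hypothetical) and walks the chain back to the first exception.

```python
import errno


def exception_chain(exc):
    """Return the chain of exceptions, outermost first, root cause last."""
    chain = []
    while exc is not None:
        chain.append(exc)
        # __cause__ is set by "raise ... from ..."; __context__ is set
        # implicitly when an exception is raised while handling another one.
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return chain


def open_with_fallback():
    """Hypothetical stand-in that fails the same way as the traceback."""
    try:
        raise OSError(errno.ENOMEM, "Cannot allocate memory")
    except OSError:
        # The fallback path fails too, so this error hides the first one.
        raise RuntimeError("can't start new thread")


try:
    open_with_fallback()
except RuntimeError as err:
    chain = exception_chain(err)

root = chain[-1]  # the exception that started it all
```
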

Anyway, that's precisely why we have alternatives: the MultithreadedFileSource is for cases where a MemmapSource can't be used. (It was a late addition to old Uproot, in response to cases where memory-maps didn't work for some reason. I thought NFS was one of those reasons, but maybe that depends on NFS version.)

So if it works, use the MultithreadedFileSource. You might want to look at num_workers as an option to the uproot4.open function (and uproot4.iterate, etc.) because the issue really was about memory and maybe you can afford more threads. This directly affects parallelism on the physical layer (https://uproot4.readthedocs.io/en/latest/basic.html#parallel-processing), allowing multiple TBaskets to be in flight from disk to RAM while decompressing/interpreting the TBaskets that have already been loaded.
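The idea of having several byte ranges "in flight" at once can be sketched with a plain thread pool. This is not how Uproot implements it; read_chunk and read_chunks are invented names, and the num_workers argument here merely plays the same role as the option described above.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_chunk(path, start, stop):
    """Read bytes [start, stop) using a private file handle."""
    with open(path, "rb") as handle:  # own handle, so seeks cannot interfere
        handle.seek(start)
        return handle.read(stop - start)


def read_chunks(path, ranges, num_workers=1):
    """Read several byte ranges concurrently, returning them in request order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(read_chunk, path, start, stop) for start, stop in ranges]
        return [future.result() for future in futures]


data = bytes(range(256)) * 4
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)

ranges = [(0, 10), (100, 110), (1000, 1024)]
chunks = read_chunks(tmp.name, ranges, num_workers=3)
expected = [data[start:stop] for start, stop in ranges]
os.remove(tmp.name)
```

Because the pool is used as a context manager, its threads are shut down when read_chunks returns, so repeated calls do not accumulate threads.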

(That's why I liked memory-maps in the first place: it's a rapid and stateless way to access bytes on disk, hiding disk latencies when parallelized with decompression/interpretation.)

kratsg commented 4 years ago

So if it works, use the MultithreadedFileSource. You might want to look at num_workers as an option to the uproot4.open function (and uproot4.iterate, etc.) because the issue really was about memory and maybe you can afford more threads. This directly affects parallelism on the physical layer (https://uproot4.readthedocs.io/en/latest/basic.html#parallel-processing), allowing multiple TBaskets to be in flight from disk to RAM while decompressing/interpreting the TBaskets that have already been loaded.

I think this clarifies it, but it should probably also be documented. For what it's worth, SLAC uses GPFS rather than regular old NFS (and there are lots of peculiarities with that anyway).

jpivarski commented 4 years ago

Well, there's this in the Getting Started Guide: https://uproot4.readthedocs.io/en/latest/basic.html#parallel-processing

And the open function talks about the parameters: https://uproot4.readthedocs.io/en/latest/uproot4.reading.open.html

I'm hoping that all of these things are covered now.

nsmith- commented 4 years ago

I think the original exception, "Cannot allocate memory", indicates that this is the same issue I ran into, as Giordon already identified. Each mmap counts against a ulimit, and if too many are instantiated (and possibly held after use due to reference cycles), that may be the problem. You can watch your process's vsize in htop, for example, and see if it grows quickly.
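The same number can also be read from inside the process. The helper below is a sketch that assumes a Linux-style /proc filesystem; virtual_size_kib is an invented name, and it returns None on systems where the status file is not available.

```python
def virtual_size_kib(status_path="/proc/self/status"):
    """Return this process's virtual memory size in KiB, or None if unknown."""
    try:
        with open(status_path) as status:
            for line in status:
                if line.startswith("VmSize:"):
                    # The line looks like "VmSize:   123456 kB".
                    return int(line.split()[1])
    except OSError:
        pass  # no /proc on this platform, or the file is unreadable
    return None


size = virtual_size_kib()
print("virtual size (KiB):", size)
```

Calling this before and after each open in a loop would show whether the virtual size climbs with every file that is opened and not closed.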