In ROOT, gDirectory is a global variable that points to the current directory, which is usually the most recently opened file. Whenever a method such as Write() is called, e.g. when writing a histogram to a file, gDirectory implicitly determines the destination, so no file path needs to be specified when saving/writing.
Within a distributed event loop, the input files of the RDataFrame are opened to retrieve information about their clusters. Every new call to TFile or similar changes gDirectory to point to the last file opened. This happens both in C++ and in Python.
The TDirectory::TContext class stores the current gDirectory when instantiated and restores it when destroyed. No matter how many times gDirectory has changed in between, it is restored to its initial value when the TContext is destroyed.
In the Dist class, gDirectory is changed within the scope of the get_clusters() method. If Write() or similar is called after the reduce phase of the process, the ROOT object is not saved to the initial file, because gDirectory was changed during the event loop.
In C++ this could be avoided by explicitly creating a TContext at the beginning of the program and destroying it at the end. In Python, a simple del of the TContext does not guarantee a call to the C++ destructor. A context manager gives better control over the creation and destruction of the TContext. Fortunately, PyROOT offers a pythonization that allows calling the C++ destructor of any ROOT object through the __destruct__() dunder method.
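The point about del can be illustrated without ROOT. In the sketch below, Guard is a hypothetical pure-Python class (not a ROOT class) whose __del__ stands in for the C++ destructor. It only shows that del removes a name rather than destroying the object. It does not reproduce PyROOT's actual ownership rules.

```python
import gc

events = []


class Guard:
    """Hypothetical stand-in for an object that restores state when destroyed."""

    def __del__(self):
        events.append("restored")


guard = Guard()
alias = guard  # a second reference, e.g. one held by a container or a frame

del guard  # removes one name only; the object is still referenced by alias
gc.collect()
events_after_first_del = list(events)

del alias  # the last reference is gone, so the finalizer can run
gc.collect()
events_after_second_del = list(events)
```

As long as any other reference survives, the restore step is postponed, which is why an explicit and deterministic call is preferable here.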
To summarise:
A new factory function named managed_tcontext has been added in the Proxy module. This function is decorated with contextlib.contextmanager so that it can be used in a with statement without the need for a full class with __enter__ and __exit__ methods. It creates a TContext at the beginning of the with statement and destroys it when exiting the context.
A new test has been added. It creates a file with some data and an empty one. The data is loaded into an RDataFrame and a histogram is created from it. The histogram is then written to the empty file and read back from it to check that it was saved correctly.
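The pattern described above can be sketched without ROOT installed. Everything below is an assumption made for illustration: current_directory is a plain string standing in for gDirectory, FakeTContext is a hypothetical stand-in that mimics the save/restore behaviour of TDirectory::TContext, and this get_clusters only imitates the side effect of opening files. Only the shape of managed_tcontext (a contextlib.contextmanager generator that calls __destruct__() on exit) follows the description. The actual function in the Proxy module may differ.

```python
from contextlib import contextmanager

# Stand-in for ROOT's global gDirectory: a plain string, not a TDirectory.
current_directory = "output.root"


class FakeTContext:
    """Hypothetical stand-in mimicking TDirectory::TContext (not the ROOT class)."""

    def __init__(self):
        # Remember the directory that is current at construction time.
        self._saved = current_directory

    def __destruct__(self):
        # Mimics the C++ destructor: restore the saved directory.
        global current_directory
        current_directory = self._saved


@contextmanager
def managed_tcontext():
    """Create a context object on entry and destroy it explicitly on exit."""
    ctxt = FakeTContext()
    try:
        yield ctxt
    finally:
        # Runs even if the body of the with statement raises.
        ctxt.__destruct__()


def get_clusters(filenames):
    """Stand-in for code that opens input files and so changes the directory."""
    global current_directory
    for name in filenames:
        current_directory = name


with managed_tcontext():
    get_clusters(["input_1.root", "input_2.root"])
    directory_inside = current_directory  # now the last file "opened"

directory_after = current_directory  # restored to the initial value
```

With ROOT available, the stand-in would presumably be replaced by a real TContext instance. The try/finally is what makes the restore step run on both normal and exceptional exit.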