LDMX-Software / ldmx-sw

The Light Dark Matter eXperiment simulation and reconstruction framework.
https://ldmx-software.github.io
GNU General Public License v3.0

Compiling stand-alone C++ Processors from Python Config #1308

Closed tomeichlersmith closed 5 days ago

tomeichlersmith commented 2 weeks ago

While writing https://ldmx-software.github.io/analysis/ldmx-sw.html I realized that it is relatively easy to write a stand-alone, single-file analyzer (or producer for that matter) due to our library loading mechanism. I also realized that, since the library loading happens after the python config has been fully run, we can do the compilation of the stand-alone single-file processor within the python config.

For this to work, I currently have to write my own class that mimics the LDMX.Framework.ldmxcfg.EventProcessor class but omits the addModule call in the constructor. (This is done in the linked documentation.)

https://github.com/LDMX-Software/ldmx-sw/blob/11d6a8e89250b357885602075c88b423dc04cdeb/Framework/python/ldmxcfg.py.in#L34-L39
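For readers without the linked documentation at hand, such a stand-in class might look roughly like the sketch below. The name StandaloneProcessor is hypothetical, and the attribute names are assumed from the proposal in this issue rather than copied from ldmx-sw.

```python
class StandaloneProcessor:
    """Hypothetical stand-in for ldmxcfg.EventProcessor.

    It stores the instance name, class name, and histogram list, but it
    never calls Process.addModule, so no module library is requested.
    """

    def __init__(self, instance_name, class_name):
        self.instanceName = instance_name
        self.className = class_name
        self.histograms = []


ana = StandaloneProcessor('my_ana', 'MyAnalyzer')
print(ana.instanceName, ana.className, ana.histograms)
```

The library holding the compiled class would then have to be registered separately, which is the step the proposal folds into the constructor.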

If we updated EventProcessor to be something like


from pathlib import Path
import subprocess

class EventProcessor:
    def __init__(self, instanceName, className, library):
        self.instanceName = instanceName
        self.className = className
        self.histograms = []

        if library.endswith('.so'):
            # assume user is passing a filepath to load
            Process.addLibrary(library)
        else:
            # assume user has just passed the name of the module the class is compiled into
            Process.addModule(library)

    @classmethod
    def from_file(cls, fp, class_name = None, instance_name = None, **config_kwargs):
        if not isinstance(fp, Path):
            fp = Path(fp)
        if not fp.is_file():
            raise ValueError(f'{fp} is not accessible.')

        src = fp.resolve()
        if class_name is None:
            # assume class name is name of file if not provided
            class_name = fp.stem
        if instance_name is None:
            # use class name for instance name if not provided
            instance_name = class_name

        lib = src.parent / f'lib{src.stem}.so'
        if not lib.is_file() or src.stat().st_mtime > lib.stat().st_mtime:
            # library is missing or older than the source
            print(f'Compiling {src.name} into {lib.name}...') # could make this optional
            subprocess.run([
              'g++', '-fPIC', '-shared', '-o', str(lib.resolve()),
              '-I/usr/local/include/root', str(src.resolve()),
              '-lFramework' # libraries go after the sources that use them
            ], check=True)

        # use cls so that Analyzer.from_file returns an Analyzer
        # (assumes subclasses keep this constructor signature)
        instance = cls(instance_name, class_name, str(lib.resolve()))
        for cfg_name, cfg_val in config_kwargs.items():
            setattr(instance, cfg_name, cfg_val)
        return instance
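The rebuild decision in from_file depends only on file names and modification times, so it can be exercised without a compiler. A minimal sketch, assuming hypothetical helper names library_for and needs_rebuild (neither exists in ldmx-sw) and setting timestamps by hand with os.utime:

```python
import os
import tempfile
from pathlib import Path


def library_for(src: Path) -> Path:
    """Return the library path next to a source file (lib<stem>.so)."""
    return src.parent / f'lib{src.stem}.so'


def needs_rebuild(src: Path, lib: Path) -> bool:
    """True when the library is missing or older than the source."""
    return not lib.is_file() or src.stat().st_mtime > lib.stat().st_mtime


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'MyAnalyzer.cxx'
    src.write_text('// placeholder source\n')
    lib = library_for(src)

    missing = needs_rebuild(src, lib)   # no library on disk yet

    lib.write_bytes(b'')                # pretend a build produced the library
    os.utime(src, (1000, 1000))         # source last modified at t=1000
    os.utime(lib, (2000, 2000))         # library built later, at t=2000
    fresh = needs_rebuild(src, lib)

    os.utime(src, (3000, 3000))         # source edited after the build
    stale = needs_rebuild(src, lib)

print(lib.name, missing, fresh, stale)  # libMyAnalyzer.so True False True
```

Comparing modification times is the same heuristic make uses. It is cheap, but it will not notice changes in headers that the source includes.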

Then we could have configs loading single-file processors. For example

// in MyAnalyzer.cxx
#include "Framework/EventProcessor.h"

#include "Ecal/Event/EcalHit.h"

class MyAnalyzer : public framework::Analyzer {
 public:
  MyAnalyzer(const std::string& name, framework::Process& p)
    : framework::Analyzer(name, p) {}
  ~MyAnalyzer() = default;
  void onProcessStart() final;
  void analyze(const framework::Event& event) final;
};

void MyAnalyzer::onProcessStart() {
  // this is where we will define the histograms we want to fill
}

void MyAnalyzer::analyze(const framework::Event& event) {
  // this is where we will fill the histograms
}

DECLARE_ANALYZER(MyAnalyzer);

could be run with

# config.py
from LDMX.Framework import ldmxcfg
p = ldmxcfg.Process('ana') # pass name is arbitrary
p.sequence = [ ldmxcfg.Analyzer.from_file('MyAnalyzer.cxx') ]
p.inputFiles = [ 'events.root' ]
p.histogramFile = 'hist.root'
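The config above passes no extra parameters, but from_file also accepts keyword arguments and copies each one onto the returned processor as an attribute. A small sketch of that forwarding, using a hypothetical FakeProcessor and configure helper so it runs without ldmx-sw; the parameter names min_energy and hit_collection are made up for illustration:

```python
class FakeProcessor:
    """Hypothetical stand-in so the sketch runs without ldmx-sw installed."""

    def __init__(self, instance_name, class_name):
        self.instanceName = instance_name
        self.className = class_name


def configure(instance, **config_kwargs):
    """Copy each keyword argument onto the instance as an attribute."""
    for cfg_name, cfg_val in config_kwargs.items():
        setattr(instance, cfg_name, cfg_val)
    return instance


# min_energy and hit_collection are illustrative names, not real parameters
ana = configure(FakeProcessor('MyAnalyzer', 'MyAnalyzer'),
                min_energy=10.0, hit_collection='EcalRecHits')
print(ana.min_energy, ana.hit_collection)
```

With the proposed API, the equivalent call would presumably be ldmxcfg.Analyzer.from_file('MyAnalyzer.cxx', min_energy=10.0, hit_collection='EcalRecHits').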

tvami commented 2 weeks ago

I think this would be awesome!

EinarElen commented 2 weeks ago

Looks great!