LDMX-Software / ldmx-sw

The Light Dark Matter eXperiment simulation and reconstruction framework.
GNU General Public License v3.0

Compiling stand-alone C++ Processors from Python Config #1308

Closed tomeichlersmith closed 5 days ago

tomeichlersmith commented 2 weeks ago

While writing https://ldmx-software.github.io/analysis/ldmx-sw.html I realized that it is relatively easy to write a stand-alone, single-file analyzer (or producer for that matter) due to our library loading mechanism. I also realized that, since the library loading happens after the python config has been fully run, we can do the compilation of the stand-alone single-file processor within the python config.

For this to work, I currently have to write my own class that mimics the LDMX.Framework.ldmxcfg.EventProcessor class but omits the addModule call in the constructor. (This is done in the linked documentation.)
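
A minimal sketch of what such a stand-in could look like, for illustration only. The class name `StandaloneProcessor` and the `library` attribute are hypothetical, and the remaining attribute names are assumed to match the ones set by the proposed `EventProcessor` in this issue. How the library path is then handed to the process is not shown here.

```python
class StandaloneProcessor:
    """Hypothetical stand-in for ldmxcfg.EventProcessor.

    It keeps the same bookkeeping attributes but does not call addModule,
    so it can point at a library that lives outside the installed modules.
    """

    def __init__(self, instance_name, class_name, library_path):
        self.instanceName = instance_name
        self.className = class_name
        self.histograms = []
        # Assumption: remember the full path to the compiled library
        # instead of registering a module by name.
        self.library = library_path


# Placeholder path, used only to show the constructor arguments.
ana = StandaloneProcessor('ana', 'MyAnalyzer', '/path/to/libMyAnalyzer.so')
print(ana.instanceName, ana.className, ana.library)
```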


If we updated EventProcessor to be something like:

from pathlib import Path
import subprocess

class EventProcessor:
    def __init__(self, instanceName, className, library):
        self.instanceName = instanceName
        self.className = className
        self.histograms = []

        if library.endswith('.so'):
            # assume user is passing a filepath to load
            Process.addLibrary(library)
        else:
            # assume user has just passed the name of the module the class is compiled into
            Process.addModule(library)

    @classmethod
    def from_file(cls, fp, class_name = None, instance_name = None, **config_kwargs):
        if not isinstance(fp, Path):
            fp = Path(fp)
        if not fp.is_file():
            raise ValueError(f'{fp} is not accessible.')

        src = fp.resolve()
        if class_name is None:
            # assume class name is name of file if not provided
            class_name = src.stem
        if instance_name is None:
            # use class name for instance name if not provided
            instance_name = class_name

        # the library is written next to the source file
        lib = src.parent / f'lib{src.stem}.so'
        if not lib.is_file() or src.stat().st_mtime > lib.stat().st_mtime:
            print('Library is missing or older than the source, recompiling...') # could make this optional
            subprocess.run([
              'g++', '-fPIC', '-shared', '-o', str(lib.resolve()),
              '-lFramework', '-I/usr/local/include/root', str(src.resolve())
            ], check=True)

        # use cls so that subclasses (e.g. Analyzer.from_file) build their own type
        instance = cls(instance_name, class_name, str(lib.resolve()))
        for cfg_name, cfg_val in config_kwargs.items():
            setattr(instance, cfg_name, cfg_val)
        return instance
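
The rebuild condition in `from_file` can be tried on its own, without ldmx-sw or a compiler. The sketch below is only an illustration: the helper names `needs_rebuild` and `library_path_for` are hypothetical, and the timestamps are set by hand with `os.utime` to stand in for real edits and compilations.

```python
import os
import tempfile
from pathlib import Path


def library_path_for(src: Path) -> Path:
    """Put lib<stem>.so next to the source file, as in the proposal."""
    return src.parent / f'lib{src.stem}.so'


def needs_rebuild(src: Path, lib: Path) -> bool:
    """True when the library is missing or older than the source."""
    return not lib.is_file() or src.stat().st_mtime > lib.stat().st_mtime


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'MyAnalyzer.cxx'
    src.write_text('// placeholder source\n')
    lib = library_path_for(src)
    lib_name = lib.name

    # No library has been built yet.
    missing = needs_rebuild(src, lib)

    # Pretend a compilation happened after the source was last edited.
    lib.write_bytes(b'')
    os.utime(src, (1_000, 1_000))
    os.utime(lib, (2_000, 2_000))
    fresh = needs_rebuild(src, lib)

    # Pretend the source was edited after the last compilation.
    os.utime(src, (3_000, 3_000))
    stale = needs_rebuild(src, lib)

print(lib_name, missing, fresh, stale)
```

Comparing modification times keeps repeated runs of the same config cheap, at the cost of missing changes in headers the source includes.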

Then we could have configs loading single-file processors. For example,

// in MyAnalyzer.cxx
#include "Framework/EventProcessor.h"

#include "Ecal/Event/EcalHit.h"

class MyAnalyzer : public framework::Analyzer {
 public:
  MyAnalyzer(const std::string& name, framework::Process& p)
    : framework::Analyzer(name, p) {}
  ~MyAnalyzer() = default;
  void onProcessStart() final;
  void analyze(const framework::Event& event) final;
};

void MyAnalyzer::onProcessStart() {
  // this is where we will define the histograms we want to fill
}

void MyAnalyzer::analyze(const framework::Event& event) {
  // this is where we will fill the histograms
}

// register the class so the library loading mechanism can find it
DECLARE_ANALYZER(MyAnalyzer);

could be run with

# config.py
from LDMX.Framework import ldmxcfg
p = ldmxcfg.Process('ana')
p.sequence = [ ldmxcfg.Analyzer.from_file('MyAnalyzer.cxx') ]
p.inputFiles = [ 'events.root' ]
p.histogramFile = 'hist.root'

tvami commented 2 weeks ago

I think this would be awesome!

EinarElen commented 2 weeks ago

Looks great!