scikit-hep / pyhepmc

Easy-to-use Python bindings for HepMC3
https://scikit-hep.org/pyhepmc/
BSD 3-Clause "New" or "Revised" License

Examples for writing data #71

Open jacanchaplais opened 9 months ago

jacanchaplais commented 9 months ago

Hi there! I see that there is good coverage in the docs for reading data from HepMC files, but I can't see much information on how to write data to one. I also notice the docs say the library is "numpy-friendly", but I can't really see much in the way of numpy integration.

I have my simulation data in numpy arrays, and would love a way of writing it out to file objects in the HepMC format. I tried using the GenEvent.from_hepevt() method, since it seemed to take in the numpy data and spit out a GenEvent instance, but I kept getting confusing errors:

TypeError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 hepmc.GenEvent.from_hepevt(
      2     event_number=0,
      3     px=graph.pmu.x,
      4     py=graph.pmu.y,
      5     pz=graph.pmu.z,
      6     en=graph.pmu.energy,
      7     m=graph.pmu.mass,
      8     pid=graph.pdg.data.astype(np.int32),
      9     status=graph.status.data.astype(np.int32),
     10     parents=event.mothers,
     11     children=event.daughters,
     12     fortran=False,
     13 )

TypeError: from_hepevt(): incompatible function arguments. The following argument types are supported:
    1. (self: pyhepmc._core.GenEvent, event_number: int, px: numpy.ndarray[numpy.float64], py: numpy.ndarray[numpy.float64], pz: numpy.ndarray[numpy.float64], en: numpy.ndarray[numpy.float64], m: numpy.ndarray[numpy.float64], pid: numpy.ndarray[numpy.int32], status: numpy.ndarray[numpy.int32], parents: object = None, children: object = None, vx: object = None, vy: object = None, vz: object = None, vt: object = None, fortran: bool = True) -> None

Invoked with: kwargs: event_number=0, px=array([0.        , 0.        , 0.        , ..., 1.559276  , 3.2770876 ,
       5.43204363]), py=array([0.        , 0.        , 0.        , ..., 1.20976157, 2.72396556,
       4.74063254]), pz=array([ 6.49999993e+03, -6.49999993e+03,  9.98237131e+02, ...,
        4.51464275e+00,  9.52191620e+00,  1.60423752e+01]), en=array([6.50000000e+03, 6.50000000e+03, 9.98237131e+02, ...,
       4.92715576e+00, 1.04319787e+01, 1.75880214e+01]), m=array([9.38269999e-01, 9.38269999e-01, 0.00000000e+00, ...,
       9.35541920e-08, 1.92514502e-07, 0.00000000e+00]), pid=array([2212, 2212,   21, ...,   22,   22,   22], dtype=int32), status=array([-12, -12, -21, ...,  91,  91,  91], dtype=int32), parents=array([[   0,    0],
       [   0,    0],
       [   7,    7],
       ...,
       [1532,    0],
       [1533,    0],
       [1533,    0]], dtype=int32), children=array([[384,   0],
       [385,   0],
       [  5,   6],
       ...,
       [  0,   0],
       [  0,   0],
       [  0,   0]], dtype=int32), fortran=False

As far as I can tell I added everything correctly. I found this on the HepMC3 GitLab for converting directly from a pythia8.Pythia instance, but it's a bit hard to read, given there are no docs or type annotations, and the whole method is written as one monolithic chunk. I've managed to make it work, but it will take a while to unpack what's going on before I can just throw in numpy data.

Can you offer any help? Examples or advice greatly appreciated.

jacanchaplais commented 9 months ago

Ah, so I've been poking around the test files, and found this: https://github.com/scikit-hep/pyhepmc/blob/main/tests/test_from_hepevt.py#L17 So I gather that it's not actually a constructor classmethod.

The naming here is a bit confusing, as .from_xxx() always implies to me that it's a classmethod, rather than an instance method which does some in-place operation.
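
The difference can be sketched with a stand-in class. `Event` and `from_arrays` below are hypothetical names, not part of pyhepmc; they only mimic an instance method with a `from_` prefix that fills an existing object in place. Calling it through the class with keyword arguments leaves `self` unbound, which is the same kind of TypeError as in the traceback above, while creating the object first works.

```python
class Event:
    """Hypothetical stand-in for an event container (not the pyhepmc API)."""

    def __init__(self):
        self.particles = []

    def from_arrays(self, px, py):
        # Instance method: fills the existing object in place, returns None.
        self.particles = list(zip(px, py))


def call_on_class():
    # Calling through the class with keywords only leaves `self` unbound.
    try:
        Event.from_arrays(px=[1.0], py=[2.0])
    except TypeError as exc:
        return type(exc).__name__
    return "no error"


def call_on_instance():
    # Create the object first, then fill it.
    event = Event()
    result = event.from_arrays(px=[1.0], py=[2.0])
    return event.particles, result
```

Going by the signature in the error message (`self: pyhepmc._core.GenEvent`) and the linked test file, the equivalent with pyhepmc would presumably be to create a `GenEvent()` first and then call `.from_hepevt(...)` on that instance.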

Anyway, I'm sorted now, thank you! (I'll leave open in case you want to add some more docs for numpy stuff etc.)

HDembinski commented 8 months ago

Sorry for the late reply. I agree that it would be more pythonic if from_hepevt was a classmethod. I am not sure if there is a way to fix this at this point, but if it is possible to have both a classmethod and a method with the same name then I am happy to fix this. The potential classmethod from_hepevt then needs two additional optional arguments momentum_unit and length_unit.

Perhaps import_hepevt would have been a better name?

The numpy-friendliness is discussed here: https://scikit-hep.org/pyhepmc/examples/processing.html

HDembinski commented 8 months ago

For the record, this works:

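class Foo:
    def from_bar(*args, **kwargs):
        # No explicit self: inspect the first positional argument instead.
        if args and isinstance(args[0], Foo):
            # Called on an instance (or with a Foo passed explicitly).
            print("method", args[1:], kwargs)
        else:
            # Called on the class itself, e.g. Foo.from_bar("spam").
            print("classmethod", args, kwargs)

jacanchaplais commented 7 months ago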

Sorry I missed this! I can see from the docs on the numpy stuff that it's mainly about computation rather than IO, which is a bit different to my use-case, but cool nevertheless. :)

Also, fun example! Though, I think this would be closer to accessing it as a staticmethod? Decorated classmethods receive the class type constructor as the first argument, whereas if you were to do

Foo.from_bar("spam")

the first argument here would be "spam", not the Foo constructor. So it seems more like a staticmethod to me. I guess you could still use it as though it were a classmethod, but as in your example, you would need to point it at Foo explicitly. That's probably not an issue unless you subclass it. :)
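
One way to get both behaviours while still receiving the actual class (so subclasses work) is a small descriptor. This is only a sketch of the idea, not how pyhepmc is implemented: `hybridmethod`, `Foo`, and `SubFoo` are hypothetical names, and it ignores the extra `momentum_unit` and `length_unit` arguments mentioned earlier in the thread.

```python
import functools


class hybridmethod:
    """Hypothetical descriptor: in-place method on instances,
    constructor when accessed on the class."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if instance is not None:
            # Accessed on an instance: behave like a normal bound method.
            return functools.partial(self.func, instance)

        @functools.wraps(self.func)
        def construct(*args, **kwargs):
            # Accessed on the class: build a fresh object of that exact
            # class (subclasses included), fill it, and return it.
            new = owner()
            self.func(new, *args, **kwargs)
            return new

        return construct


class Foo:
    def __init__(self):
        self.bar = None

    @hybridmethod
    def from_bar(self, bar):
        # Written as an ordinary in-place method.
        self.bar = bar


class SubFoo(Foo):
    pass
```

With this, `Foo.from_bar("spam")` returns a new `Foo`, `SubFoo.from_bar("spam")` returns a new `SubFoo`, and `Foo().from_bar("spam")` still modifies the existing object and returns None.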