scikit-hep / numpythia

The interface between PYTHIA and NumPy
GNU General Public License v3.0
36 stars 11 forks source link

Single Filters Fail #20

Closed AlexSchuy closed 1 year ago

AlexSchuy commented 5 years ago

Hi. While using this package, I believe I have stumbled across a bug with the behavior of the filters. Consider the following code:

[. . .]
for event in pythia(events=n):
  selection = (STATUS == 1)
  selected_particles = event.all(selection)

selected_particles contains all particles from event, regardless of the status code. If you repeat the same condition twice, i.e., let selection = (STATUS == 1) & (STATUS == 1), selected_particles contains only particles with status code = 1, as expected.

I also tested this with PDG_ID and observed a similar error. If this is a difficult issue to fix, I would be happy to help in a few weeks when I have more time.

eduardo-rodrigues commented 5 years ago

Thanks for reporting, @AlexSchuy. Did you have time to dive a bit more and test things? It seems the relevant code is https://github.com/scikit-hep/numpythia/blob/master/numpythia/src/_libnumpythia.pyx#L187.

Seems the issue is general if you tried with different selections and found the same behaviour It would be great if you could submit a PR with a test for this issue ...

In any case: @marinang is the person who's been using and working more on this package. We struggle a bit to maintain ... Maybe he will have a better idea? I'm ignorant as far as Cython is concerned.

While at it: we have a forum and a Gitter channel as a means to get in touch with "all of us", see http://scikit-hep.org/get-in-touch.html. Maybe you can subscribe and/or enquire there too?

Thanks for the interest.

marinang commented 5 years ago

@AlexSchuy I have noticed that as well, always at least 2 filters... It was happening for me when I was selecting with particles IDs as the example in the README

w = e.last((ABS_PDG_ID == 24) & HAS_END_VERTEX))

without HAS_END_VERTEX it was not working, I have never thought of a solution though.

eduardo-rodrigues commented 5 years ago

Hmm, thinking out loud here: could it be that the object type does not match any of the if statements in the code I refer to above? I noticed that some selections are BooleanFilter whereas others are IntegerFilter, and the code (https://github.com/scikit-hep/numpythia/blob/master/numpythia/src/_libnumpythia.pyx#L197) does not consider the latter. For example:

cpdef IntegerFilter STATUS = IntegerFilter()
cpdef IntegerFilter PDG_ID = IntegerFilter()
cpdef BooleanFilter HAS_END_VERTEX = BooleanFilter()

Seems like a bug to a newbie such as me - it does not work with a single IntegerFilter but the combination with a BooleanFilter provides a BooleanFilter, I would guess.

If you have an installation could you fix this line 197, considering also IntegerFilter, and check that the issue is gone? The fix is probably for line 197 to become

if isinstance(selection, BooleanFilter) or isinstance(selection, IntegerFilter):

AlexSchuy commented 5 years ago

Thanks for the response, @eduardo-rodrigues and @marinang. I would like to cover two points with this comment: (1) my understanding of the current state of the filter code; and (2) why I think it might be better to address this problem in a different way.

(1) So, first let's discuss how the code works. An IntegerFilter by itself is not a selection. You need to compare it to an integer, to obtain a BooleanFilter. For example, if your IntegerFilter is ABS_PDG_ID, then you can get a boolean filter by writing ABS_PDG_ID == 24 (see https://github.com/scikit-hep/numpythia/blob/master/numpythia/src/_libnumpythia.pyx#L106). A selection, then, is either a single boolean filter or a compound boolean filter. This makes sense, as ultimately you need boolean conditions to decide whether or not an event passes.

The notion of a compound boolean filter is given by FilterList which is just a list of BooleanFilter. The design might seem a little strange, and in fact if you consult the HepMC repository (https://gitlab.cern.ch/hepmc/HepMC3/tree/master/search/include/HepMC3), you can see that FilterList was removed in the next version (3.1), in favor of overriding the boolean operators for Filter, which makes more sense.

In any case, when two BooleanFilter are ANDed, the result is a FilterList (see https://github.com/scikit-hep/numpythia/blob/master/numpythia/src/_libnumpythia.pyx#L92). So, since filtering fails with a single filter, I suspect the bug is in the handling of the case of a BooleanFilter.

Now we are back to https://github.com/scikit-hep/numpythia/blob/master/numpythia/src/_libnumpythia.pyx#L197. We see selection = FilterList(selection). It seems that the code creates a FilterList with a BooleanFilter as an argument. I don't quite understand this, as I see no constructor for FilterList at https://github.com/scikit-hep/numpythia/blob/master/numpythia/src/_libnumpythia.pyx#L66. Perhaps there is some under-the-hood magic in Cython going on there that I don't understand, or perhaps an explicit constructor with a BooleanFilter parameter needs to be added that correctly initializes _filterlist with the given filter.

(2) In (1), I suggested a possible fix. However, even if it works, it looks to me like you are still left without the ability to OR filters together. One solution would be to use a more recent version of HepMC. If that were done, rather than copying the code from HepMC into this package and then modifying it, I would suggest instead including https://gitlab.cern.ch/hepmc/HepMC3 as a submodule (see https://git-scm.com/book/en/v2/Git-Tools-Submodules). This lets changes made by numpythia developers be more easily pushed to HepMC, and can reduce the maintenance burden of numpythia by letting HepMC changes be more easily propagated to numpythia.

On a similar note, I see that there is another project https://github.com/scikit-hep/pyhepmc that is intended to bring HepMC bindings into python. But, much of numpythia is currently devoted to exactly this. If we could instead off-load that work to the pyhepmc project, it would in a similar fashion reduce the maintenance burden of numpythia, avoid duplication of effort, and improve pyhepmc for future projects.

Since the particle physics community is rather small, I think it is important to make these kind of efforts where doing so is reasonable.

marinang commented 5 years ago

@AlexSchuy If you look here https://github.com/scikit-hep/numpythia/blob/4064d5d85ccc95bf5a57b1cca6154c6248a10e70/numpythia/src/hepmc.pxd#L121 you see the constructor of FilterList which takes a Filter which is defined as

cdef extern from "HepMC/Search/Filter.h" namespace "HepMC":
    cdef cppclass Filter(FilterBase):
        Filter (const Filter&)
        Filter operator!()

    cdef const Filter HAS_END_VERTEX
    cdef const Filter HAS_PRODUCTION_VERTEX
    cdef const Filter HAS_SAME_PDG_ID_DAUGHTER
    cdef const Filter IS_STABLE
    cdef const Filter IS_BEAM

.

About pyhepmc this is something It would be nice to explore because it would be easier to maintain, but I don't see any examples yet on how to use except some tests.

However the Pythia8 event is converted to HepMC through https://github.com/scikit-hep/numpythia/blob/4064d5d85ccc95bf5a57b1cca6154c6248a10e70/numpythia/src/hepmc.pxd#L143, http://home.thep.lu.se/Pythia/pythia82html/HepMCInterface.html. So this converter/interface would need to be implemented in pyhepmc(or here ?).

Maybe @HDembinski can tell us the status of pyhepmc?

eduardo-rodrigues commented 5 years ago

Hi, I'm now stuck with other matters until late-ish afternoon. For now let me just cc @HDembinski as I agree that using his pyhepmc in the longer term is most probably the way to to. It would make this package lighter and easier to maintain.

eduardo-rodrigues commented 1 year ago

Hello @AlexSchuy, as I'm sure you appreciated from (lack of) activity and this thread, we're hardly in maintenance mode, with no real maintained. This is a package that is effectively deprecated, and I will state that in the README soon ... In fact I've been committing very recently towards a legacy release ...

At this point you are better off with the Pythia 8 bindings or, most likely better Pythonically and overall since covering much more, with a project such as https://github.com/impy-project/impy. @HDembinski is a key person there and can help you.

I'm hence closing this since it is clear that neither you nor anyone will develop this package further apart from trivialities.

HDembinski commented 1 year ago

Hi,

impy currently does not expose all functionality of Pythia8, if you want that, please use the official bindings. However, if you just want to generate some min-bias, then the easiest option is to use impy.

Note: impy is going be renamed to chromo soon, since the name impy on PyPI is not available.