AIDASoft / podio

PODIO
GNU General Public License v3.0
24 stars 60 forks source link

Restructure python parts, splitting code generation and python bindings into separate modules #511

Closed jmcarcell closed 12 months ago

jmcarcell commented 1 year ago

BEGINRELEASENOTES

ENDRELEASENOTES

Generation tools do not need to load or know anything about the rest of podio. I think the remaining question is if the tests have to be moved or not, some of them only test generation tools.

jmcarcell commented 1 year ago

I didn't move any tests - I mean moving tests from the current folder podio to the podio_gen folder; for example https://github.com/AIDASoft/podio/blob/master/python/podio/test_ClassDefinitionValidator.py only uses classes from podio_gen in this PR but it's in podio. It only matters from an organization point of view, and if any test is moved then there is some extra cmake code needed to also find those. On that I don't have an opinion to where this one goes, I don't think there are many more.

tmadlener commented 1 year ago

Ah, apologies. Misread the diff. I would move them, just to have them closer to the code they are testing.

jmcarcell commented 1 year ago

Now they are moved and are in a separate test since it doesn't seem it's easy to get them working in the same since they are now different modules

tmadlener commented 1 year ago

pre-commit complains about a cyclic import, but it doesn't really make sense to me, as the script should now only import from podio_gen, right?

https://github.com/AIDASoft/podio/actions/runs/6776714508/job/18418800193?pr=511#step:4:427

jmcarcell commented 1 year ago

I believe the diagnosis is correct; podio wants to import test_utils and test_utils wants to import podio. Actually the try - except block only seems to work during test time. I'm not sure why it appears now though

tmadlener commented 1 year ago

Ah then we can maybe switch to a

from .frame import Frame

in test_utils.py?

jmcarcell commented 1 year ago

That breaks some tests due to relative imports. Another proposal: move test_utils.py to podio/tests so that the tests that use it can use it directly. Then the import can be removed from __init__.py (and it makes sense that you don't always import something that you only need for tests, it just happens that the import doesn't work now at runtime). It's a bit less organized but anyway there are some python tests already there

tmadlener commented 1 year ago

I tried that, but that breaks with the weird c++ error in the comment (or at least it did in the past). I agree, this should really live in the tests directory though. Maybe you can give it another try and see if things just magically work now?

tmadlener commented 12 months ago

I think I found the issue. If I do

from podio.test_utils import Frame
print(Frame)

I get <class 'podio.frame.Frame'>. However, if I do

from podio import Frame
print(Frame)

I get <class cppyy.gbl.podio.Frame at 0x5569c6f3fc60>. So it seems like outside of the test_utils the importing of Frame does not pick up the wrapper, but rather the c++ class instead.

I am not yet sure why things are working inside test_utils as they are also simply doing a from podio import Frame. My best guess at the moment is that this is still happening before the sys.modules["podio"] = podio.

tmadlener commented 12 months ago

I have removed the sys.modules["podio"] = podio bit from the non-generator part __init__.py to make sure that we always get our wrapper classes in python rather than picking up c++ classes with the same name via the ROOT python bindings accidentally. This also makes it possible to move the test_utils more or less completely to the tests folder into write_frame.py and simply calling that for the different I/O backends.

I think in this case not having all of the c++ bits of podio directly available in python is OK, since for the main use cases we provide wrappers in any case and all the others are effectively mainly there to support the implementation of those, and I don't think we gain too much by exposing them to people in python.

Since this has now become a bit more than just "decoupling the generation files from the rest of the podio python files" I would actually also like to pick up #498 in this PR and make it another general overhaul of the python structure of podio. Mainly because #498 needs to be touched again in any case and we might as well resolve the merge conflict in this PR, rather than later.

jmcarcell commented 12 months ago

Looks good but now if you want to have a fully working podio in python you can't unless you do from ROOT import podio. I see some possible use cases like putting UserDataCollections in Frames or GenericParameters but probably there are few people using them.

tmadlener commented 12 months ago

I think GenericParameters should be seen as an internal implementation detail of how the Frame supports parameters.

UserDataCollections on the other hand are a good point. Maybe we can try to address them in another PR since Andre is waiting on this for the LCG stacks.