NWChemEx / .github

GitHub Settings for the NWChemEx Organization
https://nwchemex.github.io/.github/
Apache License 2.0
1 stars 2 forks source link

Remove duplicate python module name #31

Open ryanmrichard opened 2 years ago

ryanmrichard commented 2 years ago

Right now our Python API looks like:

import nwchemex

mm = nwchemex.chemist.pluginplay.ModuleManager()
nwchemex.nwchemex.load_modules(mm)

Namely each top-level Python module keeps its functionality in a submodule of the same name.

I think it makes more sense for the API to be:

import nwchemex

mm = nwchemex.chemist.pluginplay.ModuleManager()
nwchemex.load_modules(mm)

I assume this is an either an issue with the generated __init__.py file or the CMake module for creating the Python bindings.

wadejong commented 2 years ago

This must be different from from nwchemex import * . Here, I can simply do mm = pluginplay.ModuleManager()

wadejong commented 2 years ago

See [https://www.codingem.com/python-difference-between-import-and-from-import/] for the difference between them.

ryanmrichard commented 2 years ago

The issue is referring to the nwchemex.nwchemex.load_modules(...) vs. nwchemex.load_modules(...) syntax. We shouldn't have to do from nwchemex import nwchemx (or from nwchem import *) to get the nwchemex.load_modules(...) syntax.

wadejong commented 2 years ago

Agreed. Let me investigate.

wadejong commented 2 years ago

Same solution as #34 .

wadejong commented 2 years ago

import nwchemex imports the package including the nwchemex class and

dir(nwchemex) shows what is loaded 'dir(nwchemex.nwchemex)` shows what functions are available

from nwchemex import nwchemex imports the classes in nwchemex class

from nwchemex import chemist imports the classes in chemist

There is not much we can do about this, this is how the Python structure works.

ryanmrichard commented 2 years ago

from nwchemex import nwchemex imports the classes in nwchemex class

FTR, there is no nwchemex class.

To reiterate, the problem here is our package setup not how Python works.

I expect:

import nwchemex
nwchemex.load_modules()

to work. I should not have to do:

from nwchemex import nwchemex
nwchemex.load_moddules()

The former code snippet imports the nwchemex package, the latter imports the nwchemex.nwchemex package (which is a subpackage of the nwchemex package). If it helps to think of it in C++ terms, the structure we are mirroring looks like nwchemex::load_modules(); not nwchemex::nwchemex::load_modules().

wadejong commented 2 years ago

You are correct, I misspoke.

import nwchemex imports the package namespace that includes the nwchemex namespace

from nwchemex import nwchemex imports the classes in nwchemex namespace

ryanmrichard commented 2 years ago

You are correct, I misspoke.

To be clear are you saying that you agree that it shouldn't be from nwchemex import nwchemex?

wadejong commented 2 years ago

It's undesirable, but this is a Python feature. To do what you want we need to tell cppyy to ignore the nwchemex namespace (for example).

ryanmrichard commented 2 years ago

End of the day, this syntax is not normal. Users do:

import numpy 
import math
my_array = numpy.ndarray()
two = math.sqrt(4)

not:

from numpy import numpy
from math import math
my_array = numpy.ndarray()
two = math.sqrt(4)

As for resolving this issue, if we have to provide a wrapper to get the syntax to be from nwchemex import load_modules that's fine with me, but the user should not have to do from nwchemex import nwchemex it's non-intuitive, it doesn't mirror the C++, and it looks weird.

wadejong commented 2 years ago

Will see if there is a way around this. For reference: https://realpython.com/python-modules-packages/

wadejong commented 2 years ago

What is your issue with from nwchemex import * ???

from nwchemex import *

mm = pluginplay.ModuleManager()
nwchemex.load_modules(mm)

mol = chemist.Molecule()

With this all of it works, and you can directly access all the modules. There is no way to effectively remove the extra namespace. And, adding a wrapper to initialize only adds another level of namespace.

ryanmrichard commented 2 years ago

First from x import * is in general frowned upon; for various reasons (notably performance) modern Python pushes IWYU (include what you use). Second, in our present case I think that's literally the same as from nwchemex import nwchemex which we agreed is non ideal.

FWIW, I'm pretty sure this entire issue can be resolved with a one line modification to the __init__.py file.

wadejong commented 2 years ago

Actually, I see where @ryanmrichard was going with the init.py discussion. I think I overlooked what I think he is referring to.

Bert

ryanmrichard commented 2 years ago

I was wrong. I don't think this can be fixed in one line. There's more going on here than I appreciated. I thought we could do something like:

from cppy.gbl.parallelzone import *

or (since cppy.gbl isn't a package)

from cppy.gbl import parallelzone
from parallelzone import *

to strip off the extra parallelzone.

However, while testing my solution I now see that when you do from parallelzone import parallelzone you're actually importing a class called parallelzone_meta that lives in the parallelzone package. The members of that class are the contents of the parallelzone package. This is very unexpected behavior to me.

ryanmrichard commented 2 years ago

I'm incredibly confused by what's going on because I don't understand how import parallelzone imports a class called parallelzone_meta I would have expected the line to be:

from parallelzone import parallelzone_meta
wadejong commented 2 years ago

This seems to work:

import cppyy.gbl.parallelzone as parallelzone

ryanmrichard commented 2 years ago

Working off of https://github.com/NWChemEx-Project/ParallelZone/issues/75, your solution doesn't work for me:

[ctest] ImportError: Failed to import test module: test_parallelzone
[ctest] Traceback (most recent call last):
[ctest]   File "/usr/lib/python3.10/unittest/loader.py", line 470, in _find_test_path
[ctest]     package = self._get_module_from_name(name)
[ctest]   File "/usr/lib/python3.10/unittest/loader.py", line 377, in _get_module_from_name
[ctest]     __import__(name)
[ctest]   File "/home/ryan/workspaces/NWX/ParallelZone/tests/py_unit_tests/test_parallelzone/__init__.py", line 19, in <module>
[ctest]     from parallelzone import parallelzone
[ctest]   File "/home/ryan/workspaces/NWX/ParallelZone/build/Python/parallelzone/__init__.py", line 28, in <module>
[ctest]     import cppyy.gbl.parallelzone as parallelzone
[ctest] ModuleNotFoundError: No module named 'cppyy.gbl.parallelzone'; 'cppyy.gbl' is not a package
[ctest] 
wadejong commented 2 years ago

I did not check the init. I simply was doing:

`[GCC 9.4.0] on linux Type "help", "copyright", "credits" or "license" for more information.

import parallelzone import cppyy.gbl.parallelzone as parallelzone dir(parallelzone) ['add', 'bool', 'class', 'delattr', 'destruct', 'dict', 'dir', 'dispatch', 'doc', 'eq', 'format', 'ge', 'getattribute', 'getitem', 'gt', 'hash', 'init', 'init_subclass', 'invert', 'le', 'lt', 'module', 'mul', 'ne', 'neg', 'new', 'pos', 'python_owns', 'radd', 'reduce', '__reduce_ex', 'repr', 'rmul', 'rsub', 'rtruediv', 'setattr', 'sizeof', 'smartptr', 'str', 'sub', 'subclasshook', 'truediv', 'weakref__', 'hash_objects', 'hash_to_string', 'make_file_logger', 'make_hash', 'make_hasher', 'make_null_logger', 'make_stderr_logger', 'make_stdout_logger', 'make_stream_logger']

`

Not sure why this does not work on the init side.

ryanmrichard commented 2 years ago

I think I'm starting to understand what's going on. It appears that Cppyy does not make a parallelzone module or package at all. All the Cppyy symbols live in a global object gbl which is of type _meta and is part of the cppyy module. When you do import cppyy.gbl.parallelzone as parallelzone you're actually bringing a global object parallelzone (which is of type parallelzone_meta and is a member of cppyy.gbl) into scope. When we do that in parallelzone/__init__.py we create a package parallelzone which has a global object parallelzone which lives in it.

So @wadejong I think that you're right, there's nothing we can do at the moment to change this sytax. The current syntax is a result of Python's import semantics and the decisions Cppyy makes. However, I would argue the Cppyy decision to map a C++ namespace to a Python class doesn't make sense. Conceptually it's more of a Python module or a Python package. I don't know, but maybe it's not possible for Cppyy to automate such a mapping?

wadejong commented 2 years ago

I posed the question to Wim. Let's see what his thoughts are.