Open ryanmrichard opened 2 years ago
This must be different from from nwchemex import *
. Here, I can simply do mm = pluginplay.ModuleManager()
See [https://www.codingem.com/python-difference-between-import-and-from-import/] for the difference between them.
The issue is referring to the nwchemex.nwchemex.load_modules(...)
vs. nwchemex.load_modules(...)
syntax. We shouldn't have to do from nwchemex import nwchemx
(or from nwchem import *
) to get the nwchemex.load_modules(...)
syntax.
Agreed. Let me investigate.
Same solution as #34 .
import nwchemex
imports the package including the nwchemex class and
dir(nwchemex)
shows what is loaded
'dir(nwchemex.nwchemex)` shows what functions are available
from nwchemex import nwchemex
imports the classes in nwchemex class
from nwchemex import chemist
imports the classes in chemist
There is not much we can do about this, this is how the Python structure works.
from nwchemex import nwchemex imports the classes in nwchemex class
FTR, there is no nwchemex
class.
To reiterate, the problem here is our package setup not how Python works.
I expect:
import nwchemex
nwchemex.load_modules()
to work. I should not have to do:
from nwchemex import nwchemex
nwchemex.load_moddules()
The former code snippet imports the nwchemex
package, the latter imports the nwchemex.nwchemex
package (which is a subpackage of the nwchemex
package). If it helps to think of it in C++ terms, the structure we are mirroring looks like nwchemex::load_modules();
not nwchemex::nwchemex::load_modules()
.
You are correct, I misspoke.
import nwchemex
imports the package namespace that includes the nwchemex namespace
from nwchemex import nwchemex
imports the classes in nwchemex namespace
You are correct, I misspoke.
To be clear are you saying that you agree that it shouldn't be from nwchemex import nwchemex
?
It's undesirable, but this is a Python feature. To do what you want we need to tell cppyy to ignore the nwchemex namespace (for example).
End of the day, this syntax is not normal. Users do:
import numpy
import math
my_array = numpy.ndarray()
two = math.sqrt(4)
not:
from numpy import numpy
from math import math
my_array = numpy.ndarray()
two = math.sqrt(4)
As for resolving this issue, if we have to provide a wrapper to get the syntax to be from nwchemex import load_modules
that's fine with me, but the user should not have to do from nwchemex import nwchemex
it's non-intuitive, it doesn't mirror the C++, and it looks weird.
Will see if there is a way around this. For reference: https://realpython.com/python-modules-packages/
What is your issue with from nwchemex import *
???
from nwchemex import *
mm = pluginplay.ModuleManager()
nwchemex.load_modules(mm)
mol = chemist.Molecule()
With this all of it works, and you can directly access all the modules. There is no way to effectively remove the extra namespace. And, adding a wrapper to initialize only adds another level of namespace.
First from x import *
is in general frowned upon; for various reasons (notably performance) modern Python pushes IWYU (include what you use). Second, in our present case I think that's literally the same as from nwchemex import nwchemex
which we agreed is non ideal.
FWIW, I'm pretty sure this entire issue can be resolved with a one line modification to the __init__.py
file.
Actually, I see where @ryanmrichard was going with the init.py discussion. I think I overlooked what I think he is referring to.
Bert
I was wrong. I don't think this can be fixed in one line. There's more going on here than I appreciated. I thought we could do something like:
from cppy.gbl.parallelzone import *
or (since cppy.gbl
isn't a package)
from cppy.gbl import parallelzone
from parallelzone import *
to strip off the extra parallelzone
.
However, while testing my solution I now see that when you do from parallelzone import parallelzone
you're actually importing a class called parallelzone_meta
that lives in the parallelzone
package. The members of that class are the contents of the parallelzone package. This is very unexpected behavior to me.
I'm incredibly confused by what's going on because I don't understand how import parallelzone
imports a class called parallelzone_meta
I would have expected the line to be:
from parallelzone import parallelzone_meta
This seems to work:
import cppyy.gbl.parallelzone as parallelzone
Working off of https://github.com/NWChemEx-Project/ParallelZone/issues/75, your solution doesn't work for me:
[ctest] ImportError: Failed to import test module: test_parallelzone
[ctest] Traceback (most recent call last):
[ctest] File "/usr/lib/python3.10/unittest/loader.py", line 470, in _find_test_path
[ctest] package = self._get_module_from_name(name)
[ctest] File "/usr/lib/python3.10/unittest/loader.py", line 377, in _get_module_from_name
[ctest] __import__(name)
[ctest] File "/home/ryan/workspaces/NWX/ParallelZone/tests/py_unit_tests/test_parallelzone/__init__.py", line 19, in <module>
[ctest] from parallelzone import parallelzone
[ctest] File "/home/ryan/workspaces/NWX/ParallelZone/build/Python/parallelzone/__init__.py", line 28, in <module>
[ctest] import cppyy.gbl.parallelzone as parallelzone
[ctest] ModuleNotFoundError: No module named 'cppyy.gbl.parallelzone'; 'cppyy.gbl' is not a package
[ctest]
I did not check the init. I simply was doing:
`[GCC 9.4.0] on linux Type "help", "copyright", "credits" or "license" for more information.
import parallelzone import cppyy.gbl.parallelzone as parallelzone dir(parallelzone) ['add', 'bool', 'class', 'delattr', 'destruct', 'dict', 'dir', 'dispatch', 'doc', 'eq', 'format', 'ge', 'getattribute', 'getitem', 'gt', 'hash', 'init', 'init_subclass', 'invert', 'le', 'lt', 'module', 'mul', 'ne', 'neg', 'new', 'pos', 'python_owns', 'radd', 'reduce', '__reduce_ex', 'repr', 'rmul', 'rsub', 'rtruediv', 'setattr', 'sizeof', 'smartptr', 'str', 'sub', 'subclasshook', 'truediv', 'weakref__', 'hash_objects', 'hash_to_string', 'make_file_logger', 'make_hash', 'make_hasher', 'make_null_logger', 'make_stderr_logger', 'make_stdout_logger', 'make_stream_logger']
`
Not sure why this does not work on the init side.
I think I'm starting to understand what's going on. It appears that Cppyy does not make a parallelzone module or package at all. All the Cppyy symbols live in a global object gbl
which is of type _meta
and is part of the cppyy
module. When you do import cppyy.gbl.parallelzone as parallelzone
you're actually bringing a global object parallelzone
(which is of type parallelzone_meta
and is a member of cppyy.gbl
) into scope. When we do that in parallelzone/__init__.py
we create a package parallelzone
which has a global object parallelzone
which lives in it.
So @wadejong I think that you're right, there's nothing we can do at the moment to change this sytax. The current syntax is a result of Python's import semantics and the decisions Cppyy makes. However, I would argue the Cppyy decision to map a C++ namespace to a Python class doesn't make sense. Conceptually it's more of a Python module or a Python package. I don't know, but maybe it's not possible for Cppyy to automate such a mapping?
I posed the question to Wim. Let's see what his thoughts are.
Right now our Python API looks like:
Namely each top-level Python module keeps its functionality in a submodule of the same name.
I think it makes more sense for the API to be:
I assume this is an either an issue with the generated
__init__.py
file or the CMake module for creating the Python bindings.