ninia / jep

Embed Python in Java

"import" sometimes does not work from Java #416

Open Daniel-Alievsky opened 2 years ago

Daniel-Alievsky commented 2 years ago

I've published a simple test here: https://bitbucket.org/DanielAlievsky/stare-python-experiments/src/master/ Please look at the class com.siams.stare.extensions.python.tests.SimpleJepForImport. You can run it via call_SimpleJepForJepForImports.cmd, if all files have been compiled by IntelliJ IDEA, or directly from IDEA. For your convenience I've also attached a reduced version of this repository, stare-python-experiments-reduced.zip, in which all files are already compiled (apart from jep-4.0.3.jar: you will see the path to it in call_SimpleJep.cmd).

This test adds a new root, passed as the argument in call_SimpleJepForJepForImports.cmd: jep-java-tests/src/test/python, the root directory of my Python tests. It then performs a simple sequence of attempts to import packages and modules that exist inside the jep-java-tests/src/test/python folder:

        interp.exec("import net");
        System.out.println("Imported net");
        interp.exec("import net.algart");
        System.out.println("Imported net.algart");
        interp.exec("import net.algart.stare");
        System.out.println("Imported net.algart.stare.");
        interp.exec("import net.algart.stare.api");
        System.out.println("Imported net.algart.stare.api");
        interp.exec("import net.EmptyTest1");
        System.out.println("Imported net.EmptyTest1");
        interp.exec("import net.algart.EmptyTest2");
        System.out.println("Imported net.algart.EmptyTest2");

On my computer, the result is the following:

    C:\TMP\stare-python-experiments-reduced>java -classpath jep-java-tests/target/test-classes;jep-java-tests/target/classes;C:\Users\Daniel\.m2\repository\black\ninia\jep\4.0.3\jep-4.0.3.jar com.siams.stare.extensions.python.tests.SimpleJepForImport jep-java-tests/src/test/python
    Adding new root: C:\TMP\stare-python-experiments-reduced\jep-java-tests\src\test\python
    init package tests
    Imported tests.SimpleObjects
    ['C:\Users\Daniel\AppData\Local\Programs\Python\Python310\python310.zip', 'C:\Users\Daniel\AppData\Local\Programs\Python\Python310\DLLs', 'C:\Users\Daniel\AppData\Local\Programs\Python\Python310\lib', 'C:\Program Files\Java\jdk-17.0.2\bin', 'C:\Users\Daniel\AppData\Local\Programs\Python\Python310', 'C:\Users\Daniel\AppData\Local\Programs\Python\Python310\lib\site-packages', 'C:\TMP\stare-python-experiments-reduced\jep-java-tests\src\test\python']
    Imported tests.EmptyTest
    Imported net
    Imported net.algart
    Imported net.algart.stare.
    Exception in thread "main" jep.JepException: <class 'ModuleNotFoundError'>: No module named 'net.algart.stare.api'
        at .(:1)
        at jep.Jep.exec(Native Method)
        at jep.Jep.exec(Jep.java:339)
        at com.siams.stare.extensions.python.tests.SimpleJepForImport.main(SimpleJepForImport.java:35)

You can see that your interpreter imports some packages and modules normally, for example tests.EmptyTest (which prints sys.path). But net.algart.stare.api is a problem!

If I comment out the import of net.algart.stare.api, the test fails on the next line, the attempt to import net.EmptyTest1, although it is an exact copy of tests.EmptyTest, which was imported successfully.

It seems the implementation of your import has some bugs inside the JEP core. Can you investigate and fix this? Is there any workaround? stare-python-experiments-reduced.zip

bsteffensmeier commented 2 years ago

It sounds like ClassList may not be able to find your package for some reason. You can try debugging ClassList to determine why your environment doesn't work, but some Java frameworks are not compatible with ClassList, so it may be necessary to implement a custom ClassEnquirer.

Daniel-Alievsky commented 2 years ago

Do you mean that this is a conflict between Python and Java package names? Indeed, net.algart.stare is my Java package (see the repository).

I've slightly reworked my test (please check). I've created a function:

    private static void tryToImport(Interpreter interp, String moduleName) {
        try {
            System.out.printf("Importing %s (isJavaPackage: %s)...%n",
                    moduleName,
                    ClassList.getInstance().isJavaPackage(moduleName));
            interp.exec("import " + moduleName);
        } catch (JepException e) {
            System.out.println(e);
        }
    }

Then I call:

        tryToImport(interp, "tests.SimpleObjects");
        tryToImport(interp, "tests.EmptyTest");
        tryToImport(interp, "net");
        tryToImport(interp, "net.algart");
        tryToImport(interp, "net.algart.stare");
        tryToImport(interp, "net.algart.stare.api");
        tryToImport(interp, "net.EmptyTest1");
        tryToImport(interp, "net.algart.EmptyTest2");

Results are still strange:

    Adding new root: C:\siams\computer-vision\stare-python-experiments\jep-java-tests\src\test\python
    Importing tests.SimpleObjects (isJavaPackage: false)...
    Importing tests.EmptyTest (isJavaPackage: false)...
    Importing net (isJavaPackage: true)...
    Importing net.algart (isJavaPackage: true)...
    Importing net.algart.stare (isJavaPackage: true)...
    Importing net.algart.stare.api (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'net.algart.stare.api'
    Importing net.EmptyTest1 (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'net.EmptyTest1'
    Importing net.algart.EmptyTest2 (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'net.algart.EmptyTest2'

I don't see a correlation between the isJavaPackage result and the ability to import.

Daniel-Alievsky commented 2 years ago

However, everything seems to work fine if the Python package names have no relation to Java packages. I've created an additional package level, "pyroot", so my tree is now the following:

├───net
│   └───algart
│       └───stare
│           └───api
├───pyroot
│   └───net
│       └───algart
│           └───stare
│               └───api
└───tests
    └───util

And my test:

        tryToImport(interp, "tests.SimpleObjects");
        tryToImport(interp, "tests.EmptyTest");
        tryToImport(interp, "pyroot");
        tryToImport(interp, "pyroot.net.algart");
        tryToImport(interp, "pyroot.net.algart.stare");
        tryToImport(interp, "pyroot.net.algart.stare.api");
        tryToImport(interp, "pyroot.net.algart.stare.api.StareInOut");
        tryToImport(interp, "pyroot.net.EmptyTest1");
        tryToImport(interp, "pyroot.net.algart.EmptyTest2");
        tryToImport(interp, "net");
        tryToImport(interp, "net.algart");
        tryToImport(interp, "net.algart.stare");
        tryToImport(interp, "net.algart.stare.api");
        tryToImport(interp, "net.algart.stare.api.StareInOut");
        tryToImport(interp, "net.EmptyTest1");
        tryToImport(interp, "net.algart.EmptyTest2");

Problems begin only at net.algart.stare.api:

    Adding new root: C:\siams\computer-vision\stare-python-experiments\jep-java-tests\src\test\python
    Importing tests.SimpleObjects (isJavaPackage: false)...
    Importing tests.EmptyTest (isJavaPackage: false)...
    Importing pyroot (isJavaPackage: false)...
    Importing pyroot.net.algart (isJavaPackage: false)...
    Importing pyroot.net.algart.stare (isJavaPackage: false)...
    Importing pyroot.net.algart.stare.api (isJavaPackage: false)...
    Importing pyroot.net.algart.stare.api.StareInOut (isJavaPackage: false)...
    Importing pyroot.net.EmptyTest1 (isJavaPackage: false)...
    Importing pyroot.net.algart.EmptyTest2 (isJavaPackage: false)...
    Importing net (isJavaPackage: true)...
    Importing net.algart (isJavaPackage: true)...
    Importing net.algart.stare (isJavaPackage: true)...
    Importing net.algart.stare.api (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'net.algart.stare.api'
    Importing net.algart.stare.api.StareInOut (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'net.algart.stare.api'
    Importing net.EmptyTest1 (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'net.EmptyTest1'
    Importing net.algart.EmptyTest2 (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'net.algart.EmptyTest2'

As long as we use the root package "pyroot", which cannot occur in Java, everything works fine!

So, my question: is it possible to fix this somehow, so that I can also use normal package names in Python libraries? The standard "net.algart.xxxx" naming is important: I own the domain algart.net and can be sure that none of our users will use the same package names. But we cannot guarantee anything about a package like "pyroot".

At the same time, it is obvious that a lot of Java modules are located in the package "net"; our Java code, in particular, is located in "net.algart" and "com.siams".

OK, I have another domain, algart.org. Maybe it can help? No! I renamed the top Python package from "net" to "org" (currently we do not use Java sub-packages of "org") and modified the test. Results:

    Importing org (isJavaPackage: true)...
    Importing org.algart (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'org.algart'
    Importing org.algart.stare (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'org.algart'
    Importing org.algart.stare.api (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'org.algart'
    Importing org.algart.stare.api.StareInOut (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'org.algart'
    Importing org.EmptyTest1 (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'org.EmptyTest1'
    Importing org.algart.EmptyTest2 (isJavaPackage: false)...
    jep.JepException: <class 'ModuleNotFoundError'>: No module named 'org.algart'

I think this is because a lot of standard Java libraries use the "org" package.

Maybe you can resolve this problem by adding to your logic (packageToSubPackageMap) a special check for the top one or two levels of a domain-style name. Names such as "java", "javax", "net.xxxx", "org.xxxx", "com.xxxx" usually are not real Java packages that contain classes. In the Java world, classes are usually located in subpackages of the domain-name package, as in our system (net.algart.stare.xxx etc.).

Or, please, just provide a way to control this via JepConfig. I understand that I can override setupJavaImportHook, but that does not seem to be a simple and obvious way of resolving a typical problem. (It really is typical for almost any complex project combining Java and Python: we want to provide unique namespaces in both languages.)

bsteffensmeier commented 2 years ago

A particular import from Python can only resolve to one package. With Jep you can choose whether this should be a Python package or a Java package by supplying a ClassEnquirer in your JepConfig. The ClassEnquirer is used by Jep to decide which Java packages should be imported into Python. If you do not want the "net" package to be imported into Python from Java, simply provide a ClassEnquirer which returns false whenever isJavaPackage() is called with "net" as an argument. Python will then search in other locations for an alternative implementation of the "net" package.
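As a rough illustration of that suggestion, the sketch below wraps an existing enquirer and reports selected top-level names as non-Java. To stay self-contained it declares a stand-in interface, `PackageEnquirer`, with only the `isJavaPackage` method; in a real project the class would implement `jep.ClassEnquirer` and be supplied through JepConfig. The names `PythonFirstDemo`, `PackageEnquirer` and `PythonFirstEnquirer` are hypothetical, and treating everything beneath a reserved name as non-Java is an assumption, not something this thread confirms is required.

```java
import java.util.Set;

class PythonFirstDemo {

    // Stand-in for the isJavaPackage method of jep.ClassEnquirer (simplified assumption).
    interface PackageEnquirer {
        boolean isJavaPackage(String name);
    }

    // Reports the reserved top-level names, and everything below them, as non-Java.
    static final class PythonFirstEnquirer implements PackageEnquirer {
        private final PackageEnquirer delegate;
        private final Set<String> pythonTopLevels;

        PythonFirstEnquirer(PackageEnquirer delegate, Set<String> pythonTopLevels) {
            this.delegate = delegate;
            this.pythonTopLevels = Set.copyOf(pythonTopLevels);
        }

        @Override
        public boolean isJavaPackage(String name) {
            int dot = name.indexOf('.');
            String topLevel = dot < 0 ? name : name.substring(0, dot);
            if (pythonTopLevels.contains(topLevel)) {
                return false; // leave this name to Python's own import machinery
            }
            return delegate.isJavaPackage(name);
        }
    }

    public static void main(String[] args) {
        // Hypothetical delegate that claims a few packages, much as a classpath scan might.
        PackageEnquirer scanned = Set.of("net", "net.algart", "java", "java.util")::contains;
        PackageEnquirer enquirer = new PythonFirstEnquirer(scanned, Set.of("net"));

        System.out.println("net        -> " + enquirer.isJavaPackage("net"));
        System.out.println("net.algart -> " + enquirer.isJavaPackage("net.algart"));
        System.out.println("java.util  -> " + enquirer.isJavaPackage("java.util"));
    }
}
```

With "net" reserved for Python, the first two lookups report false and only java.util is still treated as a Java package.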

While there are occasional conflicts between Java and Python package names, they aren't very common because the Python naming convention advises against using long Java-style package names. You should consider renaming your Python module to follow the recommendations of PEP-423.

Daniel-Alievsky commented 2 years ago

OK, I understand that Java-like package naming is not a tradition in the Python world. It is also discussed here: https://stackoverflow.com/questions/2713874/python-package-name-conventions

However, I think that something is wrong in your heuristic algorithm or in the basic idea of Java importing.
A) In my test, "net", "net.algart" and "net.algart.stare" are existing Java packages, and isJavaPackage returns true. They are imported normally!
B) "net.algart.stare.api" is NOT an existing Java package, and isJavaPackage returns false. That is also correct. But what prevents importing the EXISTING Python package instead?

Maybe it is because you interpret Java packages in the Python style? In Python, a package is analogous to a module: it is a real entity that can be imported or not. In Java, a package is just a way of organizing classes. It makes no sense to "import" the java.util package; we can only import classes from java.util.

My repository does not contain any classes inside "net", "net.algart" and "net.algart.stare", and it does not contain a "net.algart.stare.api" package at all. So why is it prohibited to declare a Python package like "net.algart.stare.api" or (a more suitable solution) "net.algart.python.api"? Is it possible to rework your ClassList to allow such situations?

It seems that the only safe solution for now (apart from special tricks with a custom ClassEnquirer) is to place all Python packages under a root package that cannot appear as a root Java package. For example, I used the root package "stare", and everything works well. But is it a good, stable solution for the future? New root domains appear from time to time, like ".info", ".biz", even ".cafe". What guarantees that someone will not give a root Python package one of these names?

leojava2001 commented 2 years ago

I also encounter a problem importing some modules in x.y format. For example, I can import pyarmor.project without a problem, but importing usb.core fails: jep.JepException: <class 'ModuleNotFoundError'>: No module named 'usb.core'. Both pyarmor and usb are in the site-packages directory that is added to sys.path. The import works fine in the Windows Python console but does not work with JEP (I am currently using it in NetBeans). This is a show stopper because some external modules need to import usb.core and other similar modules. I checked that there is no java package usb.core (using ClassList). Where exactly in JEP the search for the module is done?
I wonder if there is a way to fix this behavior...

bsteffensmeier commented 2 years ago

I checked that there is no java package usb.core (using ClassList)

Is there any package in Java that starts with usb.? If there is any package that starts with usb, then the Java import hook will be responsible for all subpackages under usb and cannot mix Java and Python packages. You can disable Java imports of usb by implementing a custom ClassEnquirer (or extending ClassList).
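The rule described above can be illustrated with a small, self-contained model. It is only a sketch of the behaviour reported in this thread, not Jep's actual implementation: the class `ImportClaimModel`, its methods and the sample class name `usb.driver.UsbDevice` are hypothetical. It assumes that every prefix of a Java class's package counts as a Java package and that, once Java claims a top-level name, dotted names beneath it are never looked up among Python modules.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of the behaviour discussed in this thread; NOT Jep's real code.
class ImportClaimModel {

    enum Resolution { JAVA_PACKAGE, PYTHON_MODULE, MODULE_NOT_FOUND }

    private final Set<String> javaPackages = new HashSet<>();
    private final Set<String> pythonModules;

    ImportClaimModel(List<String> javaClassNames, Set<String> pythonModules) {
        this.pythonModules = pythonModules;
        for (String className : javaClassNames) {
            // Drop the simple class name, then register the package and each parent prefix.
            int dot = className.lastIndexOf('.');
            while (dot > 0) {
                String pkg = className.substring(0, dot);
                javaPackages.add(pkg);
                dot = pkg.lastIndexOf('.');
            }
        }
    }

    boolean isJavaPackage(String name) {
        return javaPackages.contains(name);
    }

    Resolution resolve(String moduleName) {
        int dot = moduleName.indexOf('.');
        String topLevel = dot < 0 ? moduleName : moduleName.substring(0, dot);
        if (isJavaPackage(topLevel)) {
            // Java owns the whole subtree: Python modules below it are not consulted.
            return isJavaPackage(moduleName)
                    ? Resolution.JAVA_PACKAGE
                    : Resolution.MODULE_NOT_FOUND;
        }
        return pythonModules.contains(moduleName)
                ? Resolution.PYTHON_MODULE
                : Resolution.MODULE_NOT_FOUND;
    }

    public static void main(String[] args) {
        ImportClaimModel model = new ImportClaimModel(
                List.of("usb.driver.UsbDevice"), // hypothetical class found in some jar
                Set.of("usb", "usb.core", "pyarmor", "pyarmor.project"));
        for (String name : List.of("pyarmor.project", "usb", "usb.core")) {
            System.out.println(name + " -> " + model.resolve(name));
        }
    }
}
```

In this model pyarmor.project resolves to PYTHON_MODULE and usb to JAVA_PACKAGE, while usb.core ends up as MODULE_NOT_FOUND even though a Python module of that name exists.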

Where exactly in JEP the search for the module is done?

All of the Jep logic related to imports is handled by the Java import hook, which runs most calls through the ClassEnquirer (ClassList by default). If you aren't seeing any usb.-related imports in the ClassList, then the import is handled by the built-in Python import mechanisms, and you should start looking at why Python can't find it (are you loading the correct Python install with the correct path?).

leojava2001 commented 2 years ago

Where exactly in JEP the search for the module is done?

There is no usb package in Java, and I have no problem importing usb in Jep. It is only when I start using usb.core that I see the exception. I have no problem importing usb.core from the Python console directly.
I did check using ClassList: ClassList.getInstance().isJavaPackage("usb") returns true; ClassList.getInstance().isJavaPackage("usb.core") returns false.

So, are you saying that I need to write an extension of the ClassList?

Here is a code snippet


    SharedInterpreter m_interp = new SharedInterpreter();
    m_interp.exec("import sys");
    m_interp.exec("sys.path.insert(0, 'C:/Python38/Lib/site-packages')");
    m_interp.exec("import usb"); // no problem
    m_interp.exec("from usb import *"); // no problem
    // problem here
    m_interp.exec("import usb.core");


bsteffensmeier commented 2 years ago

In my jep installation I cannot import usb and ClassList does not indicate it is a java package. There is something particular about your environment that is causing the ClassList to decide you have a java package named usb. If there is no usb package in Java then it would be helpful if you can debug the ClassList creation to determine why it considers the usb package as a Java package. We may need to put in a fix if it is finding java packages that aren't there.

So, are you saying that I need to write an extension of the ClassList?

Yes, if you override isJavaPackage() so that it returns false when given usb, then the usb package will be handled by Python, which should fix your problem. You can pass in a custom ClassEnquirer as part of your JepConfig.

leojava2001 commented 2 years ago

Yes, clear. There is a usb package in one of the jars... However, I don't think one can extend ClassList - it is a Singleton i.e. constructor is private. It is easier to just copy ClassList and write a new implementation of the ClassEnquirer... Thank you.

Daniel-Alievsky commented 2 years ago

It seems that the cause of the problem is the same as in my case. In a large Java application nobody knows all the packages that may be used in one of thousands of modules added by Maven dependencies. So collisions with Python packages are likely. At the same time, with 99% probability we don't need to use these Java packages from Python.

As I understand it, the root of the problem is the different meaning of "subpackage" in Python and Java. In Java, there is no connection between a package and its subpackage: the packages net.algart and net.algart.stare have nothing in common. I may declare and use "net.algart" in one of my modules (JARs) and "net.algart.stare" in another, and they will work together. In Python, if I declare "net.algart" in one of my roots (paths), I cannot use "net.algart.stare" from a second path: Python will use only the subpackages of "net.algart" from the first path.

Maybe you could provide some additional features in ClassList, to avoid the need to create our own ClassList? For example, it seems a good idea simply to ignore all top-level packages (like "org.xxxx") in your hook. Then, if someone uses the Java module "org.xxxx.yyyy.zzzz.SomeClasss", JEP will not block the use of Python modules "org.xxxx" or "org.xxxx.tttt". Am I right?

Yes, in this case Python will not be able to use classes placed directly in a top-level package (org.xxxx.MyClass), but that is a very rare case, apart from some standard Java packages. And, in any case, you can add a boolean flag that disables this behavior.

leojava2001 commented 2 years ago

I think in many cases, JEP is used in one direction Java-> python or Python-> Java - not both in the same program. A nice and simple option would be to add configuration setting indicating that only Java-> Python is used. This will allow to bypass any checks for Java classes.

Daniel-Alievsky commented 2 years ago

I think in many cases, JEP is used in one direction Java-> python or Python-> Java - not both in the same program.

Allow me to disagree. If we decide to use Python from Java, it means that we have some user environment, created in Java, with the ability to execute functions, solve tasks etc. It usually involves some interaction with the executed functions, such as the ability to stop long calculations, show a progress bar, report exceptions, get the working directories where results should be stored, etc. If we call a Python function for some isolated task, it may require such abilities, and then it will need access to some Java classes. Moreover, it is quite possible that a complex Python module will use other functions that already exist in your system but are written in Java. If you have no such need at the very beginning, it does not mean that it will not be needed in the future.

If the main program is written in Python, the situation is analogous: we will probably need to access some elements of the Python environment from Java.

bsteffensmeier commented 2 years ago

However, I don't think one can extend ClassList - it is a Singleton i.e. constructor is private

Sorry about that, it has been a while since I looked at it and I didn't realize it is private. I think we are hesitant to make it public because then we need to ensure backwards compatibility for subclasses, and because all the lookups it does during construction can be time consuming, so we don't necessarily want to do it often. Copying ClassList should work fine; however, it may be easier to maintain if you provide a custom wrapper around ClassList that alters the functionality to suit you. Something like this:

import jep.ClassEnquirer;
import jep.ClassList;

// Delegates to the default ClassList but reports "usb" as a non-Java package.
public class LimitedEnquirer implements ClassEnquirer {

    private final ClassEnquirer delegate = ClassList.getInstance();

    @Override
    public boolean isJavaPackage(String name) {
        if ("usb".equals(name)) {
            // Let Python's own import machinery handle "usb".
            return false;
        }
        return delegate.isJavaPackage(name);
    }

    @Override
    public String[] getClassNames(String pkgName) {
        return delegate.getClassNames(pkgName);
    }

    @Override
    public String[] getSubPackages(String pkgName) {
        return delegate.getSubPackages(pkgName);
    }
}

I think in many cases, JEP is used in one direction Java-> python or Python-> Java - not both in the same program. A nice and simple option would be to add configuration setting indicating that only Java-> Python is used. This will allow to bypass any checks for Java classes.

That is an interesting idea. You should be able to achieve that with an empty ClassEnquirer: you can use the NamingConventionClassEnquirer if you choose not to include the default packages and never add any top-level package names.
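For illustration only, an "empty" enquirer could look like the sketch below. It uses a stand-in interface, `Enquirer`, that mirrors the three methods of the wrapper shown earlier in this thread, so that it compiles without Jep on the classpath; a real version would implement `jep.ClassEnquirer` instead. The names `NoJavaImportsDemo`, `Enquirer` and `EmptyEnquirer` are hypothetical, and returning empty arrays (rather than some other "nothing available" value) is an assumption.

```java
class NoJavaImportsDemo {

    // Stand-in mirroring the three ClassEnquirer methods used in this thread.
    interface Enquirer {
        boolean isJavaPackage(String name);
        String[] getClassNames(String pkgName);
        String[] getSubPackages(String pkgName);
    }

    // Claims no Java packages at all, so every import would be left to Python.
    static final class EmptyEnquirer implements Enquirer {
        @Override
        public boolean isJavaPackage(String name) {
            return false;
        }

        @Override
        public String[] getClassNames(String pkgName) {
            return new String[0];
        }

        @Override
        public String[] getSubPackages(String pkgName) {
            return new String[0];
        }
    }

    public static void main(String[] args) {
        Enquirer enquirer = new EmptyEnquirer();
        for (String name : new String[] {"java.util", "net", "usb"}) {
            System.out.println(name + " claimed by Java: " + enquirer.isJavaPackage(name));
        }
    }
}
```

The trade-off is the one raised in the discussion: with such an enquirer, Python code running inside the interpreter could no longer import Java classes by package name.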

If we added an option to JepConfig to skip the Java imports, we could completely skip setting up the Java importer, which would save the calls into Java for imports. I also think it would be an easy way for people to troubleshoot these types of problems. Since a null ClassEnquirer in JepConfig is already documented as using ClassList, I think we should add a new method to disable it. Maybe unsetClassEnquirer(), removeClassEnquirer() or disableJavaImports(). @ndjensen Do you see any problems with an option in JepConfig to disable the Java import hook? If you don't see any problems, do you have any opinions on how to clearly name it?