Closed by afalquina 3 years ago
I see nothing obviously wrong with the line. The thing is that the error trace shows it is not in JPype code but in the Python bootloader. So my guess is that you have something interfering with the loading process (such as a directory or module named "org" in the Python path).
I would proceed by using JClass to perform the class load instead of the import. If it works, then the issue is something in the Python loading system. I would start that debugging by just importing "org" and seeing if it has a "__file__" attribute so that I can see where it is coming from. Repeat the process for org.apache and so forth. You can also add a few "print" statements to jpype/imports.py to figure out the difference in the path that was taken up to that import statement.
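One way to rule out a stray "org" directory or module on the Python path, before the JVM is even started, is to ask Python's regular path-based finder directly. This is only a sketch using the standard library; the helper name find_on_python_path is made up for illustration and is not part of JPype.

```python
import importlib.machinery


def find_on_python_path(name):
    """Return where a top-level module `name` would come from on sys.path, or None.

    Hypothetical helper for illustration; it is not part of JPype.
    """
    # PathFinder only consults sys.path, so JPype's import hook is not involved.
    spec = importlib.machinery.PathFinder.find_spec(name)
    if spec is None:
        return None
    # Regular modules and packages report the file they were found in.
    if spec.origin is not None:
        return spec.origin
    # Namespace packages (a bare directory without __init__.py) have no origin,
    # so report the directories that make them up instead.
    return list(spec.submodule_search_locations or [])


# None means nothing named "org" is shadowing the Java package on sys.path.
print(find_on_python_path("org"))
```

If this prints a file or directory, that location is a likely candidate for what is interfering with the import.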
I'll try that. What baffles me, though, is that it works with OpenJDK 8. Just changing to OpenJDK 14 triggers the error.
OK. I have tried the following. First with OpenJDK 14:
$ python
Python 3.8.2 (default, Jul 16 2020, 14:00:26)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import jpype
>>> import jpype.imports
>>> jpype.startJVM(jpype.getDefaultJVMPath())
>>> import org
>>> dir(org)
['apache', 'graalvm', 'ietf', 'jcp', 'w3c', 'xml']
>>> import org.apache
>>> dir(org.apache)
['lucene']
>>> import org.apache.lucene
>>> dir(org.apache.lucene)
['analysis', 'codecs', 'document', 'index', 'search', 'store', 'util']
>>> import org.apache.lucene.search
>>> dir(org.apache.lucene.search)
['TopFieldCollector']
And then with OpenJDK 8:
$ python
Python 3.8.2 (default, Jul 16 2020, 14:00:26)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import jpype
>>> import jpype.imports
>>> jpype.startJVM(jpype.getDefaultJVMPath())
>>> import org
>>> dir(org)
['apache', 'classpath', 'ietf', 'jcp', 'omg', 'w3c', 'xml']
>>> import org.apache
>>> dir(org.apache)
['lucene']
>>> import org.apache.lucene
>>> dir(org.apache.lucene)
['LucenePackage', 'analysis', 'codecs', 'document', 'geo', 'index', 'search', 'store', 'util']
>>> import org.apache.lucene.search
>>> dir(org.apache.lucene.search)
['AutomatonQuery', 'BlendedTermQuery', 'BlockMaxDISI', 'BooleanClause', 'BooleanQuery', 'BoostAttribute', 'BoostAttributeImpl', 'BoostQuery', 'BulkScorer', 'CachingCollector', 'CollectionStatistics', 'CollectionTerminatedException', 'Collector', 'CollectorManager', 'ConjunctionDISI', 'ConstantScoreQuery', 'ConstantScoreScorer', 'ConstantScoreWeight', 'ControlledRealTimeReopenThread', 'DisiPriorityQueue', 'DisiWrapper', 'DisjunctionDISIApproximation', 'DisjunctionMaxQuery', 'DocIdSet', 'DocIdSetIterator', 'DocValuesFieldExistsQuery', 'DocValuesRewriteMethod', 'DoubleValues', 'DoubleValuesSource', 'Explanation', 'FieldComparator', 'FieldComparatorSource', 'FieldDoc', 'FieldValueHitQueue', 'FilterCollector', 'FilterLeafCollector', 'FilterMatchesIterator', 'FilterScorable', 'FilterScorer', 'FilterWeight', 'FilteredDocIdSetIterator', 'FuzzyQuery', 'FuzzyTermsEnum', 'ImpactsDISI', 'IndexOrDocValuesQuery', 'IndexSearcher', 'LRUQueryCache', 'LeafCollector', 'LeafFieldComparator', 'LeafSimScorer', 'LiveFieldValues', 'LongValues', 'LongValuesSource', 'MatchAllDocsQuery', 'MatchNoDocsQuery', 'Matches', 'MatchesIterator', 'MatchesUtils', 'MaxNonCompetitiveBoostAttribute', 'MaxNonCompetitiveBoostAttributeImpl', 'MultiCollector', 'MultiCollectorManager', 'MultiPhraseQuery', 'MultiTermQuery', 'NGramPhraseQuery', 'NamedMatches', 'NormsFieldExistsQuery', 'PhraseQuery', 'PointInSetQuery', 'PointRangeQuery', 'PositiveScoresOnlyCollector', 'PrefixQuery', 'Query', 'QueryCache', 'QueryCachingPolicy', 'QueryRescorer', 'QueryVisitor', 'ReferenceManager', 'RegexpQuery', 'Rescorer', 'Scorable', 'ScoreCachingWrappingScorer', 'ScoreDoc', 'ScoreMode', 'Scorer', 'ScorerSupplier', 'ScoringRewrite', 'SearcherFactory', 'SearcherLifetimeManager', 'SearcherManager', 'SegmentCacheable', 'SimpleCollector', 'SimpleFieldComparator', 'Sort', 'SortField', 'SortRescorer', 'SortedNumericSelector', 'SortedNumericSortField', 'SortedSetSelector', 'SortedSetSortField', 'SynonymQuery', 'TermInSetQuery', 
'TermQuery', 'TermRangeQuery', 'TermStatistics', 'TimeLimitingCollector', 'TopDocs', 'TopDocsCollector', 'TopFieldCollector', 'TopFieldDocs', 'TopScoreDocCollector', 'TopTermsRewrite', 'TotalHitCountCollector', 'TotalHits', 'TwoPhaseIterator', 'UsageTrackingQueryCachingPolicy', 'Weight', 'WildcardQuery', 'similarities', 'spans']
For some reason, the import finds fewer entries on the newer JVM.
What can I do to investigate this further?
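To make the difference between the two sessions explicit, the dir() results can be compared as sets. The lists below are copied from the transcripts above; the function name compare_listings is just an illustrative choice.

```python
def compare_listings(old, new):
    """Return (names only in `old`, names only in `new`), each sorted."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


# dir(org) as reported above
jdk8_org = ['apache', 'classpath', 'ietf', 'jcp', 'omg', 'w3c', 'xml']
jdk14_org = ['apache', 'graalvm', 'ietf', 'jcp', 'w3c', 'xml']

# dir(org.apache.lucene) as reported above
jdk8_lucene = ['LucenePackage', 'analysis', 'codecs', 'document', 'geo',
               'index', 'search', 'store', 'util']
jdk14_lucene = ['analysis', 'codecs', 'document', 'index', 'search',
                'store', 'util']

print(compare_listings(jdk8_org, jdk14_org))
# (['classpath', 'omg'], ['graalvm'])
print(compare_listings(jdk8_lucene, jdk14_lucene))
# (['LucenePackage', 'geo'], [])
```

The differences under org come from packages shipped with the JDK itself, while the entries missing under org.apache.lucene come from the same jar on both JVMs.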
That gives me a very good start. Is the org.apache.lucene jar file publicly available (would I be able to replicate this myself)? The problem is likely in org.jpype.pkg.PackageManager, which is responsible for getting the list of packages. It was tested on OpenJDK from 8 to 11 and has had no issues, but if there was a change in Java or if something is going wrong in the code (an exception or the like), then I could see the behavior you describe happening.
The next step for you would be to see if you can load using JClass instead. If you can't do that, then the problem could be a class initializer problem rather than the import system. So knowing which side of the equation to look on will help.
I have one other idea. JPype only declares something as viewable if it is a public class, and it uses the byte code to figure that out. If there is a change in the byte code that my routine does not handle, I could see a failure. In the infinite wisdom of the original Jar format, you have to parse through 100 fields to get the public flag.
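For readers wondering why the public flag is awkward to reach: in a class file the access flags sit after the variable-length constant pool, so every constant has to be skipped first. The following is a minimal Python sketch of that walk based on the class file layout in the JVM specification. It is not JPype's actual routine, and the function names are made up for illustration.

```python
import struct

ACC_PUBLIC = 0x0001

# Payload size in bytes after each fixed-size constant-pool tag.
_FIXED_SIZES = {
    3: 4, 4: 4,            # Integer, Float
    5: 8, 6: 8,            # Long, Double (each occupies two pool slots)
    7: 2, 8: 2,            # Class, String
    9: 4, 10: 4, 11: 4,    # Fieldref, Methodref, InterfaceMethodref
    12: 4,                 # NameAndType
    15: 3, 16: 2,          # MethodHandle, MethodType
    17: 4, 18: 4,          # Dynamic, InvokeDynamic
    19: 2, 20: 2,          # Module, Package
}


def class_access_flags(data):
    """Return the access_flags field of a class file given as bytes."""
    magic, _minor, _major, cp_count = struct.unpack_from(">IHHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file")
    pos = 10      # first byte after the header fields unpacked above
    index = 1     # constant-pool indices start at 1
    while index < cp_count:
        tag = data[pos]
        pos += 1
        if tag == 1:  # Utf8: u2 length followed by that many bytes
            (length,) = struct.unpack_from(">H", data, pos)
            pos += 2 + length
        elif tag in _FIXED_SIZES:
            pos += _FIXED_SIZES[tag]
        else:
            raise ValueError("unknown constant-pool tag %d" % tag)
        index += 2 if tag in (5, 6) else 1
    (flags,) = struct.unpack_from(">H", data, pos)
    return flags


def is_public_class(data):
    """True if the class file declares a public class."""
    return bool(class_access_flags(data) & ACC_PUBLIC)
```

An unknown tag raises an error here, which is the kind of failure a byte code format change could trigger in a parser like this.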
The jar is available here. The file contains several jars. You'll need lucene-core-8.6.0.jar.
Is this what you meant when you said “use JClass”?
$ python
Python 3.8.2 (default, Jul 16 2020, 14:00:26)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import jpype
>>> print(jpype.getDefaultJVMPath())
/usr/lib/jvm/java-14-openjdk-amd64/lib/server/libjvm.so
>>> jpype.startJVM(jpype.getDefaultJVMPath())
>>> BooleanClause = jpype.JClass("org.apache.lucene.search.BooleanClause")
>>> BooleanClause
<java class 'org.apache.lucene.search.BooleanClause'>
I am using the same jar on both JVM 8 and JVM 11/14, so I guess that the byte code is always the same. Can the byte code API have changed between JVMs?
BooleanClause = jpype.JClass('org.apache.lucene.search.BooleanClause')
Some jars carry different byte code per JVM version if the developers want to offer additional features on later versions, but that is pretty rare.
Well, the JClass code works on both JVM 8 and JVM 14. At least it does not throw any exceptions…
Still waiting on my development machine. I have not forgotten.
Thanks! Is there anything I can do on my side?
I believe another post on Stack Overflow found that this has something to do with the length of the import, so something must be chopping the import string. I will try to run this to ground when I can replicate it.
Thanks for the update. As always, is there anything I can do to help?
I investigated this bug. It doesn't seem very satisfying. The jar file requested is a multi-release jar (MRJAR) with both Java 8 and Java 9 layers.
Unfortunately, when I request the directory on Java 9, it only gives me the contents of the Java 9 layer and not the Java 8 layer. As the class specified only exists in the Java 8 layer, the requested class is missing. JPype then tries to throw an exception by calling Java's forName. But when it does so, rather than getting an error, Java gives it the class from the Java 8 layer. This causes the import system to panic, resulting in the incorrect error report.
The bug is not really in JPype, as it is calling getResources just as it should to get a directory of the contents. It is the JVM implementation that is incorrectly giving me empty contents. This is similar to the issue with an obfuscated jar where the directories were missing entirely.
So how do we go about addressing this issue? At the time we find the class it is already too late, as we were given a chance to produce a member before find_spec was called. Thus the only way to resolve it would be to try a forName when we do a get property and see if that resolves. Unfortunately, that will only work if the package structure is only one level deep.
Oddly, when I search for "/org/apache" it does the right thing and returns two directories. So I need to investigate further.
777 Tue Jul 07 12:46:30 PDT 2020 org/apache/lucene/search/spans/SpanWeight$TermMatch.class
8993 Tue Jul 07 12:46:30 PDT 2020 org/apache/lucene/search/spans/SpanWeight.class
2391 Tue Jul 07 12:46:30 PDT 2020 org/apache/lucene/search/spans/SpanWithinQuery$SpanWithinWeight$1.class
3261 Tue Jul 07 12:46:30 PDT 2020 org/apache/lucene/search/spans/SpanWithinQuery$SpanWithinWeight.class
3102 Tue Jul 07 12:46:30 PDT 2020 org/apache/lucene/search/spans/SpanWithinQuery.class
1851 Tue Jul 07 12:46:30 PDT 2020 org/apache/lucene/search/spans/Spans.class
4004 Tue Jul 07 12:46:30 PDT 2020 org/apache/lucene/search/spans/TermSpans.class
136 Tue Jul 07 12:46:30 PDT 2020 org/apache/lucene/search/spans/package-info.class
0 Tue Jul 07 12:46:32 PDT 2020 META-INF/versions/9/org/apache/lucene/search/
1455 Tue Jul 07 12:46:32 PDT 2020 META-INF/versions/9/org/apache/lucene/search/BooleanScorer$TailPriorityQueue.class
3432 Tue Jul 07 12:46:32 PDT 2020 META-INF/versions/9/org/apache/lucene/search/PointInSetQuery$SinglePointVisitor.class
6931 Tue Jul 07 12:46:32 PDT 2020 META-INF/versions/9/org/apache/lucene/search/PointRangeQuery$1.class
14775 Tue Jul 07 12:46:32 PDT 2020 META-INF/versions/9/org/apache/lucene/search/TopFieldCollector.class
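The listing shows why a directory lookup that returns only the META-INF/versions/9 entry sees just TopFieldCollector. As an illustration of the idea of merging the layers (not the actual fix that went into JPype), the sketch below lists the top-level classes of a package by combining the base layer with every versioned layer that applies to the running Java version. The function name and parameters are hypothetical, and the check for the Multi-Release manifest attribute is omitted for brevity.

```python
import zipfile

VERSIONS_PREFIX = "META-INF/versions/"


def list_package_classes(jar, package, java_major):
    """Sorted top-level class names in `package`, merged across MRJAR layers.

    `jar` is a path or file-like object; `java_major` is e.g. 8 or 14.
    """
    base_dir = package.replace(".", "/") + "/"
    names = set()
    with zipfile.ZipFile(jar) as zf:
        for entry in zf.namelist():
            path = entry
            if entry.startswith(VERSIONS_PREFIX):
                # Entry looks like META-INF/versions/<N>/<normal path>.
                version, _, rest = entry[len(VERSIONS_PREFIX):].partition("/")
                # Layers newer than the running JVM are not visible to it.
                if not version.isdigit() or int(version) > java_major:
                    continue
                path = rest
            if not path.startswith(base_dir) or not path.endswith(".class"):
                continue
            simple = path[len(base_dir):-len(".class")]
            # Skip subpackages, nested/anonymous classes, and package-info.
            if "/" in simple or "$" in simple or simple == "package-info":
                continue
            names.add(simple)
    return sorted(names)
```

With a jar laid out like the listing above, this returns both the classes that exist only in the base layer and those overridden in the Java 9 layer, regardless of which layer a single directory entry points at.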
Okay, I believe I found a workaround that will fix this behavior on versions going forward. The bug was absolutely obnoxious, as there is nothing that would indicate that MRJAR files would do something like this. I looked into this several times, but reading the code and docs gave me no clues. Your example eventually led me to unpack the jar file, which showed me that the directory entries are being misreported by Java.
Thanks again for the bug report and sorry it took so long to find a resolution.
Thank you!
I am using Lucene 8.6.0 on a project of mine. I am using Python 3.8.2 on Pop! OS (Ubuntu) and Python 3.8.5 on RHEL 7.8.
The following code fails on OpenJDK 11 and OpenJDK 14 but works just fine on OpenJDK 8:
Am I doing something wrong?