Closed — NeolithEra closed this issue 5 years ago
Two possible solutions:

1. Pin your direct dependencies to py4j==0.10.7 and pyspark==2.4.3 to remove this conflict. I have checked that this change will not affect your downstream projects.
2. Ask your upstream project pyspark to loosen its version range of py4j to >=0.10.7.

Which solution do you prefer, 1 or 2? Please let me know your choice, and then I can submit a PR to fix it.
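For reference, solution 1 would amount to pinning both packages in the project's requirements file (a sketch; the actual file name and layout in PyPMML-Spark may differ):

```
# requirements.txt (sketch): pin both packages so pip cannot pick
# a py4j release that conflicts with pyspark 2.4.x
py4j==0.10.7
pyspark==2.4.3
```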
@NeolithEra thanks for your feedback. Solution 1 makes the dependencies too strict, and for solution 2, I'm not clear about the fix; could you explain it in more detail?
Based on the pyspark docs, the 2.4.x versions depend on exactly py4j==0.10.7; perhaps they have not been tested against py4j 0.10.8, it is not supported, or there is some other reason.
Actually, it would be better to let PySpark take over the dependency on py4j, that is, to remove py4j from the dependencies of PyPMML-Spark itself. Do you have any comments on this solution?
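Assuming PyPMML-Spark declares its dependencies via setuptools' `install_requires` (the actual packaging layout may differ), removing py4j and letting pyspark's own pin decide the version would look roughly like this sketch:

```python
# setup.py (sketch): py4j is no longer listed as a direct dependency;
# pyspark's own requirement (py4j==0.10.7 for the 2.4.x line) decides
# which py4j version gets installed transitively.
from setuptools import setup

setup(
    name="pypmml-spark",
    install_requires=[
        "pypmml",      # hypothetical: other direct dependencies stay as-is
        # "py4j",      # removed: inherited transitively from pyspark
    ],
)
```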
@scorebot I agree with your solution. Letting PySpark take over the py4j dependency is a good choice! I have submitted a PR following your suggestion.
I'm closing this issue as the referred PR has been merged.
Hi, as shown in the following full dependency graph of pypmml-spark, pypmml-spark requires py4j (the latest version), while the installed version of pyspark (2.4.3) requires py4j 0.10.7.
According to Pip's “first found wins” installation strategy, py4j 0.10.7 is the actually installed version.
Although the first-found package version, py4j 0.10.7, happens to satisfy the later dependency constraint (py4j==0.10.7), it will lead to a build failure once developers release a newer version of pyspark.
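The "first found wins" behavior above refers to pip's legacy resolver, where the first version requirement encountered for a package determines what gets installed, and later constraints are merely checked rather than re-resolved. A minimal sketch of that idea (the function name and data shapes here are illustrative, not pip's actual API):

```python
# Sketch of pip's legacy "first found wins" resolution: the first
# requirement seen for each package decides the installed version.
def first_found_wins(requirements):
    """requirements: ordered (package, version) pairs as pip encounters them.

    Returns a dict mapping each package to the version that ends up
    installed, i.e. the first one seen for that package.
    """
    installed = {}
    for package, version in requirements:
        installed.setdefault(package, version)  # later entries are ignored
    return installed

# pyspark's pin is encountered first, so py4j stays at 0.10.7 even though
# pypmml-spark would accept the latest release.
deps = [("py4j", "0.10.7"), ("pyspark", "2.4.3"), ("py4j", "latest")]
print(first_found_wins(deps))  # py4j stays at 0.10.7
```

Note that pip's 2020 backtracking resolver replaced this behavior and now reports such conflicts instead of silently keeping the first version.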
Dependency tree
--------
Thanks for your attention. Best, Neolith