autodeployai / pypmml-spark

Python PMML scoring library for PySpark as SparkML Transformer
Apache License 2.0
21 stars 2 forks source link

Potential dependency conflicts between pypmml-spark and py4j #2

Closed NeolithEra closed 5 years ago

NeolithEra commented 5 years ago

Hi, as shown in the following full dependency graph of pypmml-spark, pypmml-spark requires py4j (the latest version), while the installed version of pyspark(2.4.3) requires py4j 0.10.7.

According to Pip's “first found wins” installation strategy, py4j 0.10.7 is the actually installed version.

Although the first found package version py4j 0.10.7 just satisfies the later dependency constraint (py4j==0.10.7), it will lead to a build failure once developers release a newer version of pyspark.

Dependency tree--------

pypmml-spark<version range:>
| +-py4j<version range:>
| +-pyspark<version range:>=2.4.0>
| | +-py4j<version range:==0.10.7>

Thanks for your attention. Best, Neolith

NeolithEra commented 5 years ago

Solution

  1. Fix your direct dependencies to be py4j==0.10.7 and pyspark ==2.4.3, to remove this conflict. I have checked this revision will not affect your downstream projects now.

  2. Ask your upstream project pyspark to loose the version range of py4j to be >=0.10.7.

Which solution do you prefer, 1 or 2? Please let me know your choice. Then I can submit a PR to fix it.

scorebot commented 5 years ago

@NeolithEra thanks for your feedback. About solution 1, the dependencies are too strict, and for solution 2, I'm not clear about the fix, could you explain it detailed?

Based on the doc of pyspark, the versions 2.4.x depend on the exactly py4j==0.10.7, maybe have not tested on py4j 0.10.8, not supported, or some reasons.

Actually, it's better that PySpark takes over the dependency of py4j, that means removal py4j from dependencies of PyPMML-Spark itself. Do you have any comments about this solution?

NeolithEra commented 5 years ago

@scorebot I agree with your solution. Taking over the dependency of py4j is a good choice! I have submitted a PR following your suggestion.

scorebot commented 5 years ago

I close this issue as the referred PR has been merged.