nchammas closed this issue 6 years ago.
@brkyvz - Looking at the contribution graph, it appears you are the key person on this project.
Can you chime in on the most practical way to deliver Python 3 support in Spark packages? Like I said, I can put in the time to help here, but I don't want to dive in without some indicator of support (even provisional support) from a committer.
It looks like #29, by @mariusvniekerk, will partially address this issue by preventing .pyc files from being pulled into the distribution.
So basically, a bunch of the older Spark packages have .pyc files in them. Newer ones are safe (provided, of course, that their Python parts are py2/3 compatible).
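For anyone wanting to check whether a particular published artifact is one of the older, affected ones, here is a quick sketch (not part of any official tooling; the artifact name below is just a placeholder) that lists the .pyc entries bundled in a package jar/zip:

```python
import zipfile

def list_pyc_entries(artifact_path):
    """Return any .pyc entries bundled inside a Spark package jar/zip."""
    with zipfile.ZipFile(artifact_path) as zf:
        return [name for name in zf.namelist() if name.endswith(".pyc")]

if __name__ == "__main__":
    # "some-package_2.11-0.1.0.jar" is a placeholder artifact name.
    entries = list_pyc_entries("some-package_2.11-0.1.0.jar")
    if entries:
        print("Bundled Python bytecode found:")
        for name in entries:
            print("  " + name)
    else:
        print("No .pyc files bundled.")
```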
@brkyvz - Is this still an issue, given your comments on #29? Where exactly do we stand today with regards to Python 3 support?
Pinging @brkyvz again for an authoritative word on Python 3 support in Spark packages. It looks like Spark packages now work on Python 3 (as of GraphFrames 0.3, at least), but I'm not sure if there was an official announcement to that effect.
cc @mengxr @thunterdb
So Spark packages have supported Python 3 for quite a while. It's just that many of the Python parts of those packages were not py2/3 compatible.
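To illustrate what "not py2/3 compatible" typically means in practice (an illustrative sketch, not taken from any specific package):

```python
# Python 2-only idioms that fail on Python 3 (shown as comments so this file
# still parses under either interpreter):
#
#   print "loading package"               # SyntaxError on Python 3
#   for k, v in mapping.iteritems(): ...  # AttributeError on Python 3
#
# The py2/3-compatible equivalents:
from __future__ import print_function

mapping = {"graphframes": "0.3"}
print("loading package")
for key, value in mapping.items():
    print(key, value)
```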
Closing this as I believe nothing needs to be done here. GraphFrames, the project that motivated me to file this issue, runs fine on Python 3.
I'm interested in helping resolve https://github.com/graphframes/graphframes/issues/85, which would make GraphFrames compatible with Python 3. It appears that resolving that issue requires some changes here so that packages can be built with Python 3 support.
Is anybody working on that already?
It looks like the compiled Python artifacts are being generated here using `python -m compileall`. If we want to support building Spark packages that work with both Python 2 and 3, we should perhaps be building and shipping wheels instead of compiled Python bytecode.

If I have understood the situation correctly and we currently can't build Spark packages that support Python 3, I would consider this a critical deficiency, since 1) Spark itself supports Python 3, and 2) Python 3 adoption is reaching a tipping point (at least in my circles) where most new Python projects are being written in Python 3.
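To make the contrast concrete, here is a rough sketch of the two approaches; the package path and build commands are my assumptions, not how this repo's build is actually wired up:

```python
# What the build seems to do today: compile package sources into .pyc files,
# which are tied to the interpreter that did the compiling.
import compileall
compileall.compile_dir("python/mypackage")  # "python/mypackage" is a made-up path

# What I'm suggesting instead: ship the .py sources in a wheel, e.g.
#   pip wheel --no-deps .
# or
#   python setup.py bdist_wheel --universal
# A universal (py2.py3) wheel carries plain sources that either interpreter can import.
```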
I am happy to take this on with some guidance from a maintainer, or help said maintainer do the work themselves. This issue is important to me and I am ready to make time to work on it.