jpolchlo closed 3 years ago
@jpolchlo The `.whl` file is named based on the contents of `version.py`.
Also, I'd encourage some more digging in your target directory `pyrasterframes/target/python/dist`. I've sometimes seen a subtly duplicated message from sbt that points to a `.whl` file that is not pip-installable. Example:
$ sbt ";clean;package"
....
[info] Python .whl file written to '/Users/jbrown/src/raster-frames/pyrasterframes/target/python/dist/pyrasterframes-0.8.6.dev0-py3-none-any.whl'
[info] Maven Python artifact written to '/Users/jbrown/src/raster-frames/pyrasterframes/target/scala-2.11/pyrasterframes-0.8.6-SNAPSHOT-py3-none-any.whl'
[success] Total time: 133 s (02:13), completed Feb 11, 2020 11:09:58 AM
In this case the first message points to a well-formed, pip-installable `.whl`; note that its target directory is inside the pyrasterframes package under the `python` language dir. The second message is, I think, also a `.whl`, but because it lives in the `scala` language dir it uses the Java-style version name...
OK, that's fine. But in that case, be aware that `pySparkCmd` reports the wrong `.whl`:
PYTHONSTARTUP=/tmp/sbt_eb091e00/pyrf_init.py pyspark --jars /home/jpolchlopek/work/rasterframes/pyrasterframes/target/scala-2.11/pyrasterframes-assembly-0.8.5-SNAPSHOT.jar --py-files /home/jpolchlopek/work/rasterframes/pyrasterframes/target/scala-2.11/pyrasterframes-0.8.5-SNAPSHOT-py3-none-any.whl
That's where I pulled the artifact name for installation (which, puzzlingly, works just fine on my local machine, but not on EMR).
Interesting, and that is an issue. I think we have two things to fix:
1) Update the sbt `pySparkCmd` to emit the name of the `.whl` in the `pyrasterframes/target/python` dir.
2) Possibly update the sbt `package` messages to clarify that the Python `.whl` is suitable for `pip install`, and to clarify the purpose of the "Maven Python artifact". I would actually prefer to omit the logging about the Maven artifact, because I don't know what it is for.
As far as EMR goes, it may be down to the pip version being used? What does `pip --version` report?
I agree. It's confusing to publish two wheel artifacts at all, especially when one is prone to failure. You have my :+1:!
[hadoop@ip-172-31-49-55 tmp]$ pip --version
pip 20.0.2 from /usr/local/lib/python3.6/site-packages/pip (python 3.6)
This might motivate me to finally fix the sbt/setuptools version synchronization problem. It's been a bugaboo for a while.
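(For illustration of what that synchronization might look like: Maven's `-SNAPSHOT` suffix has no meaning under PEP 440, but it maps naturally onto a `.dev0` pre-release segment, which is what the `python/dist` wheel in the log above already uses. This is my own sketch, not code from the rasterframes build; `pep440_version` is a hypothetical helper.)

```python
import re

def pep440_version(java_version: str) -> str:
    """Map a Java/Maven-style version string to a PEP 440-compliant one.

    Maven's "-SNAPSHOT" suffix is meaningless to Python packaging tools,
    so translate it to a ".dev0" development-release segment, which both
    setuptools and pip understand. Versions without the suffix pass
    through unchanged.
    """
    return re.sub(r"-SNAPSHOT$", ".dev0", java_version)

print(pep440_version("0.8.6-SNAPSHOT"))  # -> 0.8.6.dev0
print(pep440_version("0.8.6"))           # -> 0.8.6
```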
@jpolchlo I've made an initial stab at addressing this here: https://github.com/locationtech/rasterframes/pull/480
However, I can't seem to get the output of the `pySparkCmd` bit to work with `--py-files`, and I have no idea why it's not working. No errors, just no `pyrasterframes` in `sys.path`.
On some platforms (I observed this on EMR), the naming scheme of snapshot wheels causes an error during `pip install`. After some digging, it appears that the `SNAPSHOT` tag is being recognized as part of the platform tag (`snapshot-py3`). We might want to consider not using a hyphen, even though this is fairly standard practice in Java land.
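To make the failure mode concrete: PEP 427 wheel filenames are hyphen-delimited as `{name}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl`, so an extra hyphen inside the version shifts the remaining fields. This is a naive sketch of the split to show the effect, not pip's actual parser:

```python
def parse_wheel_filename(filename: str) -> dict:
    """Naively split a wheel filename into its PEP 427 fields.

    Real installers are stricter, but the field-shifting effect of a
    stray hyphen in the version shows up the same way.
    """
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # name-version-python-abi-platform (no build tag)
        name, version, py, abi, plat = parts
        build = None
    else:
        # six fields: the third one is taken as the optional build tag
        name, version, build, py, abi, plat = parts[:6]
    return {"name": name, "version": version, "build": build,
            "python": py, "abi": abi, "platform": plat}

# The dev-suffixed wheel parses as intended (five fields, no build tag):
print(parse_wheel_filename("pyrasterframes-0.8.6.dev0-py3-none-any.whl"))
# The SNAPSHOT wheel has six hyphen-delimited fields, so "SNAPSHOT" is
# read as a build tag -- which PEP 427 requires to start with a digit:
print(parse_wheel_filename("pyrasterframes-0.8.5-SNAPSHOT-py3-none-any.whl"))
```

Depending on how a given installer splits the name, the stray `SNAPSHOT` segment lands in the build tag (invalid, since build tags must begin with a digit) or bleeds into the neighboring tags, so pip rejects or misreads the wheel either way; a PEP 440 version like `0.8.5.dev0` avoids the extra hyphen entirely.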