conda-forge / pyspark-feedstock

A conda-smithy repository for pyspark.
BSD 3-Clause "New" or "Revised" License

Update recipe, noarch, re-render #16

Closed dbast closed 5 years ago

dbast commented 5 years ago

This PR uses conda-skeleton pypi pyspark --extra-specs pypandoc --version 2.3.2 --all-extras to fully update the recipe.

This also adds pyarrow as a runtime requirement. The integration of Apache Arrow is one of the great Spark 2.3 features for Python.

Noarch is now possible with the PyPI package. I also tried it with the existing direct download from the Apache mirror, but that did not work.


conda-forge-linter commented 5 years ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

dbast commented 5 years ago

Hm... why does CircleCI not get triggered to build this (as it does in my forked repo)?

dbast commented 5 years ago

Hah... successfully built by CircleCI :)

dbast commented 5 years ago

Tested this locally... the pyspark shell is working... compared the old tarball against the noarch one... looks good.

I would like to merge this soon. Any comments are welcome :)

dbast commented 5 years ago

Hm... this needs to be double-checked. The noarch checklist in https://github.com/conda-forge/pyspark-feedstock/pull/11 requires that the "scripts" argument in setup.py is not used, but pyspark's setup.py seems to use it heavily. That could be a problem on Windows. This is something to test; otherwise, just revert the noarch commit.
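
To illustrate the concern, here is a minimal sketch of the two ways a setup.py can install command-line tools. The package name, file paths, and entry point are hypothetical and not taken from pyspark's actual setup.py. The point is only that files listed under "scripts" are copied as-is, so a shell script stays a shell script on every platform, whereas "console_scripts" entry points are turned into launchers by the installer.

```python
# Hypothetical keyword arguments for setuptools.setup(); none of these names
# come from pyspark's real setup.py.

# Variant A: "scripts" copies the listed files verbatim at install time.
# A shell script such as bin/mytool only runs where a POSIX shell exists,
# so a Windows companion (bin/mytool.cmd) has to be shipped alongside it.
scripts_kwargs = {
    "name": "mytool",
    "version": "0.1.0",
    "scripts": ["bin/mytool", "bin/mytool.cmd"],
}

# Variant B: "entry_points" only names a Python callable; the installer
# creates a launcher suited to the platform it is installing on.
entry_point_kwargs = {
    "name": "mytool",
    "version": "0.1.0",
    "packages": ["mytool"],
    "entry_points": {"console_scripts": ["mytool = mytool.cli:main"]},
}


def uses_scripts_argument(setup_kwargs):
    """Return True if the setup() arguments rely on the 'scripts' keyword."""
    return bool(setup_kwargs.get("scripts"))


for label, kwargs in [("A", scripts_kwargs), ("B", entry_point_kwargs)]:
    print(label, "uses scripts:", uses_scripts_argument(kwargs))
```

Variant A is the pattern the noarch checklist warns about; whether it actually breaks on Windows depends on the scripts that are shipped, so it still has to be tested.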

dbast commented 5 years ago

Built a noarch package on OSX and tested it on Windows... the scripts seem to work, so this looks promising. A noarch package would be quite beneficial, as currently every upload = 9 CI builds = 200MB x 9 = 1.8GB.
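
The arithmetic can be checked with a short sketch. The build count and package size are the figures from this comment; the assumption that a noarch upload is a single build of roughly the same size is mine and has not been measured.

```python
# Figures from the comment: 9 CI builds per upload, about 200 MB each.
BUILDS_PER_UPLOAD = 9
PACKAGE_SIZE_MB = 200


def upload_size_mb(builds, size_mb=PACKAGE_SIZE_MB):
    """Total upload volume in MB for a given number of builds."""
    return builds * size_mb


per_platform_mb = upload_size_mb(BUILDS_PER_UPLOAD)
# Assumption: a noarch package is built once and is about the same size.
noarch_mb = upload_size_mb(1)

print(f"per-platform builds: {per_platform_mb / 1000:.1f} GB")
print(f"noarch build:        {noarch_mb / 1000:.1f} GB")
print(f"saved per upload:    {(per_platform_mb - noarch_mb) / 1000:.1f} GB")
```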

pyarrow should probably stay optional ...

To not delay Spark 2.4.0 any further, I am merging https://github.com/conda-forge/pyspark-feedstock/pull/14 first. After that, this can be reworked.

dbast commented 5 years ago

Closing this in favor of https://github.com/conda-forge/pyspark-feedstock/pull/18