vemonet / setup-spark

:octocat:✨ Setup Apache Spark in GitHub Action workflows
https://github.com/marketplace/actions/setup-apache-spark
MIT License

SPARK_HOME set incorrectly for versions of spark below 3 #9

Closed foster999 closed 3 years ago

foster999 commented 3 years ago

Describe the bug

I'm aware that the action isn't tested for versions below 3, but wondered if you might be able to advise on how I can adjust the configuration to work.

When calling Spark through pytest (pyspark), we get an error stating that spark-submit is not where it is expected to be:

FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/work/repo/repo/..//spark/./bin/spark-submit': '/home/runner/work/repo/repo/..//spark/./bin/spark-submit'

I expect that this is due to the lower Spark version, but I have been unable to find how the configuration of later versions differs. Any pointers here would be greatly appreciated.
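The path in the traceback looks like SPARK_HOME joined with ./bin/spark-submit, so a quick way to narrow this down is to check what SPARK_HOME points at before pytest runs. A minimal sketch, assuming a POSIX shell in a `run:` step; `check_spark_home` is a hypothetical helper name, not part of the action:

```shell
# Hypothetical helper: report whether a directory looks like a usable SPARK_HOME.
# Returns 0 if <dir>/bin/spark-submit exists and is executable, 1 otherwise.
check_spark_home() {
  if [ -z "${1:-}" ]; then
    echo "SPARK_HOME is empty or unset" >&2
    return 1
  fi
  if [ ! -x "$1/bin/spark-submit" ]; then
    echo "no executable spark-submit at: $1/bin/spark-submit" >&2
    return 1
  fi
  echo "spark-submit found at: $1/bin/spark-submit"
}
```

Calling `check_spark_home "${SPARK_HOME:-}"` in a step placed right after the setup-spark step should show whether the variable is unset or points at a directory that was never created.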

Which version of the action are you using?

v1

Environment

GitHub Actions

Spark Versions

      - uses: actions/setup-python@v2
        with:
          python-version: 3.6.8

      - uses: actions/setup-java@v1
        with:
          java-version: '8'

      - uses: vemonet/setup-spark@v1
        with:
          spark-version: '2.4.0'
          hadoop-version: '2.1.1'

Run/Repo Url

See run here

foster999 commented 3 years ago

I've got this running using an install similar to the one in your src. I'm not sure which part wasn't working, but I shall close this 😄 Working install snippet for info:

      - name: Setup spark
        run: |
          cd ../../ &&
          wget -q -O spark.tgz https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.6.tgz &&
          tar xzvf spark.tgz &&
          rm "spark.tgz" &&
          (cd spark-2.4.0-bin-hadoop2.6 &&
          export SPARK_HOME=`pwd` &&
          export PYTHONPATH=$(ZIPS=("$SPARK_HOME"/python/lib/*.zip); IFS=:; echo "${ZIPS[*]}"):$PYTHONPATH
          )
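
One thing to note about the snippet above: the two export lines run inside a ( ... ) subshell, and each `run:` step is its own shell process, so the exported values are not visible to later steps. Whether later steps needed them in this workflow is not stated. If they do, one option is to append the values to the file named by GITHUB_ENV, which GitHub Actions reads to set environment variables for subsequent steps. A minimal sketch under that assumption; `persist_spark_env` is a hypothetical helper name:

```shell
# Hypothetical helper: write SPARK_HOME and PYTHONPATH to the GITHUB_ENV file
# so that later workflow steps see them. $1 is the unpacked Spark directory.
# The function body is a subshell, so its variables do not leak to the caller.
persist_spark_env() (
  spark_dir="$1"
  env_file="${GITHUB_ENV:?GITHUB_ENV is not set}"

  # Join every zip under python/lib with ':' (skips the unmatched glob).
  joined=""
  for zip in "$spark_dir"/python/lib/*.zip; do
    [ -e "$zip" ] || continue
    joined="${joined:+$joined:}$zip"
  done

  # Keep any PYTHONPATH that was already set, after the Spark zips.
  if [ -n "${PYTHONPATH:-}" ]; then
    joined="${joined:+$joined:}$PYTHONPATH"
  fi

  {
    echo "SPARK_HOME=$spark_dir"
    echo "PYTHONPATH=$joined"
  } >> "$env_file"
)
```

In the step above this could be called after the tar command, for example as `persist_spark_env "$(pwd)/spark-2.4.0-bin-hadoop2.6"`, in place of the subshell with the two exports.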