Closed: @US579 closed this issue 3 years ago
Piggybacking off of this: this fails the GitHub Action silently. I blamed a later stage for hours until I realised the pipeline actually breaks at this stage.
Thanks for the feedback and the pull request! I will test this and merge ASAP.
I am taking a look at setup-python to see how they handle those things; here are some notes for later:

- @actions/core for the logging: core.info("hello") or core.error("it failed") (see https://github.com/actions/setup-python/blob/main/src/setup-python.ts#L23)
- /setup.sh: https://github.com/actions/setup-python/blob/main/src/install-python.ts#L54

The error seems to be due to a change in the GitHub Actions runner.
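Besides @actions/core, the same annotations can be produced from plain shell via workflow commands, which may be the simplest fix for a step that currently fails silently (a sketch; the log_error helper and message text are made up for illustration):

```shell
# Workflow commands: shell-level equivalents of core.info / core.error.
# An "::error::" line creates an error annotation on the run page; the step
# must still return a non-zero exit code for the workflow to actually fail.
log_error() {
    echo "::error::$1"
}

echo "Installing Spark..."            # plain log line, like core.info
log_error "Spark download failed"     # error annotation, like core.error
# exit 1                              # uncomment in a real step to fail it
```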
Here is the output (from @joekendal's pull request workflow run):
find: ‘./systemd-private-c3fa4d47aaf94428a2881005ca8b7135-chrony.service-Laa9og’: Permission denied
find: ‘./snap.lxd’: Permission denied
find: ‘./systemd-private-c3fa4d47aaf94428a2881005ca8b7135-systemd-logind.service-ULx3Pi’: Permission denied
find: ‘./systemd-private-c3fa4d47aaf94428a2881005ca8b7135-haveged.service-P8wGPf’: Permission denied
find: ‘./systemd-private-c3fa4d47aaf94428a2881005ca8b7135-systemd-resolved.service-ivgCTf’: Permission denied
This seems to be related to permissions, as usual when Linux fails!
The issue happens when the bash lines are executed to install Spark in /usr/local:

sudo apt-get update && \
cd /tmp && \
find -type f -printf %T+\\t%p\\n | sort -n && \
wget -q $(wget -qO- https://www.apache.org/dyn/closer.lua/spark/spark-${sparkVersion}/spark-${sparkVersion}-bin-hadoop${hadoopVersion}.tgz?as_json | python -c "import sys, json; content=json.load(sys.stdin); print(content['preferred']+content['path_info'])") && \
echo "${sparkChecksum} *spark-${sparkVersion}-bin-hadoop${hadoopVersion}.tgz" | sha512sum -c - && \
sudo tar xzf "spark-${sparkVersion}-bin-hadoop${hadoopVersion}.tgz" -C /usr/local && \
rm "spark-${sparkVersion}-bin-hadoop${hadoopVersion}.tgz" && \
sudo ln -s "/usr/local/spark-${sparkVersion}-bin-hadoop${hadoopVersion}" /usr/local/spark && \
sudo chown -R $(id -u):$(id -g) /usr/local/spark*
The install script could be cleaned up: do we really need to run apt update? Can we install Spark in a place where it will not require sudo?
The GitHub docs about sudo do not say anything new since last time: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#administrative-privileges-of-github-hosted-runners

"The Linux and macOS virtual machines both run using passwordless sudo. When you need to execute commands or install tools that require more privileges than the current user, you can use sudo without needing to provide a password."
I will take a look at this when I have more time in the coming weeks.
Feel free to propose changes or a better way to perform some parts of the installation.
@US579 I fixed the issue and updated v1. It was due to the 3.0.1 version no longer being available for download; the best way to find out which versions are available is here: https://spark.apache.org/downloads.html
I replaced it with 3.0.2 by default, and I am also testing against 3.1.1. I removed the checksum option to make it easier to change versions.
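Since the failure came from a version disappearing from the mirrors, a quick availability check against the Apache archive can help before bumping the default version (a sketch; spark_url is a hypothetical helper, and the actual HEAD request needs network access):

```shell
# Build the Apache archive URL for a given Spark/Hadoop version pair.
spark_url() {
    echo "https://archive.apache.org/dist/spark/spark-$1/spark-$1-bin-hadoop$2.tgz"
}

# With network access, an HTTP HEAD request tells you whether the tarball exists:
#   curl -s -o /dev/null -w "%{http_code}" -I "$(spark_url 3.0.2 2.7)"
spark_url 3.0.2 2.7
```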
Sorry @philMarius for the confusion! Thankfully, @joekendal's pull request fixed the error output: the workflow will now properly show an error and fail at the setup-spark step (with an invitation in the error logs to check the available Spark versions at the official download URL). I also added a link to the official Spark download page in the readme.
Thanks a lot for your feedback, it should run more smoothly now!
@vemonet I just did it from the home directory instead:
- name: Setup Spark
  run: |
    cd /tmp
    wget -q https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-1.0/spark-2.4.3-bin-hadoop2.8.tgz
    tar xzf spark-2.4.3-bin-hadoop2.8.tgz -C /home/runner
    rm spark-2.4.3-bin-hadoop2.8.tgz
    echo SPARK_HOME=/home/runner/spark-2.4.3-bin-hadoop2.8 >> $GITHUB_ENV
A similar approach could be used in your project. And yes, you don't need apt update and you don't need sudo; you can just untar it in the home dir.
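Generalising that snippet, the home-directory install can be parameterised on the version, with no apt update and no sudo (a sketch; the version numbers are illustrative assumptions, and the download itself is left commented out):

```shell
SPARK_VERSION=3.0.2
HADOOP_VERSION=2.7
DIST="spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}"

# In a real workflow step:
#   wget -q "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${DIST}.tgz"
#   tar xzf "${DIST}.tgz" -C "$HOME" && rm "${DIST}.tgz"
#   echo "SPARK_HOME=$HOME/$DIST" >> "$GITHUB_ENV"
echo "$DIST"
```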
Thanks a lot for the feedback! I updated the action v1.
It has already been failing properly when the download fails for some weeks, but I have now changed the download URL to use the Apache archive, so it is much more reliable across versions.
You can also directly provide the download URL using the spark-url parameter (cf. readme).
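For reference, a workflow step using that parameter might look like this (the spark-url input comes from the readme; the exact action reference vemonet/setup-spark@v1 and the archive URL are assumptions for illustration):

```yaml
- name: Setup Spark
  uses: vemonet/setup-spark@v1
  with:
    spark-url: https://archive.apache.org/dist/spark/spark-3.0.2/spark-3.0.2-bin-hadoop2.7.tgz
```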
Describe the bug
Error when installing Spark in the GitHub Actions environment.

Which version of the action are you using?
v1

Environment
If applicable, please specify if you're using a container

Spark Versions
3.0.1

To Reproduce
Steps to reproduce the behavior: error when installing the Spark env

Screenshots
Full traceback