Closed tmckayus closed 6 years ago
@crobby @elmiko ptal
i am trying to test out the functionality of this pr. so far this is what i've gotten:

i ran make in the root of the project and all images built. then i ran

docker run --rm -it --entrypoint=/bin/sh openshift-spark:latest

and checked to see what was installed; i see the latest spark is installed. looks like this is a standard community image at this point.

next, i pushed the openshift-spark image to 172.30.1.1:5000/foo and ran the following command, inspired by the build_env_var test from install_spark.sh:

oc new-build --name=spark --docker-image=172.30.1.1:5000/foo/openshift-spark:latest -e SPARK_URL=https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz -e SPARK_MD5_URL=https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-without-hadoop.tgz.md5 --binary

i then started the build, as in the install_spark.sh test, but got an error:
[mike@shift openshift-spark]$ oc start-build spark
error: Build configuration foo/spark has no valid source inputs, if this is a binary build you must specify one of '--from-dir', '--from-repo', or '--from-file'
not sure what to do here; this appears to be exactly what the test does after the new-build command, but for some reason i am getting this error.
is there a better way to test this?
also, i think we need some instructions on how to do this; i am having a really difficult time figuring out what i'm supposed to be doing.
just for completeness, here is my entire output
[mike@shift openshift-spark]$ oc new-build --name=spark --docker-image=172.30.1.1:5000/foo/openshift-spark:latest -e SPARK_URL=https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz -e SPARK_MD5_URL=https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-without-hadoop.tgz.md5 --binary
W0914 15:06:22.175344 17306 dockerimagelookup.go:233] Docker registry lookup failed: Get https://172.30.1.1:5000/v2/: http: server gave HTTP response to HTTPS client
W0914 15:06:22.204925 17306 newapp.go:464] Could not find an image stream match for "172.30.1.1:5000/foo/openshift-spark:latest". Make sure that a Docker image with that tag is available on the node for the build to succeed.
--> Found Docker image 2da8c62 (40 minutes old) from 172.30.1.1:5000 for "172.30.1.1:5000/foo/openshift-spark:latest"
* A Docker build using binary input will be created
* The resulting image will be pushed to image stream "spark:latest"
* A binary build was created, use 'start-build --from-dir' to trigger a new build
--> Creating resources with label build=spark ...
imagestream "spark" created
buildconfig "spark" created
--> Success
[mike@shift openshift-spark]$ oc start-build spark
error: Build configuration foo/spark has no valid source inputs, if this is a binary build you must specify one of '--from-dir', '--from-repo', or '--from-file'
Ah, there is a second makefile, and a second set of images.
"make -f Makefile.inc" will build openshift-spark-inc and openshift-spark-inc-py36.
These are the incomplete images that can be completed.
Ultimately we'll have a script for this; we're not there yet.
Here is a test that can be done to show that an incomplete image used to deploy a cluster fails with a usage message (the incomplete image has to be tagged so oshinko can pull it):
$ make -f Makefile.inc build
$ docker tag openshift-spark-inc:latest 172.30.1.1:5000/myproject/openshift-spark-inc:latest
$ docker login -u developer -p $(oc whoami -t) 172.30.1.1:5000
$ docker push 172.30.1.1:5000/myproject/openshift-spark-inc:latest
$ oshinko create mary --image=172.30.1.1:5000/myproject/openshift-spark-inc:latest
Check the logs for master and worker (not sure how to force openshift to preserve blank lines in output)
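As a sketch of what that failure should look like: the incomplete image's launch script presumably checks for a spark install and bails out with a usage message when none is found. Everything below (function name, paths, message wording) is illustrative, not the repo's actual script:

```shell
#!/bin/sh
# illustrative sketch only: the real launch logic lives in the image
launch_or_usage() {
    # an incomplete image has no spark binaries under its spark home
    if [ -x "$1/bin/spark-class" ]; then
        echo "starting spark from $1"
    else
        echo "no spark installation found in $1" >&2
        echo "complete this image with an s2i build before deploying" >&2
        return 1
    fi
}

# against an empty spark home this prints the usage message and fails,
# which is roughly what you should see in the master and worker logs
launch_or_usage "${SPARK_HOME:-/opt/spark}" || true
```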
Here's how to complete an image using env vars (with oc cluster up, the image can be used in this case straight from the local docker daemon, no need to tag)
$ oc new-build --name=openshift-spark --binary --docker-image=openshift-spark-inc:latest -e SPARK_URL=https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz -e SPARK_MD5_URL=https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-without-hadoop.tgz.md5
$ oc start-build openshift-spark
$ oc log -f buildconfig/openshift-spark
To complete an image using files from a local directory
$ mkdir buildfiles
$ wget https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz -O buildfiles/spark-2.2.1-bin-hadoop2.7.tgz
$ wget https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz.md5 -O buildfiles/spark-2.2.1-bin-hadoop2.7.tgz.md5
$ oc new-build --name=openshift-spark --binary --docker-image=openshift-spark-inc:latest
$ oc start-build openshift-spark --from-dir=buildfiles
$ oc log -f buildconfig/openshift-spark
A successful build will push a completed image to your project. To run the completed image with oshinko:
oshinko_linux_amd64/oshinko create molly --image=172.30.1.1:5000/myproject/openshift-spark:latest
From here, using buildfiles, you can do things like change the md5 file to make it fail on a build, leave out the spark tgz altogether, or replace the spark tgz with a file that's not actually a tar, and check that the build fails.
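The md5 manipulation described above can be sketched locally to see what the checksum step should catch. This assumes md5sum-format checksum files (hash then filename); the file names are stand-ins, and the actual check is whatever install_spark.sh does:

```shell
#!/bin/sh
# local sketch of the checksum step; names are stand-ins for the real files
set -e
workdir=$(mktemp -d)
cd "$workdir"

# stand-in for the spark tarball and its published checksum
echo "pretend spark distribution" > spark.tgz
md5sum spark.tgz > spark.tgz.md5

# an untouched tarball passes the check
md5sum -c spark.tgz.md5

# tampering with the tarball makes the check fail, which is what
# should abort the image build
echo "corruption" >> spark.tgz
if md5sum -c spark.tgz.md5 2>/dev/null; then
    echo "unexpected: tampered tarball passed"
else
    echo "checksum mismatch detected"
fi
```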
This is an initial change to allow spark to be installed during an s2i build if it is left out of the creation of the openshift-spark image.
This is a step toward a mechanism which will allow users to build openshift-spark images with custom spark installs in OpenShift Origin rather than modifying github repos and running local builds.
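The core decision that mechanism has to make can be sketched in a few lines of shell: install spark only when the image does not already contain it. The function name and paths below are illustrative assumptions, not the repo's actual scripts:

```shell
#!/bin/sh
# illustrative sketch: decide whether an image still needs a spark install
needs_spark_install() {
    # "incomplete" here means the spark home is missing or empty
    [ ! -d "$1" ] || [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

if needs_spark_install "${SPARK_HOME:-/opt/spark}"; then
    echo "incomplete image: spark will be installed during the s2i build"
else
    echo "spark already present: the build leaves it alone"
fi
```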