radanalyticsio / openshift-spark

Update to Apache Spark 3.0.0 with Python 3.6 and remove Py 2.7 support #111

Closed tmckayus closed 4 years ago

tmckayus commented 4 years ago

This change moves Apache Spark to version 3.0.0 and at the same time removes support for Python 2.7. Going forward, only Python 3.x will be supported. Additionally, the base image has been changed to CentOS 8.

tmckayus commented 4 years ago

Note: I haven't tested any of this yet beyond just the build. I'm expecting the CI tests to kick in here.

elmiko commented 4 years ago

hey Trevor, this looks cool!

i will try to give a review, but no promises ;)

tmckayus commented 4 years ago

Note: I tried switching over to podman exclusively, but that requires Ubuntu 18.04, which then causes errors with docker and oc cluster up. So instead I just made podman vs. docker selectable and used docker in Travis ...

Someone with more time someday could maybe move our Travis tests off of Trusty :)
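
For readers wondering what "selectable" could look like in practice, here is a minimal sketch. It is not the mechanism used in this PR, which the thread does not show. The environment variable name `CONTAINER_RUNTIME` and the helpers `select_runtime` and `build_command` are hypothetical, introduced only for illustration. The one thing it relies on is that docker and podman both accept `build -t <tag> <context>`.

```python
import os

# Hypothetical variable name; the PR does not say how the choice is exposed.
RUNTIME_ENV_VAR = "CONTAINER_RUNTIME"
SUPPORTED_RUNTIMES = ("docker", "podman")


def select_runtime(env=None, default="docker"):
    """Return the container CLI name to use, falling back to docker."""
    env = os.environ if env is None else env
    choice = env.get(RUNTIME_ENV_VAR, default).strip().lower()
    if choice not in SUPPORTED_RUNTIMES:
        raise ValueError(
            f"unsupported runtime {choice!r}; expected one of {SUPPORTED_RUNTIMES}"
        )
    return choice


def build_command(image_tag, context_dir=".", env=None):
    """Assemble an image build command for whichever runtime is selected."""
    # docker and podman share the `build -t <tag> <context>` form.
    return [select_runtime(env), "build", "-t", image_tag, context_dir]
```

With a setup like this, a CI job would leave the variable unset to get docker, and a local build could export `CONTAINER_RUNTIME=podman` to switch without touching the build scripts.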

tmckayus commented 4 years ago

Okay, this seems to work. I tested it using a Python s2i with Spark 3, and ran SparkPi and a slightly modified grafzahl (changed for Python 3), and it worked.

Summary: Python 3.6, Spark 3.0.0, java-11-openjdk, CentOS 8

Also, the repo was updated for cekit 3.6.