opendatahub-io-contrib / s2i-spark-container

GNU General Public License v3.0

Update custom spark build to support s2i-thoth-ubi8-py36 #6

Open LaVLaS opened 3 years ago

LaVLaS commented 3 years ago

Since the update to the Thoth notebook builds, pyspark can no longer be imported in this image when it is built with s2i-thoth-ubi8-py36 as the base image. I think the fix is to build Spark and Hadoop from the Spark source repo, then build the wheel file and install it locally with pip.
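A rough sketch of what that build could look like, written as a dry-run plan rather than a finished script. Assumptions: the checkout path and Hadoop version passed in are placeholders, `build_plan` and `install_command` are hypothetical helpers, and the commands follow the upstream Spark build docs (`./build/mvn` with `-Dhadoop.version`, then `setup.py sdist` in the `python/` directory), which can differ between Spark releases and may also need a Hadoop profile flag. This produces an sdist rather than a wheel. Nothing is executed here; the functions only return the commands.

```python
import sys
from pathlib import Path


def build_plan(spark_src, hadoop_version):
    """Return (working_dir, command) pairs for building Spark and a pyspark sdist.

    Assumption: spark_src is a checkout of the Spark source repo and
    hadoop_version is whatever custom Hadoop version the image needs.
    """
    src = Path(spark_src)
    return [
        # Build the Spark jars against the requested Hadoop version.
        (str(src), ["./build/mvn", "-DskipTests",
                    f"-Dhadoop.version={hadoop_version}", "clean", "package"]),
        # Package pyspark (including the jars built above) as an sdist.
        (str(src / "python"), [sys.executable, "setup.py", "sdist"]),
    ]


def install_command(dist_dir):
    """Return the pip command that installs the last pyspark sdist (by name) in dist_dir."""
    sdists = sorted(Path(dist_dir).glob("pyspark-*.tar.gz"))
    if not sdists:
        raise FileNotFoundError(f"no pyspark sdist found in {dist_dir}")
    return [sys.executable, "-m", "pip", "install", str(sdists[-1])]
```

Each pair can be passed to `subprocess.run(cmd, cwd=workdir, check=True)` inside the image build, followed by `install_command` on the `python/dist` directory.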

vpavlin commented 3 years ago

It seems the Thoth base image is not the cause of the issue; see https://github.com/opendatahub-io/s2i-spark-container/commit/9f28593f741a26675bd9b01f3762aabc29782521#diff-bc705d3ab5dd034e3ce0b4943c294895L7. Sorry I missed it in the review.

LaVLaS commented 3 years ago

I agree. The setup for installing Spark with a custom Hadoop version needs to be refactored and cleaned up. I think we can easily build and install a cleaner pip package from the Spark repo, one that Python discovers automatically using its defaults.
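One way to check the "discovered by Python using the defaults" part inside a built image is a small probe like the one below. This is a minimal sketch, not something from this repo: `describe_install` is a hypothetical helper, and it reports `PYTHONPATH` and `SPARK_HOME` only because those are the usual manual overrides. If the package is found while both are unset, it was picked up from the default search path.

```python
import importlib.util
import os


def describe_install(package="pyspark"):
    """Report whether `package` resolves via the normal import machinery.

    Assumption: an unset PYTHONPATH and SPARK_HOME means the package was
    found through the interpreter's default site-packages lookup.
    """
    spec = importlib.util.find_spec(package)
    return {
        "found": spec is not None,
        "origin": spec.origin if spec is not None else None,
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
        "SPARK_HOME": os.environ.get("SPARK_HOME"),
    }


if __name__ == "__main__":
    print(describe_install())
```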