radanalyticsio / openshift-spark

Spark 2.4 #80

Closed elmiko closed 5 years ago

elmiko commented 5 years ago

This change brings in support for Spark 2.4.0. It also updates the versioning on the incomplete images.

elmiko commented 5 years ago

@tmckayus @crobby @Jiri-Kremser ptal

elmiko commented 5 years ago

~one more thing i found while reading through the change-yaml.sh file, i need to update the md5 based logic in there to use the new sha512 stuff.~

~scratch that, i was wrong~

scratch that, it's complicated.

from the archives, it looks like some time around may 2018 the spark team started distributing only pgp signatures (.asc files) and sha512 sums (.sha512 files). because a few backports were released after this policy went into effect, you see strange things like: spark version 2.3.0 carries md5 sum files but 2.2.3 does not.

the change-yaml.sh script will attempt to download the md5 files to confirm the validity of the archive file and then use that md5 in the image.yaml cekit file. for versions that have no md5 file upstream, this will cause the script to exit with an error.

it would be easy if we could just switch to using the upstream sha512 sums to validate the archive and inform cekit about it. unfortunately, according to the schema file, cekit only supports md5, sha1, and sha256.

i think the best thing to do is use the sha512 sums to validate the archive, then calculate an md5 from the archive to put in the schema file. the big downside here is that you will need to download the archive to calculate the md5.
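a rough sketch of that flow, written in python for illustration rather than in the shell used by change-yaml.sh. the function names, and the idea of passing the expected sum in as a plain hex string, are assumptions made for this example and not what the script does; parsing the upstream .sha512 file format is left out.

```python
import hashlib


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so a large archive is never fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as archive:
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_after_sha512_check(path, expected_sha512):
    """Validate an already-downloaded archive against a sha512 sum, then return its md5.

    expected_sha512 is assumed to be a hex string; whitespace and letter case
    are ignored so sums that were grouped or upper-cased still compare cleanly.
    """
    expected = "".join(expected_sha512.split()).lower()
    actual = file_digest(path, "sha512")
    if actual != expected:
        raise ValueError("sha512 mismatch for %s" % path)
    # Only compute the md5 once the archive is known to be the right one.
    return file_digest(path, "md5")
```

the md5 returned here is the value that would go into the cekit file; the sha512 comparison is what actually ties the archive back to the upstream release.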

another option is to remove the checksum altogether, but this seems like a sacrifice of better security practice for a savings in time during configuration.

elmiko commented 5 years ago

fwiw, i made a request to the cekit project for a feature =)

https://github.com/cekit/cekit/issues/471

tmckayus commented 5 years ago

I think while we're waiting for the cekit feature, we can just do the extra download.

elmiko commented 5 years ago

ok, cool. i'll get that patch up

todo[bot] commented 5 years ago

remove this download when sha512 support lands in upstream cekit (elmiko)

https://github.com/radanalyticsio/openshift-spark/blob/f2f00bbe6156cfc8b33689d37bbf3ce2f094e4f6/change-yaml.sh#L56-L61


This comment was generated by todo based on a TODO comment in f2f00bbe6156cfc8b33689d37bbf3ce2f094e4f6 in #80. cc @elmiko.

todo[bot] commented 5 years ago

remove this checksum calculation when sha512 support lands in upstream cekit (elmiko)

https://github.com/radanalyticsio/openshift-spark/blob/f2f00bbe6156cfc8b33689d37bbf3ce2f094e4f6/change-yaml.sh#L69-L74


This comment was generated by todo based on a TODO comment in f2f00bbe6156cfc8b33689d37bbf3ce2f094e4f6 in #80. cc @elmiko.

todo[bot] commented 5 years ago

replace this with sha512 when it lands in upstream cekit (elmiko)

https://github.com/radanalyticsio/openshift-spark/blob/f2f00bbe6156cfc8b33689d37bbf3ce2f094e4f6/change-yaml.sh#L82-L87


This comment was generated by todo based on a TODO comment in f2f00bbe6156cfc8b33689d37bbf3ce2f094e4f6 in #80. cc @elmiko.

elmiko commented 5 years ago

added the sha512 changes to change-yaml.sh, just need to update the docs and we should be good to go.

elmiko commented 5 years ago

@tmckayus added more content about the script files, let me know what you think.

jkremser commented 5 years ago

\o/