GoogleCloudDataproc / initialization-actions

Run in all nodes of your cluster before the cluster starts - lets you customize your cluster
https://cloud.google.com/dataproc/init-actions
Apache License 2.0

[bigtable] bigtable.sh references deprecated Hortonworks Nexus resources #1198

Open cjac opened 1 month ago

cjac commented 1 month ago

To resolve this dependency on deprecated infrastructure, we need to copy every jar we depend on into GCS, then update bigtable.sh to fetch from the GCS location instead of the Hortonworks Nexus server.

prince-cs commented 1 month ago

I will use the gs://dataproc-initialization-actions bucket to store the jars. It already has the basic structure of our init actions repo. Inside the bigtable folder, I will recreate the same structure for storing the jars, so that the bigtable init script needs only minimal code changes.
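
A minimal sketch of how the script side of that could look, assuming each jar keeps its repository-relative path under the bigtable folder of the bucket. The `to_gcs_path` function, the `GCS_JAR_ROOT` variable, and the idea of passing the upstream prefix as an argument are hypothetical and not taken from bigtable.sh:

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a jar's upstream URL to its mirrored GCS location.
# Assumes the mirror keeps whatever path follows the upstream repository prefix.

# Bucket named in this thread; the "bigtable" subfolder follows the repo layout.
GCS_JAR_ROOT="gs://dataproc-initialization-actions/bigtable"

# Usage: to_gcs_path <upstream_url> <upstream_prefix>
to_gcs_path() {
  local url="$1"
  local prefix="${2%/}"             # drop a trailing slash from the prefix, if any
  local rel="${url#"${prefix}"/}"   # strip the prefix and the slash after it
  if [[ "${rel}" == "${url}" ]]; then
    # Nothing was stripped, so the URL is not under the given prefix.
    echo "URL does not start with prefix: ${url}" >&2
    return 1
  fi
  echo "${GCS_JAR_ROOT}/${rel}"
}
```

With a helper like this, each download in the script could become something like `gsutil cp "$(to_gcs_path "${jar_url}" "${nexus_prefix}")" "${dest_dir}/"`, where `jar_url`, `nexus_prefix`, and `dest_dir` are placeholder variable names.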

cjac commented 1 month ago

Might as well copy the tar.gz packages from archive.apache.org too. Downloads from there are flaky.
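
For the one-time copy out of archive.apache.org, a small retry wrapper could get past the flaky downloads. This is only a sketch: the `retry` function is hypothetical, and the attempt count and delay are arbitrary choices.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rerun a command until it succeeds or attempts run out.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    if "$@"; then
      return 0
    fi
    echo "Attempt ${attempt}/${max_attempts} failed: $*" >&2
    # Wait before the next try, but not after the final failure.
    if ((attempt < max_attempts)); then
      sleep "${delay}"
    fi
  done
  return 1
}
```

For example, `retry 5 10 curl -fsSL -o "${tarball}" "${tarball_url}"` followed by a `gsutil cp` of the downloaded file into the bucket, where `tarball` and `tarball_url` are placeholders.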