GoogleCloudDataproc / bdutil

[DEPRECATED] Script used to manage Hadoop and Spark instances on Google Compute Engine
https://cloud.google.com/dataproc
Apache License 2.0

Adding Apache tajo scripts of bdutil for extensions deploy. #67

Closed hys9958 closed 8 years ago

hys9958 commented 9 years ago

Hi, I have added bdutil scripts for Apache Tajo under "extensions/tajo". Thanks.

googlebot commented 9 years ago

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project, in which case you'll need to sign a Contributor License Agreement (CLA).

:memo: Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


hys9958 commented 9 years ago

I signed it! Thanks.

googlebot commented 9 years ago

CLAs look good, thanks!

hys9958 commented 9 years ago

Hi, I wrote a bdutil extension for Apache Tajo, an open source SQL-on-Hadoop system (http://tajo.apache.org), and sent a pull request with the code. This extension enables Google Cloud users to deploy and set up a Tajo cluster and start their big data analysis on GCP.
Can you please review the Tajo extension? Thanks!

AngusDavis commented 9 years ago

Thanks for the updates. I'm trying to run through the version and it appears the tajo_release bucket does not contain a versioned tarball, just tajo.tar.gz. Any chance you could copy in tajo-0.11.0.tar.gz and make it public (or just make it public if it's already there)?

One more point on the bucket - GCS bills to the project that owns the bucket and bdutil clusters could be large and frequently spun up / down. Please verify that you're OK with this being billed to your cloud project.

For testing, I've copied tajo-0.11.0.tar.gz to a bucket in one of our projects and other than the initial value of TAJO_TARBALL_URI it appears things are working well.

For expediency (I admit, I don't really get to use that term with a PR that's lingered this long), I may use the tajo-0.11.0.tar.gz I used for testing (I'm working to get a release out soon), but can revert to using tajo_release if it's updated before I cherry-pick in this change.

ykko commented 9 years ago

Hi Angus, I uploaded tajo-0.11.0.tar.gz to the tajo-release bucket.

As for the bucket: if a user launches a 100-node Tajo cluster using bdutil and the Tajo tarball is located in our bucket, we would be billed for 100 x 56MB of data transfer. Is that what you meant? Please correct me if I got it wrong.
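As a rough sketch of that arithmetic, the snippet below multiplies the node count by the tarball size. It assumes each node downloads the tarball exactly once per cluster launch, which is what the "100 x 56MB" figure implies; the variable names are purely illustrative and not part of bdutil.

```shell
#!/usr/bin/env bash
# Back-of-the-envelope estimate of the data served from the bucket
# for a single cluster launch.
# Assumption: every node fetches the tarball exactly once.

nodes=100        # cluster size from the example above
tarball_mb=56    # approximate size of the Tajo tarball, in MB

total_mb=$((nodes * tarball_mb))
echo "Data transferred per cluster launch: ${total_mb} MB"
```

Clusters that are frequently spun up and down multiply this figure by the number of launches.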

We are OK with hosting the tarball in another bucket, as we think that is better for credibility, provided that we can update it, or request an update, quickly when a new tarball is released. So, may I request you to:

Thanks a lot. Youngkyong

AngusDavis commented 8 years ago

Hi Youngkyong,

Your understanding of billing is correct. I've created the following bucket: gs://tajo-dist and uploaded tajo-0.11.0.tar.gz. I've updated tajo_env.sh in my cherry-pick commit to point to this location as well.
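For illustration only, the relevant setting in tajo_env.sh might look something like the fragment below. TAJO_TARBALL_URI, gs://tajo-dist, and the tarball name come from this thread; the helper variables TAJO_DIST_BUCKET and TAJO_VERSION, and the assumption that the tarball sits at the top level of the bucket, are hypothetical and may not match the actual commit.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the tarball location setting in tajo_env.sh.
# Assumption: the tarball is stored at the root of gs://tajo-dist.

TAJO_DIST_BUCKET='gs://tajo-dist'   # hypothetical helper variable
TAJO_VERSION='0.11.0'               # hypothetical helper variable

# Versioned tarball URI that the extension would download on each node.
TAJO_TARBALL_URI="${TAJO_DIST_BUCKET}/tajo-${TAJO_VERSION}.tar.gz"

echo "${TAJO_TARBALL_URI}"
```

Keeping the version in its own variable would make a version bump a one-line change, though the real file may simply hard-code the full URI.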

To update versions, a PR or Issue + a mail to (my GH username) at google.com should take care of things (and if it's a time-sensitive or critical bug fix, please mention that as well).

I intend to push a cherry-pick shortly (with just the new URL and some trailing whitespace removed).

Thanks, Angus

ykko commented 8 years ago

Great! Thanks a lot.