nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0
636 stars 116 forks source link

Add support to private_vpc (ie: with no public address) #296

Closed luhhujbb closed 3 years ago

luhhujbb commented 4 years ago

This PR makes the following changes:

Default behaviour remains identical except that it logs when flintrock detect private mode instead of raising an exception. Fixes #14

nchammas commented 4 years ago

Hey @luhhujbb, thanks for submitting this PR. I will take a look at it over the holidays!

luhhujbb commented 4 years ago

Tests pass with USE_AWS_CREDENTIALS=true in the following condition:

nchammas commented 4 years ago

This is looking good @luhhujbb. I will test this out tomorrow and try to get a private VPC working in my environment. I think I need to setup a NAT gateway to enable the private VPC to still talk to the outside world.

Do you mind if I push some commits to your branch? It will mostly be to tweak some method names or message text that is sent to the user. I've been very slow with the review feedback, so for small things I'd like to spare you the time and just do them myself.

If you're OK with that, let me know and just make sure that the "Allow edits from maintainers" checkbox is enabled in the right side-bar.

luhhujbb commented 4 years ago

Hi, @nchammas, of course you can edit, the box "Allow edits from maintainers" is checked.

nchammas commented 4 years ago

I've successfully created a private subnet that accesses the Internet via a NAT gateway on a public subnet in the same VPC. I've also successfully tested out this PR by launching a cluster in that environment. So far so good. 👍

I've pushed the latest changes from master onto your fork, @luhhujbb, and my next steps will be to retest your PR given the recent changes introduced by #285. I will also push some edits as noted in my previous comment.

(By the way, for the future you may want to make contributions from a branch on your fork, as opposed to from master. It will make pulling updates from the upstream source easier down the line.)

kedar700 commented 3 years ago

Any update on when this might get merged

nchammas commented 3 years ago

Hello @kedar700. Sorry, I haven't been able to look at this PR recently.

I think this PR has the right implementation but, before it can be merged in, it needs some tweaks and needs merge conflicts to be resolved.

It's my fault for leaving this PR in limbo for so long. (Sorry @luhhujbb!)

luhhujbb commented 3 years ago

I've rebase from master. The python 3.5 compatibility was difficult to handle. Maybe it should be dropped since python 3.5 reaches its end of life.

authorize-access-from is now an array of configurable security-group/network cidr, it allows flexible configuration in a private environment.

nchammas commented 3 years ago

Thank you for updating this PR @luhhujbb. I've dropped Python 3.5 support in #329 and will take a look at your updates here later this week. Hopefully we can get this merged in this time around!

luhhujbb commented 3 years ago

@nchammas I've just rebase. Waiting for your review. Thanks for your time.

nchammas commented 3 years ago

@luhhujbb - I made some changes to the PR to tweak the wording or style in various places, and also to refactor how we are validating the CLI input. I also think I fixed some bugs, like .appends -> .append. I tested my changes by launching a cluster into a private VPC with both Hadoop and Spark installed.

  1. What do you think of the latest changes?
  2. Can you test and confirm that the current state of the PR still works for your private VPC use case?
luhhujbb commented 3 years ago

@nchammas First of all many thanks for your time !

  1. I've check all your commits and it looks good.
  2. I've successfully launch a production cluster using this version of flintrock. Everythinks is Ok. Using multipe "ec2-autorize-access-from" allows us to meet all our production use cases.
nchammas commented 3 years ago

Thanks for persisting through the lengthy review/refactor process! I'm glad we got this in.