nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0
637 stars 116 forks source link

Question about TODOs #377

Open tomahawk360 opened 5 months ago

tomahawk360 commented 5 months ago

Hi! First of all, thanks for developing this tool. Having the experience of building Hadoop clusters "by hand", flintrock has been extremely useful. I decided to see how flintrock works, and found these TODOs, more specifically the ones about "Centralizing logic" to get ec2 clusters characteristics:

https://github.com/nchammas/flintrock/blob/c3dee6da06af4e6ca094ac6f0b4365d091c2541b/flintrock/ec2.py#L179C1-L206C31

What does "Centralize logic" stands for? What's left to be done in those sections?

nchammas commented 5 months ago

I wrote those comments almost 10 years ago, so I will have to take a guess at what exactly I meant. :)

https://github.com/nchammas/flintrock/blob/c3dee6da06af4e6ca094ac6f0b4365d091c2541b/flintrock/ec2.py#L180-L186

This probably just means I wanted to write a little function that I could reuse across the codebase to get this base security group. I'm not sure how many places would need to be refactored or if such a function is really needed, but I think that's what I meant when I wrote it.

https://github.com/nchammas/flintrock/blob/c3dee6da06af4e6ca094ac6f0b4365d091c2541b/flintrock/ec2.py#L199-L205

This is probably the same idea. There must be several places in the codebase with similar logic, where I'm trying to get the cluster-specific security group based on the cluster name. Instead of repeating that logic, a reusable function would be handy.

tomahawk360 commented 5 months ago

Hello, sorry for the late response.

Indeed, the same code bit repeats on three different occasions, each part having the same TODO comment, so it must refer to a replacement of those bits with a reusable function.

However, I found a function that could serve said purpose in the codebase:

https://github.com/nchammas/flintrock/blob/c3dee6da06af4e6ca094ac6f0b4365d091c2541b/flintrock/ec2.py#L493-L516

This function is only used in the launch method, but it could be used for the "Centralize logic" TODOs as well, like for example the first TODO instance could be rewritten as:

flintrock_base_group = get_security_groups(
    vpc_id=self.vpc_id,
    region=self.region,
    security_group_names=['flintrock']
)[0]

Is there a reason why that function is not being used for the "Centralize logic" TODOs?

tomahawk360 commented 4 months ago

Hello, I made the following PR regarding this Issue: https://github.com/nchammas/flintrock/pull/378

nchammas commented 3 months ago

Thank you. Sorry I have not had a chance to look at this, but it is on my list to do. I just wanted to acknowledge your PR in the meantime.

tomahawk360 commented 2 months ago

Hello. No worries, take your time and thanks for having the PR in mind.