etsy / boundary-layer

Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform
Apache License 2.0

Does boundary-layer have active support? #114

Closed: peleyal closed this issue 3 years ago

peleyal commented 3 years ago

Hi all!

In the last year we have been using this wonderful library.

We love it a lot, and as one of our engineers keeps saying, "Boundary Layer helps us to not be in the business of generating python code". We definitely prefer YAML.

However, in the last few weeks we have seen a decrease in support for the PRs that we need, such as #110 and #112.

Before we take the approach of forking and applying the changes we need on our own fork, I thought it would be better to reach out to you guys, and specifically add some of the folks who created the last PRs (such as @vchiapaikeo, @dossett, @gpetroski-etsy and @mchalek).

It would be really great if we could get the minimal support we need (updating configuration to support more operators...), so other Boundary Layer users can enjoy more of the operators that exist out there in Airflow...

Thoughts? Thank you! Eyal

@eap, @bthomee and @jcraver1021 FYI.

vchiapaikeo commented 3 years ago

Hey @peleyal - first off, glad to hear that this library works well for you and your team. Apologies that we haven't been more responsive here.

I think one of the difficulties with your Cloud Function operator is that we haven't heavily tested boundary-layer with different versions of Airflow. This class doesn't seem to exist in our version of Airflow (1.10.3). I'd like to do a bit more testing there and ensure there are no unintended side-effects by adding it.
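As a rough illustration of the kind of check I mean, here is a minimal sketch (not something boundary-layer itself provides) that tests whether a class can be imported in the currently installed environment. The helper name `operator_class_available` is made up for this example, and the demonstration uses standard-library modules as stand-ins, since the exact Airflow module path depends on the Airflow version.

```python
import importlib


def operator_class_available(module_path: str, class_name: str) -> bool:
    """Return True if `class_name` can be found in the importable `module_path`.

    Hypothetical helper for illustration; it is not part of boundary-layer
    or Airflow.
    """
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # The module does not exist in this environment, for example
        # because the installed Airflow release predates the operator.
        return False
    return hasattr(module, class_name)


if __name__ == "__main__":
    # Stand-in checks against the standard library. In practice you would
    # pass the operator's module path and class name for your Airflow version.
    print(operator_class_available("collections", "OrderedDict"))
    print(operator_class_available("collections", "NoSuchOperator"))
    print(operator_class_available("no_such_package.operators", "Anything"))
```

Running a check like this against each Airflow version we care about would tell us whether an operator config can be added to the defaults safely.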

Another alternative to forking the repo is to maintain a plugin. We do this at Etsy and we use it to build schemas for our internal operators. All that would be required is to create a Python module that requires boundary-layer and extends from BasePlugin, similar to what is done here with DefaultPlugin. You'd also be able to define the priority of your operators, sensors, etc. vs the defaults (ours is set to PluginPriority.FINAL) and any other miscellaneous actions you'd need.
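To make the priority idea concrete, here is a small, library-independent sketch of how a higher-priority plugin can take precedence over defaults. Everything in it is an assumption for illustration: `Priority`, `Plugin`, and `resolve_operator` are hypothetical stand-ins and do not reproduce boundary-layer's actual `BasePlugin` or `PluginPriority` API, and the config dictionaries are simplified placeholders rather than real operator schemas.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List, Optional


class Priority(IntEnum):
    """Hypothetical stand-in for a plugin priority; values are arbitrary."""
    DEFAULT = 0
    FINAL = 100


@dataclass
class Plugin:
    """Hypothetical plugin model: a name, a priority, and operator configs."""
    name: str
    priority: Priority
    # Maps an operator name to a simplified config dictionary.
    operators: Dict[str, dict] = field(default_factory=dict)


def resolve_operator(plugins: List[Plugin], operator_name: str) -> Optional[dict]:
    """Return the config from the highest-priority plugin defining the operator."""
    for plugin in sorted(plugins, key=lambda p: p.priority, reverse=True):
        if operator_name in plugin.operators:
            return plugin.operators[operator_name]
    return None


# Placeholder configs; the keys and values are invented for this example.
default_plugin = Plugin(
    name="default",
    priority=Priority.DEFAULT,
    operators={"base": {"source": "default"}, "bash": {"source": "default"}},
)
company_plugin = Plugin(
    name="mycompany",
    priority=Priority.FINAL,
    operators={"base": {"source": "mycompany"}, "my_operator": {"source": "mycompany"}},
)

plugins = [default_plugin, company_plugin]
base_config = resolve_operator(plugins, "base")
bash_config = resolve_operator(plugins, "bash")
missing_config = resolve_operator(plugins, "not_defined_anywhere")
```

In this model the company plugin wins for any operator name it defines, while names it does not define still fall back to the defaults. Whether the real registry resolves conflicts in exactly this way is worth confirming against the boundary-layer source.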

peleyal commented 3 years ago

Victor, thank you very much for the quick reply!

> I think one of the difficulties with your Cloud Function operator is that we haven't heavily tested boundary-layer with different versions of Airflow. This class doesn't seem to exist in our version of Airflow (1.10.3). I'd like to do a bit more testing there and ensure there are no unintended side-effects by adding it.

Sure! Let's take this one on the PR itself (and thank you for taking action on #112!).

> Another alternative to forking the repo is to maintain a plugin. We do this at Etsy and we use it to build schemas for our internal operators. All that would be required is to create a Python module that requires boundary-layer and extends from BasePlugin, similar to what is done here with DefaultPlugin. You'd also be able to define the priority of your operators, sensors, etc. vs the defaults (ours is set to PluginPriority.FINAL) and any other miscellaneous actions you'd need.

I thought about it too, but in the case of #112, our own plugin won't help, right? A plugin is more for operators that are not currently supported (which is most of our cases, to be honest). Or maybe I'm wrong... Maybe we can override boundary_layer_default_plugin/config/operators/base.yaml, since we will mark our plugin (as you suggested) as PluginPriority.FINAL. Is that right?

Again, thank you so much for the quick reply! Looking forward to working with you and your team in the near future!

Eyal

mchalek commented 3 years ago

Hey @peleyal, agreed with @vchiapaikeo, it is great to hear that you are finding boundary-layer to be useful!

I must apologize personally for not being more involved (to both you as well as @vchiapaikeo and @dossett ) — I am still at Etsy but no longer on a team that is a heavy user or maintainer of Airflow, so it is hard for me both to find time for boundary-layer, as well as to maintain enough context to review new submissions. I would like to try to help out more, and maybe this message can serve as the impetus for that… but it is hard for me to commit, there are only so many hours in the day…

vchiapaikeo commented 3 years ago

@peleyal, your own plugin would help here. I created a quick template for this type of module, and in the README there's an example of how you could do this yourself:

https://github.com/vchiapaikeo/boundary-layer-mycompany-plugin

Let me know if that works out for you!

And totally hear ya Kevin - thanks for creating this awesome library! :-) I'll try to do better at keeping up w/ these requests in the future.