Open talebzeghmi opened 4 years ago
We have been thinking about (1) [as graph composition] and hope to publish more details on our thinking. cc @tuulos For (2), you could still get the sharing, especially for feature engineering transforms, as a library of functions (instead of steps) that can simply be imported within your step. Some of our teams internal to Netflix use this route for sharing such business logic.
Also, for a relatively common collection of transformations, you could still use (1) if you want to avoid repeating even the step boilerplate.
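To make the library-of-functions route for (2) concrete, here is a minimal sketch. The module name `shared_features`, the function names, and the column names are all hypothetical, and rows are plain dicts only to keep the example dependency-free; this is an illustration, not how any Netflix team actually structures it.

```python
# shared_features.py (hypothetical module name)
# Plain functions with no Metaflow dependency, so any flow can import them in a step.


def clip(rows, column, low, high):
    """Return new rows with `column` clamped to the range [low, high]."""
    return [{**row, column: min(max(row[column], low), high)} for row in rows]


def add_ratio(rows, numerator, denominator, out):
    """Return new rows with out = numerator / denominator (None if the denominator is 0)."""
    result = []
    for row in rows:
        denom = row[denominator]
        result.append({**row, out: row[numerator] / denom if denom else None})
    return result


def engineer_features(rows):
    """The shared business logic: compose the transforms in one place."""
    rows = clip(rows, "views", 0, 1000)
    return add_ratio(rows, "clicks", "views", "ctr")


# A step in any flow would then only import and call the shared function, e.g.:
#
#     @step
#     def features(self):
#         from shared_features import engineer_features
#         self.features = engineer_features(self.raw_rows)
#         self.next(self.train)

if __name__ == "__main__":
    raw = [{"user_id": 1, "clicks": 2, "views": 4}]
    print(engineer_features(raw))
```

Each flow still declares its own step, but the transform logic lives in one importable place.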
@talebzeghmi Thank you for opening this issue! Your issue has articulated some of the exact metaflow architectural questions that our team has been having around productionizing/pipelining metaflow ... especially around the reusability of feature engineering code within multiple flows.
I don't want to have to copy and paste scikit-learn Transformer code into each new modeling flow, especially when there is a lot of boilerplate/utility code that I've written around: 1) leveraging pandas to protect against differing columns being passed in. 2) pulling in a tagged 'production' model from a Run that is then reloaded for just the data 'transform' and not the 'fit' as well.
@seeravikiran Thanks for the recommendations regarding structuring and code reusability to address some of the items presented in this issue. I will continue to investigate what that would look like on our end. In the meantime, I would like to point you to this post made on the metaflow community page that actually proposes a pretty interesting idea for this issue. I'm curious as to your thoughts on this (or something like this).
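For illustration, the kind of utility described in 1) and 2) could look roughly like the sketch below. `ColumnGuard` is a hypothetical, dependency-free stand-in for a scikit-learn Transformer operating on pandas data. How the fitted object gets stored in a run and reloaded later is deliberately left out.

```python
class ColumnGuard:
    """Remembers the columns seen at fit time and enforces them at transform time."""

    def __init__(self):
        self.columns_ = None

    def fit(self, rows):
        # Record every column present in the training rows, in a stable order.
        self.columns_ = sorted({key for row in rows for key in row})
        return self

    def transform(self, rows):
        if self.columns_ is None:
            raise RuntimeError("fit() must be called before transform()")
        # Drop unexpected columns and fill missing ones with None.
        return [{col: row.get(col) for col in self.columns_} for row in rows]


if __name__ == "__main__":
    # Fit once on training data ...
    guard = ColumnGuard().fit([{"age": 31, "country": "US"}])
    # ... then only transform new data, which may arrive with different columns.
    print(guard.transform([{"age": 40, "device": "ios"}]))
```

The point is that a helper like this would be written once and imported by every modeling flow instead of being pasted into each one.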
As @seeravikiran pointed out above, we have plans for graph composition. Meanwhile, this form of subclassing is supported https://github.com/Netflix/metaflow/issues/144#issuecomment-592245062
@tuulos Thanks for the response and the reference. This is extremely helpful and greatly appreciated!
@tuulos, would you be able to share an RFC-style document on how Metaflow would support composition? That way we can give feedback from our Applied Scientists on its usability before the code is written.
thank you!
@talebzeghmi yep, I have been writing a doc that I should be able to share this month. I will ping you when it is available. Thanks for your patience :)
Hello @tuulos , any news since this doc you've been writing in 2020 regarding metaflow composable flows?
Hi @tuulos,
is there any progress with respect to this topic? It would be extremely helpful for a business use case we have right now :) Any feedback appreciated.
Cheers
Large ML projects spanning teams reuse pipelines and models (e.g., ensembles, feature engineering, etc.).
There are two aspects of reuse:
A Use Case:
related: https://github.com/Netflix/metaflow/issues/144