cdfoundation / sig-mlops

CDF SIG MLOps
https://cd.foundation
Apache License 2.0

New challenge: Education of the respective skillsets that must work together #59

Closed epec254 closed 1 year ago

epec254 commented 1 year ago

After discussion with @tdcox and @AlmogBaku, I am proposing a challenge related to the need to train folks from the data science and software engineering fields in their respective fields' best practices. This is particularly relevant as most applications will include ML as a core part of the feature set.

@tdcox - one callout based on our chat. I decided to avoid including statements on whether Data Scientists will become Engineers, Engineers will become Data Scientists, there will be a new, unified role, or any other variant of what the roles and skill sets of individual users will be. In my conversations with many practitioners, I've heard each of these "future world views" presented as a well-reasoned argument (i.e., there is no single, accepted POV), and thus I believe including specific roles as part of the challenge may distract from the broader goals of the roadmap. I would welcome your feedback here.

tdcox commented 1 year ago

Thanks. I see where you are going here, but would suggest a slightly different tack in order to nail the fundamental problem. This isn't really a Data Scientist vs Software Developer scenario. The knowledge gap is between Data Science and Product Commercialization.

By this, I am referring to the past 30 years or so of best known methods in Product Discovery, Lean Startups and Product Lifecycle Management. On the latter side, you have alignment across CEO, Sales, Marketing, Legal, Operations, Engineering etc, to a unified way of efficiently delivering product. On the former, there is a lack of awareness about how all these moving parts are aligned, so there is a lot of reinventing the wheel going on, as if ML was the product, rather than being a component in the product.

In the reverse direction, the challenge is mainly to communicate the additional requirements that ML brings to asset management.

One way of observing the problem is to contrast DevOps and the common misunderstanding of "MLOps":

There is no such thing as a "DevOps Team". DevOps is a methodology followed by a Product Team, where individuals from multiple specialities work together to create a unified product that meets the functional and non-functional requirements of all the stakeholders. That team is also responsible for the operation, ownership and maintenance of that product.

The 'wrong end of the stick' view of "MLOps" is that it is an ML-only discipline, used by Data Scientists to optimize for the convenience of deploying models (by throwing them over the fence to the Product Team). This is the anti-pattern that we need to address.

A functioning MLOps team is a DevOps team with Data Scientists participating in the product design, using tools that treat ML and conventional assets consistently...

AlmogBaku commented 1 year ago

WORD.

epec254 commented 1 year ago

I took a pass at these updates, looking forward to your feedback.