EthicalML / awesome-production-machine-learning

A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
https://ethicalml.github.io/awesome-production-machine-learning
MIT License
17.3k stars, 2.22k forks

New section on Generative AI #339

Closed axsaucedo closed 1 month ago

axsaucedo commented 1 year ago

With the growing pace of Generative AI models and tools, I am wondering whether there could be space to add a section on Generative AI to this list. It would be great to use this issue to brainstorm what it could be, and to identify whether we can find more than 5 relevant examples for production ML in the context of Generative AI. Dalle-Flow is already a good example of a framework that could be included: https://github.com/jina-ai/dalle-flow/.

axsaucedo commented 1 year ago

@zhimin-z it would be great to get your thoughts on this one

zhimin-z commented 1 year ago

> @zhimin-z it would be great to get your thoughts on this one

Thanks for the invitation, @axsaucedo. Generative AI is definitely an awesome domain to consider for our list.

image

axsaucedo commented 1 year ago

Yes absolutely, I do think that would be the best-suited area at this stage. However, I would like to see whether we can find a list of tools like Dalle-Flow for various Generative AI use cases, namely to see whether these could all fit under their own "Industrial Generative AI" section, or whether we would just add them to the respective existing sections.

zhimin-z commented 1 year ago

> Yes absolutely, I do think that would be the best suited area at this stage, but I would like to see if we can find a list of tools like dall-e flow for various Generative AI usecases, namely to see if these could all fit under their own "Industrial Generative AI" section, or whether we would just add them into the respective existing ones.

Also, I wonder if we could add commercial tools like jasper.ai, digitalhumans and alexsei (as production-level Generative AI platforms), since these are becoming very popular and impactful these days. @axsaucedo

Reference: https://www.analyticsinsight.net/top-10-generative-ai-companies-in-2023/

zhimin-z commented 1 year ago

Also, regarding the pull request on generated data serving tools such as CLIP-as-service, where shall I put it in the list? @axsaucedo

axsaucedo commented 1 year ago

> Also, I wonder if we could add commercial tools like jasper.ai and alexsei (as production-level Generative AI platforms), this is becoming super popular and impactful these days. @axsaucedo

At this stage I would be keen to prioritise OSS tools in this issue. Once we have explored those, we could have a look at commercial tools.

axsaucedo commented 1 year ago

Here is another project that seems quite promising: https://github.com/LAION-AI/Open-Assistant

zhimin-z commented 1 year ago

image

A dedicated Generative AI section seems to touch too many tools (~100) spanning multiple domains. I wonder whether it is better to split the toolchain into the respective functional sections (as we do now).

What do you think? @axsaucedo

zhimin-z commented 1 year ago

Interesting, I searched the Internet and found that two similar lists for generative AI already exist:

al-yakubovich commented 1 year ago

One more list: https://github.com/meetpateltech/AI-Infinity

Maybe we can select only open source tools from those?

zhimin-z commented 1 year ago

Is there a standard for deciding when prompt engineering should be treated as a section of its own? @axsaucedo

axsaucedo commented 1 year ago

The more the field of prompt engineering is defined, the less I see it as relevant to this production list. I agree it's an important domain, but it sits at too high a level of user interaction to be relevant here, so I will close #424, as most of these are very high-level tools to manage "text templates", which I don't see as relevant.

axsaucedo commented 1 year ago

I would still be keen to continue exploring whether Generative AI tools can fall into a separate theme. One area that I see as potentially relevant is what I am currently referring to as "agent-chain architecture frameworks", which provide the infrastructure and tooling to augment LLMs through agents, chains, etc. The primary example of this is of course https://github.com/hwchase17/langchain (https://www.youtube.com/watch?v=nMniwlGyX-c). I would be open to exploring what a list of this "agent-chain architecture tooling" could look like, but I also want to be careful, as I am conscious that some tools present themselves as tooling infra when they are really just "good-looking" front-end interfaces to LLMs.

zhimin-z commented 1 year ago

> I would still be keen to continue exploring whether Generative AI tools can fall into a separate theme, and one area that I am seeing as potentially relevant is the area that I am currently referring to as "agent-chain architecture frameworks", which provide the infrastructure and tooling to augment LLMs through agents, chains, etc - the primary example of this of course is https://github.com/hwchase17/langchain (https://www.youtube.com/watch?v=nMniwlGyX-c). I would be open to exploring what a list of this "agent-chain architecture tooling" could look like, but I also want to be careful as I am conscious that there are some tools that can mask themselves as tooling infra but they really are just a "good-looking" front-end interfaces to LLMs.

There is a core question: Generative AI touches many areas such as NLP, CV, RL, etc. How could we distinguish one from another? If we do not set a standard for what counts as Generative AI compared to the other ML-specific domains, then it is hard to categorize tools.

zhimin-z commented 1 year ago

Another concern is that "Generative AI" is an umbrella term commonly used in everyday life rather than in academia or industry. Scientists and ML engineers tell others they specialize in NLP, RL, or CV, but we seldom hear them say things like "I am a specialist in Generative AI." "Generative AI" is a very broad area that touches many aspects of AI, and almost all tools in our list potentially fall into it, which would make the categorization meaningless.

zhimin-z commented 1 year ago

> The more the field of prompt engineering is defined the less I see it as relevant to this production list, I agree it's an important domain but it's high level in user interaction level to see it as relevant for this list, so I will close #424 as most of these are very high level tools to manage "text templates" which I don't see relevant.

What do you make of the following graphs? My point is that prompt tuning is inseparable from the deployment of LLMs in many cases. LLM companies have the budget to hire prompt engineers.

image image