
Building AI Apps with Elixir - Charlie Holtz - ElixirConf 2023 #212

Open · nelsonic opened this issue 9 months ago

nelsonic commented 9 months ago

ElixirConf 2023 - Charlie Holtz - Building AI Apps with Elixir: https://youtu.be/TfZI5-oQSqI

nelsonic commented 9 months ago

"Magic Box" ...

(screenshot: the "Magic Box" slide from the talk)

In a nutshell, this is the problem with AI. 😕

nelsonic commented 9 months ago

Recommend watching. Thinking of building something similar.

nelsonic commented 9 months ago

@LuchoTurtle keen to hear your thoughts/feedback. 💭

LuchoTurtle commented 9 months ago

It was a fascinating talk! Each approach was explained in a clear-cut, simple way, yet it shows how powerfully we can use AI models (hopefully open-source ones) to tackle otherwise challenging parsing scenarios.

Shout-out to Task.async: it makes it look super easy to fire off a process that calls the AI model without having to wait for it - https://github.com/dwyl/elixir-http-request-tutorial/issues/2#issuecomment-1688501363.
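
Roughly something like this (just a sketch; `MyApp.LLM.chat/1` is a made-up function standing in for whatever model call you'd make):

```elixir
# Kick the model call off in its own process with Task.async
# and only block when the result is actually needed.
task = Task.async(fn ->
  MyApp.LLM.chat([%{role: "user", content: "Summarise this"}])
end)

# ...do other work while the model responds...

# Await the answer with a generous timeout, since model calls can be slow.
answer = Task.await(task, 30_000)
```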

GenServer

The GenServer approach piqued my interest, mainly because I've had the chance to see how LLMs are prompted in these applications (e.g. through Langchain) and how their responses are parsed. I've seen Rafa using Azure AI's platform to deploy a simple chatbot, like in the image below:

(screenshot: a simple chatbot deployed on Azure AI)
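
Roughly what I imagine the GenServer approach looks like (not Charlie's actual code, and `MyApp.LLM.chat/1` is the same made-up model call as above): a process that holds the conversation as its state, seeded with a system message that defines how the bot should behave:

```elixir
defmodule ChatSession do
  use GenServer

  # The system message sets how the bot should act for the whole session.
  @system %{
    role: "system",
    content: "You are a support bot. Answer politely and only about our product."
  }

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def ask(pid, question), do: GenServer.call(pid, {:ask, question}, 30_000)

  @impl true
  def init(:ok), do: {:ok, [@system]}

  @impl true
  def handle_call({:ask, question}, _from, messages) do
    messages = messages ++ [%{role: "user", content: question}]
    answer = MyApp.LLM.chat(messages)  # hypothetical call to the model/API
    {:reply, answer, messages ++ [%{role: "assistant", content: answer}]}
  end
end
```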

Adding a system message and specifying how the bot should act is prompt engineering at its core (even if calling this "engineering" is debatable, but whatever). It's funny how crucial it is for getting the models to properly understand user queries and to produce an acceptable output (which is why things like https://github.com/guidance-ai/guidance exist).

The reason I'm saying this is that it's interesting to see him employ these techniques to yield correct results from the agent, which is what makes these demos successful.

(screenshot from the talk)

And the way he's parsing the output from the model is exactly why frameworks like guidance exist.

(screenshot from the talk: parsing the model's output)
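
In its simplest form, that kind of parsing is basically this (a sketch, not the code from the talk, assuming Jason for JSON): instruct the model to reply with JSON only, then parse and validate it, with an explicit failure path for when the model ignores the format:

```elixir
defmodule ParseModelOutput do
  # Expects the model to have been prompted to answer with JSON like
  # {"answer": "...", "confidence": 0.9} and nothing else.
  def extract(raw_model_output) do
    case Jason.decode(raw_model_output) do
      {:ok, %{"answer" => answer, "confidence" => confidence}} ->
        {:ok, answer, confidence}

      _ ->
        # The model ignored the format instructions; callers must handle this.
        {:error, :unparseable_output}
    end
  end
end
```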

When adding AI to an application, the hardest part seems to be getting a good prompt going. I've had my run-ins with https://github.com/CompVis/stable-diffusion and 99% of the work is prompting: knowing which keywords go hand-in-hand and adding or reducing the weights of some keywords.

But when getting these simple responses and answers out of an LLM is THIS easy, as shown in the video (even using an LLM to check the content of an answer instead of having to handle every edge case when parsing user input), and platforms like Azure AI make it easier still, the hardest part is being creative with it.

I was actually curious about the ~X sigil he implemented and was looking forward to learning more about it :/
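
For reference (since his implementation wasn't shown), this is how a custom sigil is defined in Elixir: any `sigil_<name>/2` function or macro picked up via `import`. This toy `~X` just squashes a multi-line prompt into one trimmed line, which is purely illustrative and not necessarily what his sigil does:

```elixir
defmodule PromptSigil do
  # ~X passes its content through untouched (uppercase sigils don't interpolate),
  # then we trim it and collapse all whitespace into single spaces.
  def sigil_X(string, _modifiers) do
    string
    |> String.trim()
    |> String.replace(~r/\s+/, " ")
  end
end

# Usage:
#   import PromptSigil
#   ~X"""
#   You are a helpful assistant.
#   Answer in one sentence.
#   """
#   #=> "You are a helpful assistant. Answer in one sentence."
```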

Generative Agents

Got SUPER excited about the third approach, self-sovereign and aware agents, and how EASY it seems to build them in Elixir!

(screenshot from the talk)

The Shinstagram example was uber-cool. Seeing the agents interact with each other whilst having their own personalities, all in real-time, looked pretty much like real life! I would LOVE to see the code for the demo; I absolutely adored this part of the presentation.
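
Just to illustrate why this maps so well onto Elixir (this is not the Shinstagram code, only a toy sketch using the same made-up `MyApp.LLM.chat/1` as above): each agent can be its own process with a personality and a short memory, waking up on a timer to "act":

```elixir
defmodule PersonaAgent do
  use GenServer

  def start_link(persona), do: GenServer.start_link(__MODULE__, persona)

  @impl true
  def init(persona) do
    schedule_tick()
    {:ok, %{persona: persona, memories: []}}
  end

  @impl true
  def handle_info(:tick, state) do
    # Ask the model what this persona does next, given its recent memories.
    thought =
      MyApp.LLM.chat([
        %{role: "system", content: "You are #{state.persona}. Decide what you do next."},
        %{role: "user", content: "Recent memories: #{Enum.join(state.memories, "; ")}"}
      ])

    schedule_tick()
    # Keep only the ten most recent memories.
    {:noreply, %{state | memories: Enum.take([thought | state.memories], 10)}}
  end

  defp schedule_tick, do: Process.send_after(self(), :tick, :timer.minutes(1))
end
```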

It's so interesting to see how this approach can affect other fields. For example, games! Characters in Red Dead Redemption 2 (apparently) had their own routines hard-coded. But imagine how much more dynamic (and thus more replayable) the world would be if each one were an LLM agent with its own set of stories/memories, and you could see them interact with each other and with the player! I know for sure this is already being implemented somewhere; I just can't wait to see it!

I've already starred https://github.com/replicate/replicate-elixir! And looking at https://replicate.com/collections/image-to-text, it's interesting to see that BLIP is available (which I previously mentioned in https://github.com/dwyl/image-classifier/issues/1). So this could very much be viable in imgup (although it's paid, so it might be easier to fall back to a local model for now).
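
A rough sketch of what an image-to-text call could look like for imgup, hitting the Replicate HTTP API directly with Req (the replicate-elixir client wraps this; `MODEL_VERSION` is a placeholder for BLIP's version hash on Replicate, and the image URL is made up):

```elixir
token = System.fetch_env!("REPLICATE_API_TOKEN")

# Create a prediction for an image-to-text model (e.g. BLIP).
resp =
  Req.post!("https://api.replicate.com/v1/predictions",
    headers: [{"Authorization", "Token #{token}"}],
    json: %{
      version: "MODEL_VERSION",
      input: %{image: "https://example.com/photo.jpg"}
    }
  )

# Predictions run asynchronously: poll resp.body["urls"]["get"] until
# "status" is "succeeded", then read the caption from "output".
```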