Does it work with a custom ChatBot trained with own DB with Langchain and OpenAI?

KoljaB / AIVoiceChat

Low latency ai companion voice talk in 60 lines of code using faster_whisper and elevenlabs input streaming

Does it work with a custom ChatBot trained with own DB with Langchain and OpenAI? #1

Open venturaEffect opened 1 year ago

venturaEffect commented 1 year ago

Hi @KoljaB !

First of all congrats on your project.

I'm really interested in using it, but I need to know whether it is possible to use it with my chatbot, which I created with my custom data in Langchain. I already use the OpenAI API to make it work, but I need to use it with my chatbot.

Is that possible?

Appreciate

KoljaB commented 1 year ago

Hi @venturaEffect !

Thanks for your kind words on the project! I must admit I'm not too familiar with Langchain. Let me try to break down your requirement.

Speech-to-Text should be compatible, since it's independent of the chatbot implementation.
Text-to-Speech: the key would be to get a streaming text output from your Langchain chatbot. To my (limited) understanding, Langchain offers some ways to do this (I can't guarantee it 100%), but if the text output can only be obtained after the full chatbot response is finalized, then streaming would be difficult.

Hope this helps, and please don't hesitate to ask if you have any further questions!

Best wishes

venturaEffect commented 1 year ago

OK, I also have no idea whether it is possible to get the text while it is being generated or only once it is completely finished.

Will have to test it and ask the Langchain community.

Thanks a lot for your help!

In any case, if you like, I can send you my code so you can test it yourself.

Appreciate

KoljaB commented 1 year ago

Sure, just send it over to my email kolja.beigel@web.de. Would be nice to learn a bit about Langchain while trying to get a streaming text output.

venturaEffect commented 1 year ago

Sent!

KoljaB commented 1 year ago

Looked into that. Langchain seems to only offer a callback for tokens, which would work like this:

from langchain.callbacks.base import BaseCallbackHandler
from langchain.llms import OpenAI

class MyCallbackHandler(BaseCallbackHandler):
    # Called by Langchain for every new token while streaming is enabled
    def on_llm_new_token(self, token, **kwargs) -> None:
        print(token, end="", flush=True)

model = OpenAI(temperature=0, streaming=True, callbacks=[MyCallbackHandler()])

So we basically have a token stream, but the thing is, elevenlabs expects a generator. I currently have no idea how to bridge that callback system to a generator properly (it may need multithreading, since the agent.run call is blocking), but this would be the key next step.
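One common way to bridge a callback to a generator is a queue fed from a worker thread. The following is only a minimal sketch of that idea, not something tested against Langchain or elevenlabs: `callback_to_generator` and `fake_llm_run` are hypothetical names, and `fake_llm_run` stands in for a blocking call such as agent.run whose callback handler forwards each token from on_llm_new_token.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the token stream


def callback_to_generator(blocking_call):
    """Run blocking_call(on_token) in a worker thread and yield its tokens.

    blocking_call is a hypothetical stand-in for something like agent.run,
    wired to a callback handler that passes each new token to on_token.
    """
    tokens = queue.Queue()

    def worker():
        try:
            blocking_call(tokens.put)  # every callback invocation enqueues a token
        finally:
            tokens.put(_DONE)  # always unblock the consumer, even on error

    threading.Thread(target=worker, daemon=True).start()

    while True:
        item = tokens.get()  # blocks until the worker delivers something
        if item is _DONE:
            return
        yield item


def fake_llm_run(on_token):
    """Stand-in for a blocking LLM call that reports tokens via a callback."""
    for token in ["Hello", ", ", "world", "!"]:
        on_token(token)


if __name__ == "__main__":
    print("".join(callback_to_generator(fake_llm_run)))
```

The worker thread only starts once the generator is first iterated, so the blocking call does not begin until the consumer actually asks for text.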

venturaEffect commented 1 year ago

I wrote to the Langchain Discord server to see if someone has found a solution. To me it seems super useful, and whoever solves it will get a lot of views, because it is easy to see that many people will want their custom chatbot to be able to hold conversations.

Let me know if you find a solution.

All the best!

KoljaB commented 1 year ago

One thing I also noticed is that the tokens returned from the callback differ from the answer returned from agent.run. Langchain does some processing on the output from the LLM. That parsing would need to be implemented on the text stream too (maybe waiting for "AI: " before returning any tokens), otherwise Elevenlabs would also speak out that "Thought: " line.
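The "wait for a marker" idea could be sketched as a small generator wrapper around the token stream. This is an illustration under assumptions: `skip_until_marker` is a hypothetical helper, and the "AI: " marker is taken from the guess above, so the real prefix depends on the agent's prompt format.

```python
def skip_until_marker(tokens, marker="AI: "):
    """Yield text from a token stream only after `marker` has appeared.

    Tokens are buffered because the marker may be split across several
    tokens (for example "A" followed by "I: ").
    """
    buffer = ""
    found = False
    for token in tokens:
        if found:
            yield token  # marker already seen: pass tokens straight through
            continue
        buffer += token
        index = buffer.find(marker)
        if index != -1:
            found = True
            rest = buffer[index + len(marker):]
            if rest:
                yield rest  # text that followed the marker inside this token


if __name__ == "__main__":
    raw = ["Thought", ": I should", " greet.\n", "A", "I: ", "Hel", "lo!"]
    print("".join(skip_until_marker(raw)))
```

If the marker never shows up, nothing is yielded at all, so a real implementation would probably need a fallback for answers that arrive without the prefix.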

venturaEffect commented 1 year ago

This could do the trick:

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI()

generator = llm.stream("Hi there")

It'll stream AIMessageChunk objects in this case, so you could change it to:

from langchain.chat_models import ChatOpenAI
from langchain.schema.output_parser import StrOutputParser

llm = ChatOpenAI() | StrOutputParser()

generator = llm.stream("Hi there")

What do you think?

KoljaB commented 1 year ago

Yes, with this code it will work. A basic example would look like this:

import os
import openai
from elevenlabs import set_api_key, generate, stream
from langchain.chat_models import ChatOpenAI
from langchain.schema.output_parser import StrOutputParser

# API keys are read from environment variables
set_api_key(os.environ.get("ELEVENLABS_API_KEY"))
openai.api_key = os.environ['OPENAI_API_KEY']

# Piping into StrOutputParser makes the stream yield plain strings
llm = ChatOpenAI() | StrOutputParser()

# Pass the text generator straight into elevenlabs and play the audio as it arrives
generator = llm.stream("Hi there")
audio_stream = generate(text=generator, voice="Nicole", model="eleven_monolingual_v1", stream=True)
stream(audio_stream)