Training Chatgpt with PDF, Doc or Website Scraper?

twilio-labs / call-gpt

Generative AI phone call toolkit using Twilio Media Streams.

MIT License

251 stars 103 forks source link

Training Chatgpt with PDF, Doc or Website Scraper? #12

Closed mbui41 closed 7 months ago

mbui41 commented 7 months ago

Hi, I just come across this project and tested it out. It worked wonder. I am wondering if we are able to train the Chatgpt with our own data such as PDF, Doc , text files or Website Scraper?

Thank you

cweems commented 7 months ago

Hi @mbui41, while this is supported via the ChatGPT Assistants, the Assistants API doesn't yet support streaming completions. This project requires streaming because it allows the bot to respond in near-real time.

With that said you can still extract text from a PDF to fine-tune a model. This post has some more details about how you might go about doing that: https://community.openai.com/t/training-with-large-pdf-files/46994/4