Note This is not intended to be production-ready, or even PoC-ready. It's just a fun experiment!
This repo contains a Python notebook that shows how you can integrate MongoDB with LlamaIndex to use your own private data with tools like ChatGPT. Your data is fed to the LLM using a technique called "in-context learning". To do so, we leverage the Mongo loader available in LlamaHub. A big part of this exercise was to demonstrate how you can use locally running models, such as HuggingFace transformers and GPT4All, instead of sending your data to OpenAI. All the code can be executed entirely on CPU.
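To give an idea of the flow, here is a minimal sketch of the loading and indexing step (the notebook is the source of truth). It assumes the LlamaHub `SimpleMongoReader` loader, a placeholder `MONGO_URI` connection string, and the `embed_model="local"` shortcut for a local HuggingFace embedding model; exact imports and arguments depend on your `llama-index` version.

```python
from llama_index import VectorStoreIndex, ServiceContext, download_loader

# Placeholder connection string -- replace with your own Atlas URI.
MONGO_URI = "mongodb+srv://<user>:<password>@<cluster>.mongodb.net/"

# Pull the Mongo loader from LlamaHub and read the movies collection.
SimpleMongoReader = download_loader("SimpleMongoReader")
reader = SimpleMongoReader(uri=MONGO_URI)
documents = reader.load_data(
    "sample_mflix",                 # database
    "movies",                       # collection
    field_names=["title", "plot"],  # fields turned into document text
)

# "local" asks LlamaIndex to embed documents with a local HuggingFace
# model, so no document text is sent to OpenAI at indexing time.
service_context = ServiceContext.from_defaults(embed_model="local")
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
```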
The steps are explained in the notebook, but in short: I leveraged the sample_mflix.movies collection, part of the sample dataset available in MongoDB Atlas. We index the documents in that collection, and on top of them I added a fictitious document for a fictitious movie called "The Paolo Picello movie", describing the life of a Solutions Architect trying to build cool apps with AI and MongoDB.
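Inserting that fictitious document can be done with a few lines of pymongo; the field names below simply mirror the sample_mflix schema, and the plot text is an example.

```python
from pymongo import MongoClient

# Placeholder connection string -- same idea as in the loading sketch above.
MONGO_URI = "mongodb+srv://<user>:<password>@<cluster>.mongodb.net/"

client = MongoClient(MONGO_URI)
movies = client["sample_mflix"]["movies"]

# A made-up document, so we can later check whether the LLM picks it up.
movies.insert_one({
    "title": "The Paolo Picello movie",
    "plot": "The life of a Solutions Architect trying to build cool apps "
            "with AI and MongoDB.",
})
```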
I then asked the system the following question:
"What is the name of the movie that talks about a computer engineer trying to build a demo of how you can leverage AI tools to answer questions around data stored in MongoDB?"
and the system answered:
The name of the movie is "PaoLo Picello".
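In code, that round trip is a single query against the index built earlier. One way to keep generation local is to plug GPT4All in through LangChain, roughly as below; the model file path, the `LangChainLLM` wrapper, and passing `service_context` into `as_query_engine` are assumptions that depend on your `llama-index`/`langchain` versions, so treat this as a sketch rather than the exact notebook wiring.

```python
from langchain.llms import GPT4All
from llama_index import ServiceContext
from llama_index.llms import LangChainLLM

# Wrap a locally downloaded GPT4All model so LlamaIndex can use it for
# generation; the .bin path below is just an example file name.
local_llm = LangChainLLM(llm=GPT4All(model="./ggml-gpt4all-j-v1.3-groovy.bin"))
service_context = ServiceContext.from_defaults(llm=local_llm, embed_model="local")

# Reuse the index built in the loading sketch and answer fully on CPU.
query_engine = index.as_query_engine(service_context=service_context)
response = query_engine.query(
    "What is the name of the movie that talks about a computer engineer "
    "trying to build a demo of how you can leverage AI tools to answer "
    "questions around data stored in MongoDB?"
)
print(response)
```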
Interestingly, the system was able to pull my name out of the indexed documents. It is not the exact title we specified in the MongoDB document ("The Paolo Picello movie"), but it's still quite an impressive result.
Note The system hallucinates quite a lot, giving pretty random results most of the time. But it's still fascinating to see it surface my name in the response.
This notebook is inspired by the LlamaIndex - Local Model Demo.ipynb notebook referenced in the LlamaIndex documentation.
We welcome comments and contributions!