MayankChaturvedi opened this issue 1 month ago
I would love to contribute to this issue @MayankChaturvedi
I love the idea!
So, to make it even more clear: if that is the workflow (a multimodal query, then retrieval, then generation), I think we should go forward. Also, it would be great to keep this simple; we would like to see a very simple notebook that does what it needs to, without overcomplicating it.
PS: I have added this issue to the main issue #43
Hello @MayankChaturvedi, as proposed by @ariG23498 in issue #55, I would love to contribute to this work as well. Let me know if you want to divide up or collaborate on any subtasks.
Hey @ariG23498, I was redirected to #47, thank you for that! I would also love to join this team, @MayankChaturvedi. Please let me know if there is space for collaboration here too!
Hi folks, thanks for your interest in the issue. We need a simple notebook. I will create a branch so that the three of us can collaborate on it. Meanwhile, I'll also come up with a distribution of tasks. Shall we collaborate in a Discord group? https://discord.gg/rhbqXsyX @ariG23498, does this setup sound good?
@MayankChaturvedi the collaboration sounds great!
Let me know if you folks need help -- the best way to reach me is through this issue. It will stay open for others to view and learn 🤗
Hi @MayankChaturvedi, I would love to collaborate on this issue. Let me know how I can contribute.
A notebook that demonstrates a multimodal RAG pipeline: it combines two types of inputs, such as text and images, to retrieve relevant information from a dataset, and then generates new outputs based on the retrieved data.
- Input: takes a text query along with an image (e.g., "Which fruit is this?").
- Retrieval: uses the image and the text to retrieve relevant documents or facts from a knowledge base or external dataset (e.g., Wikipedia articles on fruits).
- Generation: the system generates a coherent response based on the retrieved information (e.g., "This is a blueberry!").
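
For reference, here is a minimal sketch of that retrieve-then-generate flow. The CLIP checkpoint (`clip-ViT-B-32` via `sentence-transformers`), the image path, and the toy knowledge base are all illustrative assumptions for this thread, not decisions anyone has made:

```python
# Minimal sketch of the input -> retrieval -> generation flow described above.
# The model name, file path, and knowledge base are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util
from PIL import Image

# CLIP maps text and images into the same embedding space, so one model
# can serve as the retriever for both modalities.
retriever = SentenceTransformer("clip-ViT-B-32")

# Toy knowledge base standing in for an external dataset.
knowledge_base = [
    "Blueberries are small, round, dark-blue fruits that grow on shrubs.",
    "Strawberries are red fruits with seeds on their outer surface.",
    "Bananas are elongated yellow fruits rich in potassium.",
]
corpus_embeddings = retriever.encode(knowledge_base, convert_to_tensor=True)

# Retrieval: embed the query image and find the closest fact.
query_image = Image.open("fruit.jpg")  # hypothetical input image
image_embedding = retriever.encode(query_image, convert_to_tensor=True)
hits = util.semantic_search(image_embedding, corpus_embeddings, top_k=1)[0]
retrieved_fact = knowledge_base[hits[0]["corpus_id"]]

# Generation: assemble the augmented prompt; in the notebook this would be
# passed to a generator model (e.g., an instruction-tuned VLM).
prompt = (
    f"Context: {retrieved_fact}\n"
    "Question: Which fruit is this?\n"
    "Answer based on the context and the image."
)
print(prompt)
```

In the actual notebook, the last step would feed this augmented prompt (plus the image) to whichever generator model the team picks, which is exactly the part we can decide on together.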