Open shatanikmahanty opened 1 month ago
@dartpain may I work on this? I will try to achieve this by following the examples and testing it out over the coming week!
Seems like a cool idea. I do have one suggestion to make this a killer feature.
Most people might scrape X/Twitter only once in a while. But what if we followed the approach in https://github.com/arc53/DocsGPT/blob/main/application/retriever/brave_search.py instead?
That way, instead of ingesting data into a similarity-search vector DB, we could create a search query for X/Twitter and analyze current data.
@dartpain seems like a really cool addition. Will investigate the suggested integration and start on this by mid next week. Will keep updating the status here!
@dartpain after careful review of the requirements, I found that LangChain doesn't have Twitter search. They had an open issue in which they mentioned it won't be implemented because of pricing concerns. Link to the issue: https://github.com/langchain-ai/langchain/issues/11538
Although search can be integrated using the Twitter search API, I have one concern: how will we process the question the user puts forward as a prompt? With LangChain we use the run method on the search result. If we go with the Twitter API, is there anything similar we can do?
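For reference, a minimal sketch of what a direct call to the Twitter API v2 recent-search endpoint might look like, using only the standard library. The helper name `build_recent_search_request` is hypothetical and not part of DocsGPT or LangChain; the function only builds the request and does not send it, since sending requires a valid bearer token.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Twitter API v2 recent-search endpoint.
RECENT_SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_recent_search_request(query: str, bearer_token: str, max_results: int = 10) -> Request:
    """Build (but do not send) a recent-search request. Hypothetical helper."""
    # The endpoint accepts between 10 and 100 results per page.
    if not 10 <= max_results <= 100:
        raise ValueError("max_results must be between 10 and 100")
    params = urlencode({"query": query, "max_results": max_results})
    request = Request(f"{RECENT_SEARCH_URL}?{params}")
    # App-only authentication uses an OAuth 2.0 bearer token.
    request.add_header("Authorization", f"Bearer {bearer_token}")
    return request
```

The returned object can be passed to `urllib.request.urlopen` once a working token is available; the JSON response would then need to be parsed before handing the tweet texts to an LLM.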
I suggest you use an LLM to generate the search query and then pass it to the search API.
I see, thanks for the suggestion. I will use it to generate search queries. Once we have the search results, I plan to pass them to the LLM again and summarise them to give a readable answer.
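The flow discussed here (LLM writes the query, the query goes to a search backend, the LLM summarises the results) could be sketched roughly as follows. This is an illustration under assumptions, not the DocsGPT implementation: `answer_from_tweets`, the `LLM` and `Search` type aliases, and the prompt wording are all hypothetical, and the LLM and search backend are passed in as plain callables so the sketch runs without any API key.

```python
from typing import Callable, List

LLM = Callable[[str], str]            # prompt -> completion (hypothetical interface)
Search = Callable[[str], List[str]]   # search query -> tweet texts (hypothetical interface)


def answer_from_tweets(question: str, llm: LLM, search: Search) -> str:
    # Step 1: ask the LLM to turn the question into a compact search query.
    query = llm(
        "Rewrite the question below as a short Twitter search query. "
        "Reply with the query only.\n\nQuestion: " + question
    ).strip().strip('"')
    # Step 2: run the query against whatever search backend is plugged in.
    tweets = search(query)
    if not tweets:
        return "No matching tweets were found."
    # Step 3: pass the results back to the LLM for a readable summary.
    context = "\n".join(f"- {t}" for t in tweets)
    return llm(
        "Answer the question using only these tweets.\n\n"
        f"Tweets:\n{context}\n\nQuestion: {question}"
    )
```

Keeping the search step behind a callable means the same flow can be exercised with a stub while paid API access is unavailable, and swapped for a real client later.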
@dartpain I was trying to use the classic RAG to generate a Twitter query in my local setup, but it kept generating the same output: the project contribution guide and some other content pointing to the DocsGPT GitHub. By using an LLM, did you mean something else?
thank you!
@dartpain thanks for the additional context on LLMs. I was able to generate a search term for Twitter using the LLM, but when I tried to access the Twitter API I found out that it can only be used by paid-plan subscribers. If anyone is willing to provide me with an API key to test with, I can create a PR. Meanwhile, I will draft a PR with my current work and highlight the blockers, so that anyone with access to the paid API who wants to continue with the rest of the PR can go ahead. Thanks again for letting me work on this 🚀
🔖 Feature description
Add new remote ingestion method from Twitter
🎤 Why is this feature needed?
It will allow users to ingest data from Twitter
✌️ How do you aim to achieve this?
I plan to use
https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.twitter.TwitterTweetLoader.html
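A rough sketch of how the linked loader might be used, assuming `langchain_community` and its optional `tweepy` dependency are installed and a valid bearer token is available. The wrapper `load_recent_tweets` and its parameter names are hypothetical; only `TwitterTweetLoader.from_bearer_token` and `load` come from LangChain.

```python
from typing import List


def load_recent_tweets(bearer_token: str, usernames: List[str], per_user: int = 50):
    """Hypothetical helper: load recent tweets for some accounts as LangChain documents."""
    # Imported lazily so this module can be imported without the optional dependencies.
    from langchain_community.document_loaders import TwitterTweetLoader

    loader = TwitterTweetLoader.from_bearer_token(
        oauth2_bearer_token=bearer_token,
        twitter_users=usernames,
        number_tweets=per_user,
    )
    # Each returned document carries the tweet text in page_content.
    return loader.load()
```

The resulting documents could then go through the same chunking and embedding steps as other remote sources.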
🔄️ Additional Information
No response
👀 Have you spent some time to check if this feature request has been raised before?
Are you willing to submit PR?
Yes I am willing to submit a PR!