Closed: aaronsteers closed this issue 5 months ago
Hello @aaronsteers, I am interested in this issue. Can you assign it to me? Should I post the blog on Medium or Hashnode?
It is yours! @Jeeesrw322 we're going to post it on the Airbyte blog, but you can also post it on other sites.
@aaronsteers @bindipankhudi Do we need to use PyAirbyte and LangChain for this? I did not get the full picture. Can you explain this further?
@Jeeesrw322 I have added more details to the ticket. I hope it is clear now. Please do not hesitate to reach out if you have more questions.
@Jeeesrw322 if you have not started working on this issue, can I take this one? I have already worked on a very similar issue.
@avirajsingh7 Yes, you can take this issue
@bindipankhudi please assign this issue to me. I will be creating a PR soon.
@avirajsingh7 assigned to you.
Summary
Many Airbyte users want to scrape data from websites and feed it into their LLMs. The Apify source can assist with this, but few user guides are available right now. The goal of this tutorial is to show users how to scrape data with Apify, how to set up the Apify source using PyAirbyte, and how to load the data into a vector store using LangChain.
Description
This task involves the following steps:
Definition of Done
Blog post or Python notebook. When providing a Python notebook, please add a "What / Why / How" blurb at the top to explain what the code does.
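For a notebook deliverable, the pipeline from the Summary (scrape with Apify, read through PyAirbyte, load into a vector store with LangChain) could start from a sketch like the one below. It is an illustration under assumptions, not the tutorial itself: the connector name `source-apify-dataset`, the config keys `token` and `dataset_id`, and the record fields `url` and `text` are assumptions to verify against the connector documentation, and the helper names are hypothetical. The chunking helpers use only the standard library; the PyAirbyte and LangChain imports are deferred so the helpers can be tried without either package installed.

```python
"""What: turn Apify crawl results into chunked documents for a vector store.
Why: LLM apps need scraped web text split into embeddable pieces.
How: read records via PyAirbyte, chunk the text, wrap chunks as LangChain Documents.
"""
from typing import Any, Dict, Iterable, List


def read_apify_records(token: str, dataset_id: str, stream: str) -> Iterable[Dict[str, Any]]:
    """Read one stream from an Apify dataset through PyAirbyte."""
    import airbyte as ab  # deferred import: needs `pip install airbyte`

    source = ab.get_source(
        "source-apify-dataset",  # ASSUMPTION: connector name, check the registry
        # ASSUMPTION: config keys, check the connector spec before use
        config={"token": token, "dataset_id": dataset_id},
        install_if_missing=True,
    )
    source.check()                   # validate credentials and config
    source.select_streams([stream])  # read only the requested stream
    result = source.read()
    return (dict(record) for record in result[stream])


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last window is not just repeated overlap.
    stop = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


def records_to_chunks(
    records: Iterable[Dict[str, Any]],
    text_field: str = "text",  # ASSUMPTION: field holding the page text
    url_field: str = "url",    # ASSUMPTION: field holding the page URL
    chunk_size: int = 500,
    overlap: int = 50,
) -> List[Dict[str, Any]]:
    """Convert scraped records into chunk dicts with source metadata."""
    chunks: List[Dict[str, Any]] = []
    for record in records:
        text = (record.get(text_field) or "").strip()
        for index, piece in enumerate(chunk_text(text, chunk_size, overlap)):
            chunks.append({
                "page_content": piece,
                "metadata": {"source": record.get(url_field), "chunk": index},
            })
    return chunks


def to_langchain_documents(chunks: List[Dict[str, Any]]) -> list:
    """Wrap chunk dicts as LangChain Document objects."""
    from langchain_core.documents import Document  # deferred import

    return [
        Document(page_content=c["page_content"], metadata=c["metadata"])
        for c in chunks
    ]


if __name__ == "__main__":
    # Stand-in records so the chunking step can be tried without credentials.
    sample = [{"url": "https://example.com", "text": "Airbyte " * 200}]
    for chunk in records_to_chunks(sample):
        print(chunk["metadata"], len(chunk["page_content"]))
```

The resulting documents can then be passed to a LangChain vector store's `from_documents(documents, embedding)` constructor together with an embedding model of your choice. Fixed-size character windows are used here only to keep the sketch dependency-free; a LangChain text splitter may be a better fit in the actual tutorial.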
Resources to Assist