I am looking into LangChain, and there are so many tools available (document loaders). I would like this solution to go online and scrape websites as an alternative to loading documents. What would be the easiest way to do that?
Hi, you can look at tools like Scrapy, but you should also check each website's copyright/license, robots.txt, etc. to see whether you are allowed to scrape it.
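For the robots.txt part, here is a minimal sketch using Python's standard library `urllib.robotparser`. The sample robots.txt content, the `is_allowed` helper, the `my-scraper` user agent, and the example.com URLs are all made up for illustration, so the snippet runs without network access. It only covers robots.txt; license and terms of use still need a manual check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "my-scraper") -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A page outside the disallowed path, then one inside it.
    print(is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/docs/page.html"))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/private/data.html"))
```

Against a real site you would instead call `parser.set_url("https://<site>/robots.txt")` followed by `parser.read()` to download the rules, and only pass a URL on to your scraper or loader when `can_fetch` returns True.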