Open tylerkennedy opened 11 months ago
The change makes sense to me. I am not one of the maintainers of the repository, but one thing I would add would be to make the same change in the jupyter notebook.
@ItamarRocha Thanks, good call. I updated the jupyter notebook
I changed the domain and root URL to crawl a different website. After changing the domain to something other than openai.com, I started getting a 403 error on the web requests made by the crawler:
HTTP Error 403: Forbidden
The simple fix was to add a user agent to the web requests in the crawler. This should prevent the issue for anyone else who tries to run the crawler on a different domain