Open p-acDev opened 2 years ago
BeautifulSoup parses data out of HTML and XML files.
That means it is a static way of extracting data: it only sees the markup it is given and does not run JavaScript. Because of that, it does not work on web pages that load their results dynamically, for example a page like yours that has a "load more" button instead of a page-number bar.
One solution is to use a library that drives a real browser, so the dynamically loaded content is on the page before you scrape it.
Selenium is one such library that might satisfy your needs.
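A minimal sketch of that idea, assuming Selenium 4 with a local Chrome install. The helper names `click_until_gone` and `load_full_page` are made up for illustration, and the CSS selector for the button is something you would have to find by inspecting the page yourself; I have not checked what GitHub uses.

```python
import time


def click_until_gone(driver, button_selector, pause=1.0, max_clicks=50):
    """Click the first element matching button_selector until none is left.

    Returns the number of clicks performed. `driver` is any object with a
    Selenium-style find_elements(by, value) method.
    """
    clicks = 0
    while clicks < max_clicks:
        # "css selector" is the string value behind Selenium's By.CSS_SELECTOR.
        buttons = driver.find_elements("css selector", button_selector)
        if not buttons:
            break  # the button disappeared, so everything is loaded
        buttons[0].click()
        clicks += 1
        time.sleep(pause)  # crude wait for the next batch to render
    return clicks


def load_full_page(url, button_selector):
    """Open url in Chrome, expand all results, and return the final HTML."""
    from selenium import webdriver  # imported here so the helper above has no dependency

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        click_until_gone(driver, button_selector)
        return driver.page_source
    finally:
        driver.quit()
```

The string returned by `load_full_page` can then be passed to `BeautifulSoup(html, "html.parser")` as before. The fixed `time.sleep` is the simplest possible wait and may need tuning; if the button is covered by another element the click can also fail, so treat this as a starting point.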
Hello @AlexMihalache99, hmm, OK, thanks a lot. I will create a new branch and investigate this library, which I have never used. Also, feel free to implement it if you wish =D.
Hi @p-acDev, to get all the repositories directly, without manually clicking the "see more" button, when using Beautiful Soup for topic filtering on GitHub, you can use the GitHub API. The API lets you search repositories by topic and returns the matching repositories, with their metadata, as paginated JSON.
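A rough sketch of what that could look like with only the standard library, assuming the public search endpoint `https://api.github.com/search/repositories` and its `topic:` qualifier. The function names are made up for illustration, and unauthenticated requests are rate limited, so you may want to add a token header for real use.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.github.com/search/repositories"


def build_search_url(topic, page=1, per_page=100):
    """Build the search URL for one page of repositories tagged with topic."""
    query = urllib.parse.urlencode(
        {"q": f"topic:{topic}", "page": page, "per_page": per_page}
    )
    return f"{API_URL}?{query}"


def fetch_json(url):
    """GET url and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def repos_for_topic(topic, fetch=fetch_json, per_page=100, max_pages=10):
    """Collect the full names (owner/repo) of repositories for a topic.

    Stops at the first page that comes back shorter than per_page,
    or after max_pages pages.
    """
    names = []
    for page in range(1, max_pages + 1):
        data = fetch(build_search_url(topic, page, per_page))
        items = data.get("items", [])
        names.extend(item["full_name"] for item in items)
        if len(items) < per_page:
            break  # last page reached
    return names
```

Calling `repos_for_topic("web-scraping")` would return a list of `owner/repo` strings. The `fetch` parameter is only there so the paging logic can be tried without hitting the network. Note that the search API caps results at 1000 per query, which is why `max_pages` defaults to 10 at 100 per page.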
When applying the topic filter, GitHub only renders a few repos. In the Beautiful Soup process, how can we directly get all of the repos without manually clicking the "see more" button?