Bing is a well-known search engine, and since we simply wanted another popular search engine to compare against, Bing Search is an option.
Cons
However, after examining the pricing, I found that only 1,000 transactions per month are free, which is approximately 33 free requests per day. Each test (QnA) file needs 2 requests.
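To make the quota arithmetic concrete, here is a small sketch. The function names are illustrative, a 30-day month is assumed, and the 20-file test set is taken from the Google API estimate in this description.

```python
# Rough quota arithmetic for the Bing free tier.
# All names are illustrative; a 30-day month is an assumption.
FREE_TRANSACTIONS_PER_MONTH = 1_000
REQUESTS_PER_FILE = 2   # from the description: 2 requests per test (QnA) file
DAYS_PER_MONTH = 30     # assumption


def free_requests_per_day(monthly_quota: int = FREE_TRANSACTIONS_PER_MONTH) -> int:
    """Free requests available per day, rounded down."""
    return monthly_quota // DAYS_PER_MONTH


def free_files_per_month(monthly_quota: int = FREE_TRANSACTIONS_PER_MONTH,
                         requests_per_file: int = REQUESTS_PER_FILE) -> int:
    """Number of test files that fit in the monthly free quota."""
    return monthly_quota // requests_per_file


def free_runs_per_month(files_per_run: int,
                        monthly_quota: int = FREE_TRANSACTIONS_PER_MONTH,
                        requests_per_file: int = REQUESTS_PER_FILE) -> int:
    """Number of complete test runs that fit in the monthly free quota."""
    return monthly_quota // (files_per_run * requests_per_file)


if __name__ == "__main__":
    print(free_requests_per_day())   # requests per day
    print(free_files_per_month())    # files per month
    print(free_runs_per_month(20))   # full runs of a 20-file test set
```

Under these assumptions, a 20-file test set costs 40 requests per run, so the free tier covers a limited number of full runs each month.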
Tasks
[x] Refactored snippet of code
[x] Bing Global Web Search integration.
[x] Bing Search filtered to return results only from the website inspection.canada.ca.
[x] Revised the accuracy score calculation to consider multiple URLs for the same query.
[x] Developed a script to convert an Excel file into JSON files.
[x] Inclusion of a new data table for the highest and null scores.
[x] Fixed response time variations
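For reviewers unfamiliar with the site filter, one way to restrict Bing results to a single website is the `site:` query operator. The sketch below only builds the request for the Bing Web Search API v7 endpoint and does not send it. The function name and defaults are hypothetical, and the code in this PR may implement the filter differently.

```python
from urllib.parse import urlencode

# Bing Web Search API v7 endpoint.
BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"


def build_bing_request(query: str,
                       subscription_key: str,
                       site: str = "inspection.canada.ca",
                       count: int = 10):
    """Return (url, headers) for a Bing web search limited to one site.

    Hypothetical helper: the `site:` operator is appended to the query text
    so that only pages from `site` are returned.
    """
    params = {"q": f"{query} site:{site}", "count": count}
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    return f"{BING_ENDPOINT}?{urlencode(params)}", headers


if __name__ == "__main__":
    url, headers = build_bing_request("food recall", "YOUR_KEY")
    print(url)
```

The returned URL and headers can be passed to any HTTP client.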
Closes
closes #6
closes #11
closes #10
Alternative considered
Google API
Since the Google Search API returns at most 10 results per request and we have at least 20 files, each needing 100 results for testing, we need at least 200 requests (20 files × 10 requests each) to obtain all the answers. Google offers only 100 free requests per day. Beyond that, $5 is charged per 1,000 requests, which comes to around $1 per test.
num
integer: Number of search results to return. Valid values are integers between 1 and 10, inclusive.
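The paging and cost estimate can be sketched as follows. This assumes results are paged with the API's 1-based `start` parameter together with `num`; the helper names are illustrative, and the cost figure ignores the free daily quota, matching the rough estimate above.

```python
import math
from typing import Dict, List

MAX_RESULTS_PER_REQUEST = 10  # upper bound of `num`


def page_params(total_results: int,
                page_size: int = MAX_RESULTS_PER_REQUEST) -> List[Dict[str, int]]:
    """Return the `start`/`num` pairs needed to collect `total_results` results."""
    pages = []
    for start in range(1, total_results + 1, page_size):
        num = min(page_size, total_results - start + 1)
        pages.append({"start": start, "num": num})
    return pages


def requests_needed(files: int, results_per_file: int,
                    page_size: int = MAX_RESULTS_PER_REQUEST) -> int:
    """Total API requests for a full test run."""
    return files * math.ceil(results_per_file / page_size)


def cost_usd(requests: int, price_per_1000: float = 5.0) -> float:
    """Cost of `requests` paid requests at the given price per 1,000."""
    return requests * price_per_1000 / 1000


if __name__ == "__main__":
    total = requests_needed(files=20, results_per_file=100)
    print(total, cost_usd(total))
```

Each dictionary returned by `page_params` could be merged into the query parameters of one search request.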
Library tested
google-api-python-client
Issues consulted
Why Does The Google Search API Disallow More Than 100 Results? How Can I Get More?
Google web scraping
Web scraping was attempted because it allows querying completely free of charge. However, Google has stringent security measures that limit the number of requests. Since Google displays only 10 results at a time and we have at least 20 files, each needing 100 results for testing, we need at least 200 Google requests to obtain all the answers. Even with time delays between requests, the 200 requests never all succeed, and the machine's IP address ends up blocked for a while. We must then wait more than 30 minutes, or use proxies or VPNs to work around the block. Today, web scraping is complex and only feasible on a small scale. Doing it on a large scale would require several VPNs, switching between them to remain undetected.
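For illustration, a delay strategy of the kind described above could look like the sketch below. It uses exponential backoff on HTTP 429, which is a common pattern and not necessarily the delay scheme that was tried here. The `fetch` callable is a hypothetical stand-in for whatever performs the actual request. As noted above, delays alone did not prevent blocking at 200 requests, so this is not a fix.

```python
import time
from typing import Callable, Optional, Tuple


def fetch_with_backoff(fetch: Callable[[], Tuple[int, str]],
                       max_attempts: int = 5,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call `fetch` until it returns HTTP 200, backing off on HTTP 429.

    `fetch` is a hypothetical callable returning (status_code, body).
    The delay doubles after each 429: base_delay, 2*base_delay, 4*base_delay...
    Returns the body on success, or None if every attempt was rate limited.
    """
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status != 429:
            raise RuntimeError(f"unexpected HTTP status {status}")
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None


if __name__ == "__main__":
    # Simulated responses: two rate-limit errors, then success.
    responses = iter([(429, ""), (429, ""), (200, "<html>results</html>")])
    print(fetch_with_backoff(lambda: next(responses), base_delay=0.01))
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.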
Libraries tested
abenassi/Google-Search-API
Nv7-GitHub/googlesearch
Issues consulted
GitHub issue.
How to fix python requests module 429 error for google search?
Error 429 with simple query on google with requests python
Problem encountered