privacy-tech-lab / gpc-web-crawler

Web crawler for detecting websites' compliance with GPC privacy preference signals at scale
https://privacytechlab.org/
MIT License

Perform October Crawl #118

Open franciscawijaya opened 3 months ago

franciscawijaya commented 3 months ago

I will be performing the July crawl. Taking into account the duration of the June crawl and other upcoming tasks in July, I am planning to start the Crawl around 8th of July so that we have ample time for any re-do (if any of the batches fail), the analysis, creating figures, and adding the results to the planned datasets for the community resource. After the July crawl is done, I will move on to my other tasks for July.

SebastianZimmeck commented 3 months ago

I am planning to start the Crawl around 8th of July

If it fits with your planning, I'd suggest giving it another week and starting the crawl around July 15. The reason is that the time period between crawls is usually two months. Adding one more week in this case keeps the time periods a bit more aligned. The more time between crawls, the more likely we will see differences.

SebastianZimmeck commented 1 month ago

Given that we first need to address #122, the goal is to start the next crawl by September 2.

SebastianZimmeck commented 3 weeks ago

We should start the crawl this week or next.

SebastianZimmeck commented 2 weeks ago

We should crawl for both California and Colorado.

SebastianZimmeck commented 2 weeks ago

@franciscawijaya will start the California crawl this week and afterwards @natelevinson10 will start the Colorado crawl.

SebastianZimmeck commented 2 weeks ago

Adding @eakubilo as special advisor. 😄