Since April, accessing data in the Amazon cloud through the S3 API requires authentication. Unsigned access to Common Crawl is therefore disabled, and the _downloadpages.py script no longer works because it uses the unsigned config.
Removing the Config is enough to make it work.
@Baaart25 Thanks for reporting, I have just discovered this myself. A fix is in the works, but I plan to ditch AWS in favor of CloudFront so that we don't need boto anymore.
https://commoncrawl.org/2022/03/introducing-cloudfront-access-to-common-crawl-data/
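
A minimal sketch of what boto-free access over CloudFront might look like, using only the standard library. It assumes files are fetched over HTTPS from `https://data.commoncrawl.org/` and that single records are read with an HTTP `Range` header; the function names and the example path are hypothetical, and this is not the fix planned for the script.

```python
import urllib.request

# CloudFront endpoint for Common Crawl data; no AWS credentials required.
CC_BASE_URL = "https://data.commoncrawl.org/"


def build_request(path, offset=None, length=None):
    """Hypothetical helper: build an HTTPS request for a Common Crawl file.

    If offset and length are given, only that byte range is requested,
    which is how a single record can be read out of a large WARC file.
    """
    url = CC_BASE_URL + path.lstrip("/")
    headers = {}
    if offset is not None and length is not None:
        # HTTP byte ranges are inclusive on both ends.
        headers["Range"] = f"bytes={offset}-{offset + length - 1}"
    return urllib.request.Request(url, headers=headers)


def fetch(path, offset=None, length=None, timeout=30):
    """Download the file (or the requested byte range) and return raw bytes."""
    request = build_request(path, offset, length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Placeholder path for illustration only; substitute a real path from an index.
example = build_request("crawl-data/EXAMPLE/file.warc.gz", offset=100, length=50)
```

Keeping the request construction separate from the download makes the range arithmetic easy to check without touching the network.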