scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.
https://scrapy.org
BSD 3-Clause "New" or "Revised" License
50.99k stars 10.34k forks

Handle robots.txt files not UTF-8 encoded #6298

Closed lorenzoverardo closed 1 month ago

lorenzoverardo commented 1 month ago

Fixes https://github.com/scrapy/scrapy/issues/6292.

cc @Gallaecio
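For readers without the diff at hand, here is a minimal sketch of the kind of handling the title describes. It is not the code from this PR: the helper name `decode_robotstxt_body` and the choice to drop undecodable bytes are assumptions made for illustration, and the actual change in `scrapy/robotstxt.py` may differ.

```python
def decode_robotstxt_body(body: bytes) -> str:
    """Decode a robots.txt body leniently (hypothetical helper, not Scrapy's API).

    Assumption: bytes that are not valid UTF-8 are dropped instead of
    causing the whole file to be rejected. "utf-8-sig" also strips a
    leading byte order mark if one is present.
    """
    return body.decode("utf-8-sig", errors="ignore")


# A Latin-1 encoded "é" (0xE9) is not valid UTF-8 in this position.
raw = b"User-agent: *\nDisallow: /caf\xe9\n"
text = decode_robotstxt_body(raw)
print(text.splitlines())
```

Dropping invalid bytes keeps the remaining rules usable, at the cost of altering any path that contained them. Whether that trade-off is acceptable depends on the crawler.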

codecov[bot] commented 1 month ago

Codecov Report

Merging #6298 (7b37dcd) into master (02b97f9) will not change coverage. Report is 1 commit behind head on master. The diff coverage is 100.00%.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##           master    #6298   +/-   ##
=======================================
  Coverage   88.88%   88.88%
=======================================
  Files         161      161
  Lines       11971    11971
  Branches     1929     1929
=======================================
  Hits        10640    10640
  Misses        980      980
  Partials      351      351
```

| [Files](https://app.codecov.io/gh/scrapy/scrapy/pull/6298?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=scrapy) | Coverage Δ | |
|---|---|---|
| [scrapy/robotstxt.py](https://app.codecov.io/gh/scrapy/scrapy/pull/6298?src=pr&el=tree&filepath=scrapy%2Frobotstxt.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=scrapy#diff-c2NyYXB5L3JvYm90c3R4dC5weQ==) | `85.39% <100.00%> (ø)` | |

wRAR commented 1 month ago

Thanks!