rivermont / spidy

The simple, easy to use command line web crawler.
GNU General Public License v3.0

Update for Python 3.9+, deprecate Reppy dependency #89

Open rivermont opened 1 year ago

rivermont commented 1 year ago

Reppy doesn't work on Python versions newer than 3.8 (see seomoz/reppy#122 and seomoz/reppy#132), which means our robots.txt parser is broken (#81). Python 3.8 also reaches end-of-life next year, so this needs to happen anyway.

trs-eric commented 1 year ago

Confirmed on Debian 12.

rivermont commented 11 months ago

Will replace Reppy with scrapy/protego.
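
For anyone picking this up, here is a rough sketch of what a Protego-based robots.txt check could look like. This is not spidy's actual code: the helper name `is_allowed`, the `"spidy"` user-agent string, and the sample rules are all made up for illustration. It assumes Protego is installed (`pip install protego`) and falls back to the standard library's `urllib.robotparser` if it is not, so the snippet still runs either way.

```python
from urllib.robotparser import RobotFileParser

try:
    from protego import Protego  # third-party: pip install protego
except ImportError:
    Protego = None  # fall back to the standard library below


def is_allowed(robots_txt: str, url: str, user_agent: str = "spidy") -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; not part of spidy.
    """
    if Protego is not None:
        # Protego takes the URL first, then the user agent.
        return Protego.parse(robots_txt).can_fetch(url, user_agent)
    # Standard-library fallback; note the reversed argument order.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules, only to exercise the helper.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "https://example.com/index.html"))
    print(is_allowed(ROBOTS_TXT, "https://example.com/private/page.html"))
```

One thing to watch during the migration: Protego's `can_fetch(url, user_agent)` and the standard library's `can_fetch(useragent, url)` take their arguments in opposite order, so call sites need checking rather than a blind search-and-replace.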