manolomartinez / greg

A command-line podcast aggregator
GNU General Public License v3.0

403 errors (uncaught) on all Pinecast-hosted feeds #118

Closed by n8willis 2 years ago

n8willis commented 2 years ago

Starting sometime in the middle of this week, greg sync began crashing on every feed hosted at Pinecast. In interactive mode, the cause appears to be an HTTP 403 error:

File "/usr/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

This is some sort of feed-handling error on greg's side: I have verified that all of the feeds are correct, that all of the media enclosures are present, and that everything downloads without incident in other podcast aggregators.

It happens after greg fetches the feed, at the point where it attempts to start the download.
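For illustration only, the uncaught error at the download step could be handled along these lines. This is a minimal sketch, not greg's actual code: the function name try_download and its opener parameter are hypothetical, and the only library pieces used are urllib.request.urlopen and urllib.error.HTTPError from the traceback above.

```python
import urllib.error
import urllib.request


def try_download(url, opener=urllib.request.urlopen):
    """Fetch url and return (data, None), or (None, message) on an HTTP error.

    try_download and the opener parameter are hypothetical names used
    for this sketch; opener exists only so the call can be swapped out.
    """
    try:
        with opener(url) as response:
            return response.read(), None
    except urllib.error.HTTPError as err:
        # Report the status instead of letting the exception end the sync.
        return None, "HTTP Error {}: {}".format(err.code, err.reason)
```

With something like this, a 403 on one enclosure would produce a message for that episode rather than a crash of the whole sync.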

Looking at an example feed (e.g., https://pinecast.com/feed/theleagueof), I do note the presence of &amp; markup in the MP3 file URLs (after a ?) as well as = ..., which seems like a potential starting point for investigation, since greg has historically had so many issues with string parsing. But I don't see anything else that looks suspicious in any of the affected feeds.

manolomartinez commented 2 years ago

It's not string parsing (this time); the problem is that urllib.request.urlopen isn't smart enough. Using requests solves this, as the linked PR illustrates.

I should probably just make my life easier and switch to requests.
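The thread does not say exactly what urlopen gets wrong. One common reason for a 403 from a host is that it rejects the default Python-urllib User-Agent; that is an assumption here, not something confirmed for Pinecast. Under that assumption, a standard-library sketch of sending an explicit User-Agent might look like this (USER_AGENT, build_request and default_user_agent are hypothetical names):

```python
import urllib.request

# Hypothetical identifier; any descriptive string would do.
USER_AGENT = "greg-example/0.1"


def build_request(url, user_agent=USER_AGENT):
    """Build a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def default_user_agent():
    """Return the User-Agent that a stock urllib opener would send."""
    opener = urllib.request.build_opener()
    return dict(opener.addheaders).get("User-agent")
```

The result of build_request(url) can be passed to urllib.request.urlopen in place of the bare URL. With requests, the same header could be supplied as requests.get(url, headers={"User-Agent": USER_AGENT}), although requests already sends its own default User-Agent, which may be why switching libraries was enough.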