manolomartinez / greg

A command-line podcast aggregator
GNU General Public License v3.0

Not following 302 redirects #98

Closed battis closed 4 years ago

battis commented 4 years ago

It appears that greg isn't following 302 redirects in feeds. The particular case in point that I'm seeing is The Bugle. For the purposes of illustration, let's take the most reasonable, but also the most redirected version of the situation.

Subscribe to http://feeds.feedburner.com/thebuglefeed. All of the media attachments are actually hosted by Acast. For example, episode 4162 has a media enclosure link of https://feeds.acast.com/public/streams/5e7b777ba085cbe7192b0607/episodes/5f341b083120c8645db8bc9b.mp3, which generates an error from greg:

Downloading 4162 - Bond, Boris and Boats -- 5f341b083120c8645db8bc9b.mp3
... something went wrong. Are you connected to the internet?

I did some digging with curl, and I observe that Acast seems to be doing load-balancing of some sort, since this request:

curl -v https://feeds.acast.com/public/streams/5e7b777ba085cbe7192b0607/episodes/5f341b083120c8645db8bc9b.mp3

generates the following response:

> GET /public/streams/5e7b777ba085cbe7192b0607/episodes/5f341b083120c8645db8bc9b.mp3 HTTP/2
> Host: feeds.acast.com
> User-Agent: curl/7.54.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 302
< content-length: 0
< location: https://assets-do.pippa.io/shows/5e7b777ba085cbe7192b0607/1597249367402-89f22c0f843255c028c73b988db40a2b.mp3
< access-control-allow-methods: GET,OPTIONS
< access-control-allow-origin: *
< cache-control: no-cache
< date: Sat, 15 Aug 2020 14:31:08 GMT
< server: nginx/1.14.0 (Ubuntu)
< x-frame-options: sameorigin
< x-request-id: Z+Blgq8rS/J6XdG3
< x-cache: Miss from cloudfront
< via: 1.1 980d2a1c9c4f90ad69118c6357f92882.cloudfront.net (CloudFront)
< x-amz-cf-pop: BOS50-C1
< x-amz-cf-id: CPTpFkQg9zvXkfwFfDm9QwBdC0QGz7I-L676xZnEcUilbsOnM-TsUA==
<
* Connection #0 to host feeds.acast.com left intact

When I follow the 302 redirect location, I do get the actual file.
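For anyone who wants to check the redirect chain without curl, here is a minimal Python sketch using only the standard library. It is not greg's code. The function name `resolve_final_url` and its `opener` and `timeout` parameters are made up for illustration. It relies on `urllib.request` following 302 responses by default and reports the URL it ends up at.

```python
import urllib.request


def resolve_final_url(url, opener=None, timeout=10.0):
    """Return the URL reached after following any HTTP redirects.

    Hypothetical helper for illustration only; not part of greg.
    """
    # The default opener includes a redirect handler, so a 302 is followed.
    opener = opener or urllib.request.build_opener()
    # The body is never read, so the media file is not fully downloaded.
    with opener.open(url, timeout=timeout) as response:
        return response.geturl()
```

Calling this on the Acast enclosure link above should return whatever `location` the server hands back at that moment, assuming it still answers with a 302.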

battis commented 4 years ago

Huh. So... while this is all technically true, a little more digging reveals that this is a great use case for the download handler option in greg.conf:

[The Bugle]
downloadhandler = wget {link} -P {directory}
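To make the placeholders concrete, here is a rough Python sketch of how a handler template like the one above could be turned into a command. This is an assumption about the general idea, not greg's actual implementation. The function name `expand_handler` is hypothetical, and only the `{link}` and `{directory}` placeholders from the config line are handled.

```python
import shlex


def expand_handler(template, link, directory):
    """Fill {link} and {directory} in a handler template and split it into argv.

    Hypothetical illustration; greg's real substitution logic may differ.
    """
    # Quote each value so spaces in a path survive the later split.
    command = template.format(
        link=shlex.quote(link),
        directory=shlex.quote(directory),
    )
    # Split into an argument list suitable for subprocess.run(args).
    return shlex.split(command)
```

With the template `wget {link} -P {directory}`, the result is an argument list starting with `wget`, which leaves redirect handling to wget instead of the built-in downloader.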