snarfed / huffduff-video

📺 Extract the audio from videos on YouTube, Vimeo, and other sites and send it to Huffduffer.
https://huffduff-video.snarfed.org/

403 from s3 bucket #21

Closed cmdln closed 7 years ago

cmdln commented 8 years ago

The last couple of videos I snagged (as cmdln on Huffduffer) return a 403 when I try to download them.

snarfed commented 8 years ago

sorry for the trouble! i was able to download a few just now ok. S3 403s if the file doesn't exist. were the videos you tried over 30d old? ie from when you originally used huffduff-video on them? huffduff-video deletes files after 30d, so if so, that might be it.

cmdln commented 8 years ago

Thanks for the fast response. I added these files yesterday so they should still be within the 30d window.

snarfed commented 8 years ago

could you post an example url?

cmdln commented 8 years ago

https://huffduffer.com/cmdln/364067 That's an entry in my feed whose download link results in the 403.

cmdln commented 8 years ago

I just added another video, and the file for that one works fine.

snarfed commented 8 years ago

whoa, yeah, that URL is unhappy. looks like it downloaded from https://www.oreilly.com/learning/how-do-you-subtract-two-dates-in-java, but the huffduff-video URL is way different:

https://huffduff-video.s3-us-west-2.amazonaws.com/kaltura_-1681692_0_mlq5pb0f_youtubedl_smuggle=%7B%22service_url%22%3A+%22https%3A%2F%2Fcdnapisec.kaltura.com%22%2C+%22source_url%22%3A+%22https%3A%2F%2Fwww.oreilly.com%2Flearning%2Fhow-do-you-subtract-two-

i'm trying again, just to see.

snarfed commented 8 years ago

aha, i see. the retry ended up with the same S3 URL. the actual file in S3 is doubly url escaped though:

https://s3-us-west-2.amazonaws.com/huffduff-video/kaltura_-1681692_0_mlq5pb0f_youtubedl_smuggle%3D%257B%2522service_url%2522%253A%2B%2522https%253A%252F%252Fcdnapisec.kaltura.com%2522%252C%2B%2522source_url%2522%253A%2B%2522https%253A%252F%252Fwww.oreilly.com%252Flearning%252Fhow-do-you-subtract-two-dates-in-java%2522%257D.mp3

thanks for reporting! i'll fix this soon.
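For readers following along, here is a minimal sketch of why the S3 URL above looks doubly escaped. It assumes the object key itself already contains percent-escapes, as the file name in this thread does. The `key` value is a shortened, hypothetical stand-in, not the real key, and this is not huffduff-video's actual code.

```python
from urllib.parse import quote, unquote

# Hypothetical, shortened object key. Like the real one in this thread,
# the name itself already contains percent-escapes.
key = "smuggle=%7B%22service_url%22%7D.mp3"

# Escaping the key for use in a URL path escapes the existing "%" signs again,
# so "%7B" becomes "%257B" and "=" becomes "%3D".
url_path = quote(key, safe="")
print(url_path)  # smuggle%3D%257B%2522service_url%2522%257D.mp3

# One round of unescaping gives back the literal key stored in the bucket.
print(unquote(url_path) == key)  # True
```

So a doubly escaped URL is expected whenever the stored name contains literal `%` characters, and it is not necessarily a bug on its own.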

cmdln commented 8 years ago

Fantastic, thanks!

snarfed commented 8 years ago

actually, i was wrong, it wasn't the encoding. it's that the URL was cut off at two-, but the S3 file ends with the complete two-dates-in-java%2522%257D.mp3.

snarfed commented 8 years ago

huffduffer is truncating the URL we send it to 255 chars.

cc @adactio. this is probably rare, but still a surprise, and would break anyone else with long URLs too. maybe consider relaxing it on huffduffer's end?

otherwise, easy enough for us to do that truncating ahead of time, so we're not surprised and the S3 URL is preserved.
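A rough sketch of what "truncating ahead of time" could look like, assuming the limit applies to the full URL string sent to huffduffer. `shorten_key`, `BASE`, and `MAX_URL_LEN` are hypothetical names for illustration, not huffduff-video's actual code. The 255 figure is the limit reported in this thread.

```python
MAX_URL_LEN = 255  # URL length limit reported in this thread
BASE = "https://huffduff-video.s3-us-west-2.amazonaws.com/"


def shorten_key(key, ext=".mp3", base=BASE, max_len=MAX_URL_LEN):
    """Trim key so that base + key fits in max_len, keeping the extension."""
    budget = max_len - len(base)
    if len(key) <= budget:
        return key
    # Cut from the stem, not the end, so the file extension survives.
    stem = key[: -len(ext)] if ext and key.endswith(ext) else key
    return stem[: budget - len(ext)] + ext


long_key = "kaltura_" + "x" * 300 + ".mp3"
short_key = shorten_key(long_key)
print(len(BASE + short_key))  # 255
```

The idea is to upload the file under the shortened key, so the URL huffduffer stores and the S3 object name always match. One caveat: a plain slice can cut a percent-escape such as `%7B` in half, and two long names can collide after trimming, so a real fix might append a short hash of the original name.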

adactio commented 8 years ago

Yeah, the URL field is limited to 255 characters.

In retrospect, that was a bad decision. :-(

I've doubled it. Hope that helps.

snarfed commented 7 years ago

tentatively closing. thanks all!