ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense
129.85k stars 9.8k forks

Support PBS Kids #6515

Open asmyers opened 8 years ago

asmyers commented 8 years ago

Would it be possible to support pbskids.org shows? For instance, http://pbskids.org/catinthehat/video/index.html

josefwells commented 8 years ago

PBS Kids support would be great!

eremini commented 8 years ago

+1 for this one

mtucker502 commented 7 years ago

+1

gkoelln commented 6 years ago

I know it's been a while since this support request, but I've been working on this for myself and wanted some feedback. Do we want to implement this for all videos, including clips, or just full episodes? For example, if I do this for http://pbskids.org/wildkratts/videos/ there are three full-length episodes and over three hundred clips. That's a huge playlist!

asmyers commented 6 years ago

I would just want full episodes.

gkoelln commented 6 years ago

I tend to agree.

This is taking a bit longer than expected, since the shows' pages aren't consistent in how they present metadata. So far, I got it working with Wild Kratts, but I'll have to do some more testing with other sites.

syncretic commented 6 years ago

In Chrome, press F12, open the Network tab, filter for "m3u8", and play the video to find the m3u8 URL for that video. youtube-dl will then download the video as an MP4 from that m3u8 URL. To find the subtitles, filter for "vtt" and play the video. The .vtt file can be converted to .srt with Subtitle Edit, and you can then mux the .mp4 and .srt files together with MKVToolNix.

This is a manual workaround until proper support is added.
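
For anyone who would rather script the .vtt to .srt step, here is a minimal Python sketch. It is not how Subtitle Edit does it, just an approximation under a few assumptions: cues are separated by blank lines, timestamps use the usual `hh:mm:ss.ttt` or `mm:ss.ttt` form, and cue settings after the timestamps can be dropped. The function name `vtt_to_srt` is made up for illustration, and inline styling tags are left untouched.

```python
import re

# Matches a WebVTT timing line; the hours part is optional in WebVTT.
TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _srt_time(hours, minutes, seconds, millis):
    # SRT always carries hours and uses a comma before the milliseconds.
    return f"{int(hours or 0):02d}:{minutes}:{seconds},{millis}"


def vtt_to_srt(vtt_text):
    """Convert WebVTT text to SRT text (hypothetical helper, simplified)."""
    normalized = vtt_text.replace("\r\n", "\n").strip()
    blocks = re.split(r"\n\s*\n", normalized)
    cues = []
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.match(line)
            if match:
                start = _srt_time(*match.group(1, 2, 3, 4))
                end = _srt_time(*match.group(5, 6, 7, 8))
                # Everything after the timing line is the cue text.
                cues.append((start, end, "\n".join(lines[i + 1:])))
                break
        # Blocks without a timing line (WEBVTT header, NOTE, STYLE) are skipped.
    parts = [
        f"{number}\n{start} --> {end}\n{text}"
        for number, (start, end, text) in enumerate(cues, 1)
    ]
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    sample = "WEBVTT\n\n00:01.000 --> 00:04.000 align:start\nHello\n"
    print(vtt_to_srt(sample))
```

Read the downloaded .vtt file into a string, pass it to `vtt_to_srt`, and write the result to a .srt file before muxing.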

beren12 commented 5 years ago

Sounds good. My son loves Dinosaur Train, Curious George, Arthur, etc. Has anyone made progress on this?

elibarb commented 5 years ago

There's a DEEPLINK variable you can inspect to find what you're after.

window._PBS_KIDS_DEEPLINK = {"show_slug":"arthur","video_id":"1447843659","video_obj":{"air_date":"2010-10-29T08:00:00-04:00","closed_captions":[{"URI":"https:\/\/kids.video.cdn.pbs.org\/captions\/arthur\/84c20249-ea03-4e98-b83a-e60d501bdbf7\/captions\/566822_Encoded.sami","language":"en","format":"Caption-SAMI"},{"URI":"https:\/\/kids.video.cdn.pbs.org\/captions\/arthur\/84c20249-ea03-4e98-b83a-e60d501bdbf7\/captions\/566823_Encoded.dfxp","language":"en","format":"DFXP"},{"URI":"https:\/\/kids.video.cdn.pbs.org\/captions\/arthur\/84c20249-ea03-4e98-b83a-e60d501bdbf7\/captions\/566824_Encoded.srt","language":"en","format":"SRT"},{"URI":"https:\/\/kids.video.cdn.pbs.org\/captions\/arthur\/84c20249-ea03-4e98-b83a-e60d501bdbf7\/captions\/566825_Encoded.vtt","language":"en","format":"WebVTT"}],"duration":783,"expire_date":null,"id":"1447843659","mezzanine":"https:\/\/image.pbs.org\/video-assets\/pbs-kids\/arthur\/57327\/images\/Kids-Mezzannine-16x9_808.jpg","mp4":"https:\/\/urs.pbs.org\/redirect\/d5d073303cf44d54b955f9ea42fd06cc\/","title":"When Carl Met George","URI":"https:\/\/urs.pbs.org\/redirect\/46aba182f3d74f5b8fc88cc47c56067a\/","video_type":"Episode","description":"George is excited about spending time with his new friend, Carl, who seems to know all kind of cool facts about trains and about\u2026 well lots of things! Then George learns that Carl has Asperger's Syndrome - a form of autism that makes Carl see the world differently than most people. Can George and Carl remain good friends - and perhaps even learn from each other?","program_slug":"arthur","program_nola":"ARUR","program_title":"Arthur"}};

There's probably a better way, but I fetch the page URL with curl, grep for the DEEPLINK line, use awk to pick out the comma-separated field that holds the mp4 value, use awk again to pull the URL out of the quotes, and then use sed to strip the backslashes. Assuming every video page emits this blob with the same number of fields, I'll write a script to iterate over the URLs I've saved and build a list of the actual video URLs.

curl http://pbskids.org/video/arthur/1447843659 | grep DEEPLINK | awk -F "," '{print $20}'| awk -F "\"" '{print $4}'|sed "s/[\]//g"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  139k    0  139k    0     0  37959      0 --:--:--  0:00:03 --:--:-- 37954
https://urs.pbs.org/redirect/d5d073303cf44d54b955f9ea42fd06cc/
youtube-dl https://urs.pbs.org/redirect/d5d073303cf44d54b955f9ea42fd06cc/

Which redirects to http://kids.video.cdn.pbs.org/videos/arthur/84c20249-ea03-4e98-b83a-e60d501bdbf7/20465/hd-mezzanine-4x3/ARUR1306A_episode-4x3-mp4-2500k.mp4
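
Counting comma-separated fields breaks as soon as a page has a different number of caption entries, so parsing the blob as JSON may be more robust. Below is a rough Python sketch, assuming the page still embeds `window._PBS_KIDS_DEEPLINK = {...};` in the shape shown above. The helper names `parse_deeplink` and `summarize` are made up for illustration, and `SAMPLE_HTML` is a trimmed copy of the blob above rather than a live page.

```python
import json
import re

# Locates the assignment; the JSON object starts right after it.
MARKER = re.compile(r"window\._PBS_KIDS_DEEPLINK\s*=\s*")


def parse_deeplink(html):
    """Return the DEEPLINK object embedded in a page (hypothetical helper)."""
    match = MARKER.search(html)
    if match is None:
        raise ValueError("DEEPLINK variable not found")
    # raw_decode parses one JSON value and ignores the trailing ';' and markup.
    data, _end = json.JSONDecoder().raw_decode(html, match.end())
    return data


def summarize(deeplink):
    """Pick out the fields needed for a download (hypothetical helper)."""
    video = deeplink["video_obj"]
    captions = {c["format"]: c["URI"] for c in video.get("closed_captions", [])}
    return {
        "title": video.get("title"),
        "video_type": video.get("video_type"),
        "mp4": video.get("mp4"),
        "captions": captions,
    }


# Trimmed copy of the blob quoted above; not fetched from a live page.
SAMPLE_HTML = r'''<script>window._PBS_KIDS_DEEPLINK = {"show_slug":"arthur","video_id":"1447843659","video_obj":{"closed_captions":[{"URI":"https:\/\/kids.video.cdn.pbs.org\/captions\/arthur\/84c20249-ea03-4e98-b83a-e60d501bdbf7\/captions\/566824_Encoded.srt","language":"en","format":"SRT"}],"mp4":"https:\/\/urs.pbs.org\/redirect\/d5d073303cf44d54b955f9ea42fd06cc\/","title":"When Carl Met George","video_type":"Episode"}};</script>'''

if __name__ == "__main__":
    info = summarize(parse_deeplink(SAMPLE_HTML))
    # Keep full episodes only, as discussed earlier in the thread.
    if info["video_type"] == "Episode":
        print(info["title"], info["mp4"])
```

The JSON parser also unescapes the `\/` sequences, so no sed step is needed, and the `video_type` field gives a way to skip clips. The SRT caption URL is in the same blob, which would avoid converting from .vtt.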

ghost commented 4 years ago

+1 It's great to know that it's being worked on. PBS is more or less the only channel we watch on our OTA antenna.