ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

New extractor for sanmarinortv.sm #31265

Open itec78 opened 2 years ago

itec78 commented 2 years ago

Description

An extractor for RTV San Marino (https://www.sanmarinortv.sm/) is missing. I found that the video addresses start with https://d2hrvno5bw6tg2.cloudfront.net/s3-player/, which is probably a common service that other extractors already deal with.

Thanks a lot

dirkf commented 2 years ago

Looks like we can find the player elements by their class, wowzaplayer. There's one with id="livePlayerElement" that appears to be the live channel, and one with an id matching r'playerElementVideo\d+' that should be the page video. In the latter we find JS like var playerElementVideo89005=fluidPlayerCreate('playerElementVideo89005',{...}); and the media links are HLS manifests (one of them served from a SMIL file) in a list that is the value of the .sources member of the braced object:

[
    {
      'src': 'https://d2hrvno5bw6tg2.cloudfront.net/s3-player/_definst_/smil:catchup/00095439_SRV_DON_ANTONIO_MUSICA__14.smil/playlist.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9kMmhydm5vNWJ3NnRnMi5jbG91ZGZyb250Lm5ldC9zMy1wbGF5ZXIvX2RlZmluc3RfL3NtaWw6Y2F0Y2h1cC8wMDA5NTQzOV9TUlZfRE9OX0FOVE9OSU9fTVVTSUNBX18xNC5zbWlsL3BsYXlsaXN0Lm0zdTgiLCJDb25kaXRpb24iOnsiSXBBZGRyZXNzIjp7IkFXUzpTb3VyY2VJcCI6IjAuMC4wLjAvMCJ9LCJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTY2NDUzOTUwNDU2NH19fV19&Key-Pair-Id=APKAJXRZCR3BYC6OEKKA&Signature=o5EeC~6wJkP3eW9uVolfEDy7HzKYFVF4-1pGpPgtsNcNhnw7sebnrBX0RRexMZxpcRU0V4tCtV8YY-V9jcz7Z9J6FEqJKsymxJa4F2efY84bSXHt9btCbjalIj7AKgx38sukx3-472mxe7BgLpXvOGHndxeeexlMoHkTwNBbqlhKa7krn0rKRxC7CYPXWPml8335G9XRj9pw4phm6u6bnn0MCtmlvtYpxs3vLArhxJXXKSiYtwHSJ0FymbLQRawsHhzPl5E4i8WOf6PthXqYtnDQLHGAx6FzUaQYvgAt4zvT4mN-Bfs1avNuTl93MSwyO2LO5MSslA62lCdg3weZyg__',
      'type': 'application/x-mpegURL',
      'title': 'Auto'
    },
    {
      'src': 'https://d2hrvno5bw6tg2.cloudfront.net/s3-player/_definst_/mp4:catchup/00095439_SRV_DON_ANTONIO_MUSICA__14_H.mp4/playlist.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9kMmhydm5vNWJ3NnRnMi5jbG91ZGZyb250Lm5ldC9zMy1wbGF5ZXIvX2RlZmluc3RfL21wNDpjYXRjaHVwLzAwMDk1NDM5X1NSVl9ET05fQU5UT05JT19NVVNJQ0FfXzE0X0gubXA0L3BsYXlsaXN0Lm0zdTgiLCJDb25kaXRpb24iOnsiSXBBZGRyZXNzIjp7IkFXUzpTb3VyY2VJcCI6IjAuMC4wLjAvMCJ9LCJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTY2NDUzOTUwNDU2Nn19fV19&Key-Pair-Id=APKAJXRZCR3BYC6OEKKA&Signature=wzfZkTeZfLxIRDqeO2DTK57Gid6ybEPil9WVKMQtHYxQQyeVfWnDREueoZ-QslHdqRjU3LRaL617nH9WKkF8Uj93ac4XP7SnC7555YIF7YD5LAKLAmdy6fdz-vCO2EJVp0AWRMFR791oenV7-MnVNHWbwierKXunH2g1UxOcsuy41OWx~w~3G052yDcYIkGfzZ1mCVKUNa9bAZh9SjmIelHwsGJUwxA7AsA4dnwSOPyXDDmbjPNUjHiGeSVjWQRbWIg~gNKVycgqJcB32Li6dcDgaQUl9ihdvqZTxbf44Mg9EPB01QGrL-Iy0Dlg7XBHTxr0-i~D5MLWlM1EEjGQOQ__',
      'type': 'application/x-mpegURL',
      'title': 'High quality'
    },
    {
      'src': 'https://d2hrvno5bw6tg2.cloudfront.net/s3-player/_definst_/mp4:catchup/00095439_SRV_DON_ANTONIO_MUSICA__14_M.mp4/playlist.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9kMmhydm5vNWJ3NnRnMi5jbG91ZGZyb250Lm5ldC9zMy1wbGF5ZXIvX2RlZmluc3RfL21wNDpjYXRjaHVwLzAwMDk1NDM5X1NSVl9ET05fQU5UT05JT19NVVNJQ0FfXzE0X00ubXA0L3BsYXlsaXN0Lm0zdTgiLCJDb25kaXRpb24iOnsiSXBBZGRyZXNzIjp7IkFXUzpTb3VyY2VJcCI6IjAuMC4wLjAvMCJ9LCJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTY2NDUzOTUwNDU3MX19fV19&Key-Pair-Id=APKAJXRZCR3BYC6OEKKA&Signature=BMtmv-EGOoT4j9mvPY3EuGY2ULV8v6MON6I0S0txN51~dtZFfTbMKGdQ9Ru-Q9~NXr1~vhyEbNpApcho7XwJGj1DMwRGElVVoKBZphX2w1A0Jw6n3yzYiCpoI3EncxR2LPUXuyuxkeBkwVyagbPOJ-eZQKxGT0539hRKBJVZEz6dIlBIy0OKPPsd4bR0sk0XFekkkwulQ2JQHueYMj1WymnBAqYwKxNQ0YBVfAftQ6PhgCkxtWGDWPTZUuVrDIB3KCO2IY3q1~y5W1A1KDGYExRXMSDWwmMroByirydRdLvca-dlggK698c~t4K4pAtZC7ZvVsaVWGd7z6HmkGFM8w__',
      'type': 'application/x-mpegURL',
      'title': 'Medium quality'
    },
    {
      'src': 'https://d2hrvno5bw6tg2.cloudfront.net/s3-player/_definst_/mp4:catchup/00095439_SRV_DON_ANTONIO_MUSICA__14.mp4/playlist.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9kMmhydm5vNWJ3NnRnMi5jbG91ZGZyb250Lm5ldC9zMy1wbGF5ZXIvX2RlZmluc3RfL21wNDpjYXRjaHVwLzAwMDk1NDM5X1NSVl9ET05fQU5UT05JT19NVVNJQ0FfXzE0Lm1wNC9wbGF5bGlzdC5tM3U4IiwiQ29uZGl0aW9uIjp7IklwQWRkcmVzcyI6eyJBV1M6U291cmNlSXAiOiIwLjAuMC4wLzAifSwiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2NjQ1Mzk1MDQ1NzZ9fX1dfQ__&Key-Pair-Id=APKAJXRZCR3BYC6OEKKA&Signature=IuaW122C-r5ThWWNw-0Jj9QDlDimogrUCnplfpVBlYgkY9o0zF2TGSQIlO-I8V-3UOhpVzTNKeabTgj8nvtwibIha644sJuOghYLcHYqd9qHQMB3TA7N4TTZQXJWNN3K6AYxXP3JO7nxApeVMzSLitL4aAfp3GRsI8aG-2gHiuX4LDHhwjSd84k1trlVo4H1I-jEQZgP-L7PuY8jgyIkuwC3WcWzXT65vWvphjPIMbaE0DOcRsoOLrcCNQBJEJ5HEtgL0ev-rWO9HjP26NlCJHuiBsEstHitdXKKr6XzeZ1Wj9iq9u9~SfjLzPbBzFvK06EEUPjQ6lh6CfVFAOXTCw__',
      'type': 'application/x-mpegURL',
      'title': 'Low quality'
    }
]
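
To make the lookup described above concrete, here is a minimal standalone sketch, not the extractor's actual code. It assumes the page calls fluidPlayerCreate with a quoted player id followed by an object containing a sources key, and that the list itself is written with quoted keys and strings as in the dump above, so it can be read as a Python literal. The sample markup, the example.invalid URLs and the helper names (extract_sources, _matching_bracket) are made up for illustration.

```python
import ast
import re

# Matches the call that creates the page video player and captures its id.
PLAYER_RE = re.compile(
    r"fluidPlayerCreate\(\s*'(?P<id>playerElementVideo\d+)'\s*,\s*\{")
# Matches the start of the sources list (key quoting is an assumption).
SOURCES_RE = re.compile(r"""['"]?sources['"]?\s*:\s*\[""")


def _matching_bracket(text, start):
    """Return the index of the ']' that closes the '[' at text[start]."""
    depth = 0
    quote = None
    i = start
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == '\\':
                i += 1  # skip the escaped character
            elif ch == quote:
                quote = None
        elif ch in '\'"':
            quote = ch
        elif ch == '[':
            depth += 1
        elif ch == ']':
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError('unbalanced brackets')


def extract_sources(webpage):
    """Return (player_id, list of source dicts) found in the page text."""
    player = PLAYER_RE.search(webpage)
    if not player:
        return None, []
    sources = SOURCES_RE.search(webpage, player.end())
    if not sources:
        return player.group('id'), []
    start = sources.end() - 1  # index of the opening '['
    end = _matching_bracket(webpage, start)
    return player.group('id'), ast.literal_eval(webpage[start:end + 1])


# Hypothetical page fragment shaped like the one described above.
SAMPLE = """
<div class="wowzaplayer" id="playerElementVideo89005"></div>
<script>
var playerElementVideo89005=fluidPlayerCreate('playerElementVideo89005',{
  sources: [
    {'src': 'https://example.invalid/auto.m3u8', 'type': 'application/x-mpegURL', 'title': 'Auto'},
    {'src': 'https://example.invalid/low.m3u8', 'type': 'application/x-mpegURL', 'title': 'Low quality'}
  ]
});
</script>
"""

player_id, sources = extract_sources(SAMPLE)
print(player_id, [s['title'] for s in sources])
```

Inside youtube-dl itself the existing js_to_json and _parse_json helpers would probably be a better fit than ast.literal_eval, since they tolerate JS syntax that is not valid Python.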

There's an ld+json block with a NewsArticle, but the <meta> elements carry better metadata.
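
As a rough illustration of reading metadata from the <meta> elements, the sketch below collects them into a dict using only the standard library. Which meta tags the site actually provides is not listed in this thread, so og:title and description in the sample are assumptions, and MetaCollector and collect_meta are hypothetical names.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> tags keyed by their property or name attribute."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'meta':
            return
        attrs = dict(attrs)
        key = attrs.get('property') or attrs.get('name')
        if key and attrs.get('content') is not None:
            # Keep the first occurrence if a key is repeated.
            self.meta.setdefault(key, attrs['content'])


def collect_meta(webpage):
    parser = MetaCollector()
    parser.feed(webpage)
    return parser.meta


# Hypothetical <head> fragment; the real page's tags may differ.
SAMPLE = '''<head>
<meta property="og:title" content="Example title">
<meta name="description" content="Example description">
</head>'''

meta = collect_meta(SAMPLE)
print(meta.get('og:title'))
```

In an actual extractor, the existing _og_search_title and _html_search_meta helpers cover the same ground.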

dirkf commented 2 years ago

Promising:

$ python -m youtube_dl -v -F 'https://www.sanmarinortv.sm/news/cultura-c6/il-sacro-don-gramentieri-in-colorama-a230079'
[debug] System config: [u'--prefer-ffmpeg']
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'-F', u'https://www.sanmarinortv.sm/news/cultura-c6/il-sacro-don-gramentieri-in-colorama-a230079']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: f838f3805
[debug] Python version 2.7.17 (CPython) - Linux-4.4.0-210-generic-i686-with-Ubuntu-16.04-xenial
[debug] exe versions: avconv 4.3, avprobe 4.3, ffmpeg 4.3, ffprobe 4.3
[debug] Proxy map: {}
[SMRTV] a230079: Downloading webpage
[SMRTV] a230079: Downloading player sources
[SMRTV] a230079: Downloading m3u8 information
[SMRTV] a230079: Downloading m3u8 information
[SMRTV] a230079: Downloading m3u8 information
[SMRTV] a230079: Downloading m3u8 information
[info] Available formats for a230079:
format code          extension  resolution note
Auto-542             mp4        384x216     542k 
Low_quality-575      mp4        384x216     575k , avc1.100.13, mp4a.40.2
Auto-1092            mp4        640x360    1092k 
Medium_quality-1119  mp4        640x360    1119k , avc1.100.30, mp4a.40.2
Auto-1692            mp4        1272x720   1692k 
High_quality-1983    mp4        1920x1080  1983k , avc1.100.40, mp4a.40.2 (best)
$
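
The format codes in the listing look like each source's title with spaces replaced by underscores, followed by the variant bitrate. A tiny sketch that reproduces that naming is below; it is inferred from the output above only, and format_id_for is a hypothetical helper, not a youtube-dl function.

```python
def format_id_for(title, tbr_kbps):
    """Build an ID shaped like the ones in the listing, e.g. Low_quality-575."""
    return '%s-%d' % (title.replace(' ', '_'), tbr_kbps)


print(format_id_for('Low quality', 575))
print(format_id_for('Auto', 1092))
```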