ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

[vyborymos] Fix extraction #22337

Open nmr50 opened 5 years ago

nmr50 commented 5 years ago

Some other URLs that may be used along with the first URLs above:

https://msk-cache-4-1-h.cdn.vybory.mos.ru/aes128-key/26132077.key?sid=e252c26a-a563-11e8-812f-00259057913e&kid=short-token-1&exp=1567924828&dig=d0b62bbe94a7588714ed6eaed5cd459f

https://msk-cache-4-1-h.cdn.vybory.mos.ru/hls/e252c26a-a563-11e8-812f-00259057913e/1567924622.95-1567924637.94.ts?input=ege-production&kid=short-token-1&exp=1567924828&dig=dad56113ed8dfd4580462d1afd77f395

https://ls-pub.cdn.vybory.mos.ru/stat?station_id=28102&user_id=0&errorlevel=0&adapter_type=hlsjs&token=eyJhbGciOiJIUzI1NiJ9.eyJkYXRhIjoyODEwMn0.wb9eRieii7IFKxCbtqJlrd0sY8MG06uz24uXGhv6YG0&uuid=16ca91e0-d203-11e9-aa9c-c7f44acda033&mobile=0

R:>youtube-dl.exe -v -F https://msk-cache-4-1-h.cdn.vybory.mos.ru/master.m3u8?sid=e252c26a-a563-11e8-812f-00259057913e
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', '-F', 'https://msk-cache-4-1-h.cdn.vybory.mos.ru/master.m3u8?sid=e252c26a-a563-11e8-812f-00259057913e']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.09.01
[debug] Python version 3.4.4 (CPython) - Windows-7-6.1.7601-SP1
[debug] exe versions: ffmpeg 4.2, ffprobe 4.2
[debug] Proxy map: {}
[generic] master: Requesting header
WARNING: Could not send HEAD request to https://msk-cache-4-1-h.cdn.vybory.mos.ru/master.m3u8?sid=e252c26a-a563-11e8-812f-00259057913e: HTTP Error 400: Bad Request
[generic] master: Downloading webpage
ERROR: Unable to download webpage: HTTP Error 400: Bad Request (caused by HTTPError()); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpbzyg5d3a\build\youtube_dl\extractor\common.py", line 627, in _request_webpage
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpbzyg5d3a\build\youtube_dl\YoutubeDL.py", line 2229, in urlopen
  File "C:\Python\Python34\lib\urllib\request.py", line 470, in open
  File "C:\Python\Python34\lib\urllib\request.py", line 580, in http_response
  File "C:\Python\Python34\lib\urllib\request.py", line 508, in error
  File "C:\Python\Python34\lib\urllib\request.py", line 442, in _call_chain
  File "C:\Python\Python34\lib\urllib\request.py", line 588, in http_error_default

Neither the latest youtube-dl nor ffmpeg/ffprobe 4.2 can get a stream from those URLs. There might be some catch with IP checking or something like that. Please investigate and update youtube-dl so that it can fetch the video stream from the polling stations I need :)

--- Today, 2019.09.08, we have local elections here in Moscow, and webcams are installed at every polling station, so I would really like to save some of the streams for history :) Thanks in advance! -t

P.S. A friend of mine found that the code at https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/vyborymos.py can be used to grab the video stream from the URLs I posted. Please update vyborymos.py so that it is compatible with the current vybory.mos.ru site. Thanks in advance!

Forst commented 5 years ago

The current code in youtube_dl/extractor/vyborymos.py is obsolete and needs to be rewritten.


Below is the typical flow to acquire a working HLS manifest.

Step 0. Source URL

URL format: https://vybory.mos.ru/voting-stations/STATION_ID?channel=CAMERA_INDEX

Input: STATION_ID, CAMERA_INDEX

Example: https://vybory.mos.ru/voting-stations/25319?channel=0
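
A minimal sketch of how the source URL could be split into STATION_ID and CAMERA_INDEX, assuming every URL follows the format above. The function name parse_station_url is hypothetical (it is not part of youtube-dl), and treating a missing channel parameter as camera 0 is an assumption.

```python
from urllib.parse import urlparse, parse_qs


def parse_station_url(url):
    """Return (station_id, camera_index) from a voting-station page URL."""
    parts = urlparse(url)
    # The station ID is the last path component: /voting-stations/STATION_ID
    prefix, _, station_id = parts.path.rstrip("/").rpartition("/")
    if not prefix.endswith("/voting-stations") or not station_id.isdigit():
        raise ValueError("not a voting-station URL: %s" % url)
    # Assumption: a missing ?channel= parameter means camera index 0.
    channel = parse_qs(parts.query).get("channel", ["0"])[0]
    return int(station_id), int(channel)


station_id, camera_index = parse_station_url(
    "https://vybory.mos.ru/voting-stations/25319?channel=0")
print(station_id, camera_index)
```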

Step 1. Voting station list

curl "https://vybory.mos.ru/api/voting_stations.json"

Input: none

Output: REGION_ID (region_id of the station whose id is STATION_ID)

Example:

curl "https://vybory.mos.ru/api/voting_stations.json"
[
  {
    "region_id": 1,
    "voting_stations": [
      {
        "full_address": "Центральный АО, Шубинский переулок, дом 6, строение 1",
        "id": 25319,
        "rid": "B77K0001",
        "parent_rid": null,
        "kind": "УИК",
        "number": 1,
        "name": "",
        "people": 1750,
        "address": "Шубинский переулок, дом 6, строение 1",
        "region_id": 1,
        "region_name": "Центральный АО",
        "utc_offset": 180,
        "is_active": true,
        "is_standalone": false,
        "latitude": 55.746802,
        "longitude": 37.577631,
        "broadcast_state": 1,
        "broadcast_state_updated_at": "2019-09-08T04:00:49.711297",
        "federal_contract": false
      },
      "more voting stations here"
    ]
  },
  "more regions here"
]
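
The lookup of REGION_ID for a given station could be sketched as follows. It only assumes the JSON shape shown in the example above; find_region_id is a hypothetical helper name, and the sample data is a trimmed copy of that example rather than a real response.

```python
import json


def find_region_id(regions, station_id):
    """Return the region_id of the station with the given id, or None."""
    for region in regions:
        # Skip anything that is not a region object (defensive).
        if not isinstance(region, dict):
            continue
        for station in region.get("voting_stations", []):
            if isinstance(station, dict) and station.get("id") == station_id:
                # Prefer the station's own region_id, fall back to the parent's.
                return station.get("region_id", region.get("region_id"))
    return None


# Trimmed-down sample in the shape shown above.
sample = json.loads("""
[{"region_id": 1,
  "voting_stations": [{"id": 25319, "region_id": 1, "is_active": true}]}]
""")
print(find_region_id(sample, 25319))
```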

Step 2. Voting station data

curl "https://vybory.mos.ru/api/channels/REGION_ID/STATION_ID.json"

Input: REGION_ID, STATION_ID

Output: CAMERA_UUID (uuid), STREAM_HOST (an entry of streamers_hls)

Example:

curl "https://vybory.mos.ru/api/channels/1/25319.json"
[
  {
    "video_enabled": true,
    "utc_offset": 180,
    "session_id": "<removed for privacy>",
    "view": "H01",
    "voting_station_id": 25319,
    "streamers_hls": [
      "msk-cache-3-1-h.cdn.vybory.mos.ru",
      "msk-cache-3-2-h.cdn.vybory.mos.ru",
      "msk-cache-3-3-h.cdn.vybory.mos.ru",
      "msk-cache-3-4-h.cdn.vybory.mos.ru"
    ],
    "uuid": "d950691a-a563-11e8-812f-00259057913e",
    "camera_number": 1,
    "name": "Камера 1",
    "kind": "УИК",
    "region_id": 1,
    "vrid": "B77K0001A0001H01",
    "demand_token": "<removed for privacy>"
  },
  "more cameras"
]
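
A rough sketch of picking one camera out of this list and reading the values that the later steps call CAMERA_UUID and STREAM_HOST. It assumes that the channel query parameter indexes the list from zero, which the thread does not confirm; pick_camera is a hypothetical name, and the sample is a trimmed copy of the example above.

```python
def pick_camera(cameras, camera_index):
    """Return (camera_uuid, stream_hosts) for the camera at camera_index."""
    # Assumption: ?channel=N is a zero-based index into this list.
    camera = cameras[camera_index]
    if not camera.get("video_enabled"):
        raise ValueError("video is disabled for this camera")
    return camera["uuid"], list(camera.get("streamers_hls", []))


sample_cameras = [{
    "video_enabled": True,
    "uuid": "d950691a-a563-11e8-812f-00259057913e",
    "streamers_hls": [
        "msk-cache-3-1-h.cdn.vybory.mos.ru",
        "msk-cache-3-2-h.cdn.vybory.mos.ru",
    ],
}]
camera_uuid, stream_hosts = pick_camera(sample_cameras, 0)
print(camera_uuid, stream_hosts[0])
```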

Step 3. Tokenizer session token

curl -X "POST" "https://vybory.mos.ru/tokenizer/session"

Input: none

Output: TOKENIZER_TOKEN (data)

Example:

curl -X "POST" "https://vybory.mos.ru/tokenizer/session"
{
  "data": "phzRvCTW…"
}
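
In Python the same request could be prepared roughly like this, using only the standard library. The helper names are hypothetical, and whether the endpoint needs any extra headers beyond what curl sends by default is not known from this thread. Nothing is sent over the network unless fetch_tokenizer_token is called.

```python
import json
import urllib.request

TOKENIZER_SESSION_URL = "https://vybory.mos.ru/tokenizer/session"


def build_session_request():
    # POST without a request body, as in the curl call above.
    return urllib.request.Request(TOKENIZER_SESSION_URL, method="POST")


def extract_token(body):
    """Pull the token out of a {"data": "..."} response body (bytes or str)."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    return json.loads(body)["data"]


def fetch_tokenizer_token(opener=urllib.request.urlopen):
    # Performs the real network call; not executed by this snippet.
    with opener(build_session_request()) as response:
        return extract_token(response.read())


request = build_session_request()
print(request.get_method(), request.full_url)
```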

Step 4. Streaming session token

curl -X "POST" "https://vybory.mos.ru/tokenizer/tokens" \
     -H 'Authorization: TOKENIZER_TOKEN' \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{"channel_id": "CAMERA_UUID"}'

Input: TOKENIZER_TOKEN, CAMERA_UUID

Output: STREAM_TOKEN (data)

Note: STREAM_TOKEN expires, usually after 5 minutes, and then has to be acquired again.

Example:

curl -X "POST" "https://vybory.mos.ru/tokenizer/tokens" \
     -H 'Authorization: phzRvCTW…' \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{"channel_id": "d950691a-a563-11e8-812f-00259057913e"}'
{
  "data": "eyJhbGci…"
}
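
A sketch of the same request in Python, together with a small cache that re-acquires the token before it expires. The 300-second lifetime is taken from the "usually 5 minutes" note above, and the 30-second safety margin is an arbitrary assumption. build_token_request and StreamTokenCache are hypothetical names, and the token value in the demo is a placeholder, not a real token.

```python
import json
import time
import urllib.request

TOKENIZER_TOKENS_URL = "https://vybory.mos.ru/tokenizer/tokens"


def build_token_request(tokenizer_token, camera_uuid):
    """Prepare the POST that exchanges a tokenizer token for a stream token."""
    payload = json.dumps({"channel_id": camera_uuid}).encode("utf-8")
    return urllib.request.Request(
        TOKENIZER_TOKENS_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": tokenizer_token,
            "Content-Type": "application/json; charset=utf-8",
        },
    )


class StreamTokenCache:
    """Call fetch() again once the cached token is close to expiry."""

    def __init__(self, fetch, ttl=300.0, margin=30.0, clock=time.monotonic):
        self._fetch = fetch      # callable returning a fresh STREAM_TOKEN
        self._ttl = ttl          # assumed lifetime in seconds
        self._margin = margin    # refresh this many seconds early
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch()
            self._expires_at = now + self._ttl - self._margin
        return self._token


demo = build_token_request("TOKENIZER_TOKEN",
                           "d950691a-a563-11e8-812f-00259057913e")
print(demo.get_method(), demo.data.decode("utf-8"))
```

Passing the clock and the fetch function in as parameters keeps the cache independent of the network, so the refresh logic can be exercised without contacting the site.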

Step 5. HLS manifest

curl "https://STREAM_HOST/master.m3u8?sid=CAMERA_UUID" \
     -H 'Cookie: _t_CAMERA_UUID=STREAM_TOKEN'

Input: STREAM_HOST, CAMERA_UUID, STREAM_TOKEN

Note: alternatively, STREAM_TOKEN may be specified as session_id URL query parameter instead of in a cookie.

Example:

curl "https://msk-cache-5-2-h.cdn.vybory.mos.ru/master.m3u8?sid=d950691a-a563-11e8-812f-00259057913e" \
     -H 'Cookie: _t_d950691a-a563-11e8-812f-00259057913e=eyJhbGci…'
#EXTM3U
#EXT-X-VERSION:2
#EXT-X-ALLOW-CACHE:YES
#EXT-X-MEDIA-SEQUENCE:10190
#EXT-X-TARGETDURATION:15

#EXT-X-KEY:METHOD=AES-128,URI="/aes128-key/26132504.key?sid=d950691a-a563-11e8-812f-00259057913e&kid=short-token-1&exp=1567950661&dig=fe12…",IV=0x00000000000000000000000000000000

#EXTINF:15,
/hls/d950691a-a563-11e8-812f-00259057913e/1567950287.71-1567950302.71.ts?input=ege-production&kid=short-token-1&exp=1567950661&dig=2e68…

# and so on
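
The manifest request and the handling of the relative URIs in the playlist could look roughly like this in Python. The key and segment entries are host-relative, so they are resolved against the manifest URL with urljoin. The helper names are hypothetical, the sample playlist uses shortened made-up paths rather than real signed URLs, and whether key and segment requests also need the cookie is not established in this thread.

```python
import re
import urllib.request
from urllib.parse import urlencode, urljoin


def manifest_url(stream_host, camera_uuid):
    return "https://%s/master.m3u8?%s" % (
        stream_host, urlencode({"sid": camera_uuid}))


def build_manifest_request(stream_host, camera_uuid, stream_token):
    # The cookie name embeds the camera UUID: _t_CAMERA_UUID=STREAM_TOKEN
    return urllib.request.Request(
        manifest_url(stream_host, camera_uuid),
        headers={"Cookie": "_t_%s=%s" % (camera_uuid, stream_token)},
    )


def absolute_uris(manifest_text, base_url):
    """Return absolute key and segment URIs found in a playlist."""
    uris = []
    for line in manifest_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-KEY"):
            # The key location is given as a quoted URI="..." attribute.
            match = re.search(r'URI="([^"]+)"', line)
            if match:
                uris.append(urljoin(base_url, match.group(1)))
        elif line and not line.startswith("#"):
            # Any other non-comment line is a media segment URI.
            uris.append(urljoin(base_url, line))
    return uris


# Made-up sample with shortened paths, in the shape of the output above.
SAMPLE = """#EXTM3U
#EXT-X-KEY:METHOD=AES-128,URI="/aes128-key/1.key?sid=abc",IV=0x0
#EXTINF:15,
/hls/abc/1.0-16.0.ts?input=ege-production
"""
base = manifest_url("msk-cache-3-1-h.cdn.vybory.mos.ru", "abc")
for uri in absolute_uris(SAMPLE, base):
    print(uri)
```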