blackjack4494 / youtube-dlc

Command-line program to download various media from YouTube.com and other sites
https://blackjack4494.github.io/youtube-dlc/
The Unlicense

[Broken] UniPD's Kaltura doesn't work anymore #169

Closed ZinRicky closed 4 years ago

ZinRicky commented 4 years ago

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://mediaspace.unipd.it/media/1_7396ji5a', '--cookies', 'cookies.txt', '-F', '-v']
[debug] Loading archive file None
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252
[debug] youtube-dlc version 2020.09.30
[debug] Python version 3.8.2 (CPython) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 4.2
[debug] Proxy map: {}
[generic] 1_7396ji5a: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 1_7396ji5a: Downloading webpage
[generic] 1_7396ji5a: Extracting information
ERROR: Unsupported URL: https://mediaspace.unipd.it/media/1_7396ji5a
Traceback (most recent call last):
  File "c:\users\ricca\appdata\local\programs\python\python38\lib\site-packages\youtube_dlc\YoutubeDL.py", line 830, in extract_info
    ie_result = ie.extract(url)
  File "c:\users\ricca\appdata\local\programs\python\python38\lib\site-packages\youtube_dlc\extractor\common.py", line 532, in extract
    ie_result = self._real_extract(url)
  File "c:\users\ricca\appdata\local\programs\python\python38\lib\site-packages\youtube_dlc\extractor\generic.py", line 3382, in _real_extract
    raise UnsupportedError(url)
youtube_dlc.utils.UnsupportedError: Unsupported URL: https://mediaspace.unipd.it/media/1_7396ji5a

Description

The UniPD Kaltura instance was fine until this summer; now when I try to download any video, the program doesn't seem to recognise it's a Kaltura-type website. I opened an issue for youtube-dl too.

blackjack4494 commented 4 years ago

@ZinRicky the video you linked in the youtube-dl issue downloads fine for me:

PS D:\Workspace\garbage> python3 -m youtube_dlc "https://mediaspace.unipd.it/media/1_5nlzdr0p" -v
[debug] System config: []
[debug] Command-line args: ['https://mediaspace.unipd.it/media/1_5nlzdr0p', '-v']
[debug] Loading archive file None
[debug] Encodings: locale cp65001, fs utf-8, out utf-8, pref cp65001
[debug] youtube-dlc version 2020.09.30
[debug] Python version 3.6.5 (CPython) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 4.2.1, ffprobe 4.2.1
[debug] Proxy map: {}
[generic] 1_5nlzdr0p: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 1_5nlzdr0p: Downloading webpage
[generic] 1_5nlzdr0p: Extracting information
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://cdnapisec.kaltura.com/p/2203921/sp/220392100/playManifest/entryId/1_5nlzdr0p/flavorId/1_vvb3gujx/format/url/protocol/http/a.mp4'
[download] Destination: 28-9-1-a.mp4
[download] 100% of 43.05MiB in 00:02

However, the one you linked here seems to need a login. I am still able to extract the video manually, though,
since the URL contains the entry_id that Kaltura needs to extract the video: https://cdnapisec.kaltura.com/p/2203921/sp/220392100/playManifest/entryId/1_7396ji5a/
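
As a rough illustration of the manual route, the sketch below assembles a playManifest URL of the same shape from a partner id and an entry id. The function name is hypothetical, and the `sp` segment is assumed to be the partner id times 100; that matches the URLs in this thread but is not confirmed here as a general rule.

```python
def kaltura_manifest_url(partner_id, entry_id):
    """Build a playManifest URL prefix in the shape seen in this thread.

    Assumption: the `sp` path segment equals partner_id * 100, as in
    /p/2203921/sp/220392100/. This is inferred from the URLs above.
    """
    return (
        "https://cdnapisec.kaltura.com/p/{p}/sp/{sp}/playManifest/entryId/{e}/"
        .format(p=partner_id, sp=partner_id * 100, e=entry_id)
    )


# Entry id taken from the last path segment of the mediaspace URL.
print(kaltura_manifest_url(2203921, "1_7396ji5a"))
```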

You will need a dedicated extractor for this. If you want you can create your own.
All you need is a basic extractor that redirects to kaltura.

Have a look at this extractor https://github.com/blackjack4494/youtube-dlc/blob/3d6a47d35f79d0f1705ee1a64e855bcb82c6e6c3/youtube_dlc/extractor/tmz.py#L7-L28

Change the valid URL regex to be compatible with your site.
Then use kaltura:2203921:%s, because the partner id of your site is different, of course.
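
A minimal sketch of what such a redirecting extractor might look like, modelled on the linked tmz.py. The class name `UniPDMediaSpaceIE` and the URL regex are assumptions made for illustration; only the partner id 2203921 and the `kaltura:...:%s` form come from this thread. The fallback base class exists only so the snippet can be imported without youtube-dlc installed; the extractor itself needs the real `InfoExtractor`.

```python
import re

try:
    from youtube_dlc.extractor.common import InfoExtractor
except ImportError:
    # Placeholder so the sketch can be imported without youtube-dlc;
    # _real_extract only works with the real InfoExtractor.
    InfoExtractor = object


class UniPDMediaSpaceIE(InfoExtractor):  # hypothetical name
    # Assumed pattern: the entry id is the last path segment after /media/.
    _VALID_URL = r'https?://mediaspace\.unipd\.it/media/(?P<id>[^/?#]+)'

    def _real_extract(self, url):
        video_id = self._match_id(url)
        # Hand the entry id over to the existing Kaltura extractor.
        return self.url_result(
            'kaltura:2203921:%s' % video_id, 'Kaltura', video_id)


# The regex alone can be checked without youtube-dlc:
match = re.match(UniPDMediaSpaceIE._VALID_URL,
                 'https://mediaspace.unipd.it/media/1_7396ji5a')
print(match.group('id'))
```

To be picked up by the program, a class like this would presumably also have to be imported in youtube_dlc/extractor/extractors.py alongside the other extractors.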

If you think you cannot do it on your own let me know. Otherwise feel free to try it and open a PR :)

ZinRicky commented 4 years ago

Thanks for the answer: it is really on point. I'm afraid I may have neither the skills nor the time to implement the extractor in order to use it "normally". However, I managed to download the videos by feeding youtube-dlc the m3u8 file found in the Network tab when you press F12 in Firefox (I saw this "trick" in a generic tutorial). I think this is somehow related to your solution, but as I said, I may not be skilled enough.

blackjack4494 commented 4 years ago

@ZinRicky the easiest workaround for you is simply to use youtube-dlc kaltura:2203921:ID
where ID is the last part of the URL https://mediaspace.unipd.it/media/1_7396ji5a, so 1_7396ji5a in this case:
youtube-dlc kaltura:2203921:1_7396ji5a
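
For several videos, the same rewrite could be scripted. This is only a sketch: `to_kaltura_url` is a hypothetical helper, and it assumes every page URL ends in the entry id, as the two URLs in this thread do.

```python
from urllib.parse import urlparse

PARTNER_ID = 2203921  # partner id quoted in this thread for mediaspace.unipd.it


def to_kaltura_url(page_url, partner_id=PARTNER_ID):
    """Turn .../media/<ID> into kaltura:<partner_id>:<ID>."""
    path = urlparse(page_url).path.rstrip("/")
    entry_id = path.rsplit("/", 1)[-1]  # last path segment
    if not entry_id:
        raise ValueError("no entry id found in %r" % page_url)
    return "kaltura:%d:%s" % (partner_id, entry_id)


pages = [
    "https://mediaspace.unipd.it/media/1_7396ji5a",
    "https://mediaspace.unipd.it/media/1_5nlzdr0p",
]
for page in pages:
    # Print the command line to run for each page.
    print("youtube-dlc", to_kaltura_url(page))
```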

ZinRicky commented 4 years ago

Wow, that's really cool! It worked as intended. Should I close this issue now?

blackjack4494 commented 4 years ago

Well, I put the Future label on it. I dunno if it's really worth making a dedicated extractor for it. No clue whether it can be done without major effort, or just by tweaking the generic Kaltura extractor a bit.

Feel free to further discuss or ask questions even tho it's closed.