ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Skillshare #31427

Open slavakurilyak opened 1 year ago

slavakurilyak commented 1 year ago

Description

I would like a new extractor for Skillshare so that I can back up my classes (e.g. the videos).

dirkf commented 1 year ago

Same request: https://github.com/yt-dlp/yt-dlp/issues/5813

dirkf commented 1 year ago

The free Intro class page has useful data in its <meta> tags, which strangely are placed in the <body> rather than the <head>.

Metadata from the og: properties:

Video data from the twitter:player properties:

If the subscriber video pages have the same structure, passing cookies from a logged-in browser session with --cookies ... would give access through the same extraction. However, the subscriber video pages may be more complex.

In fact, the generic extractor should already handle the free videos, since the twitter:player:stream property is found. The extractor currently rejects the stream URL because it has no file extension, but since twitter:player:stream:content_type is provided, that should be enough to accept it as video. Something like this:

             # twitter:player:stream should be checked before twitter:player since
             # it is expected to contain a raw stream (see
             # https://dev.twitter.com/cards/types/player#On_twitter.com_via_desktop_browser)
-            found = filter_video(re.findall(
-                r'<meta (?:property|name)="twitter:player:stream" (?:content|value)="(.+?)"', webpage))
+            found = re.findall(
+                r'<meta (?:property|name)="twitter:player:stream" (?:content|value)="(.+?)"', webpage)
+            if found:
+                ext = mimetype2ext(get_first(re.findall(
+                    r'<meta (?:property|name)="twitter:player:stream:content_type" (?:content|value)="(.+?)"', webpage), []))
+                found = found[:1] if ext else filter_video(found)
         if not found:
             # We look for Open Graph info:
             # We have to match any number spaces between elements, some sites try to align them (eg.: statigr.am)
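
To make the intent of the patch easier to follow outside the extractor, here is a minimal standalone sketch of the same decision: take the twitter:player:stream URL and, when the page also declares a content type, derive the extension from that instead of from the URL. The names first_meta, pick_twitter_stream and MIME_TO_EXT are hypothetical, and the small lookup table is only a stand-in for youtube-dl's mimetype2ext; none of this is the generic extractor's actual code.

```python
import re

# Same pattern shape as in the patch above; %s is the meta property name.
META_RE = r'<meta (?:property|name)="%s" (?:content|value)="(.+?)"'

# Illustrative stand-in for mimetype2ext (assumption: only a few common types).
MIME_TO_EXT = {
    'video/mp4': 'mp4',
    'video/webm': 'webm',
    'application/x-mpegurl': 'm3u8',
    'application/vnd.apple.mpegurl': 'm3u8',
}


def first_meta(name, webpage):
    """Return the first value of the named <meta> tag, or None if absent."""
    m = re.search(META_RE % re.escape(name), webpage)
    return m.group(1) if m else None


def pick_twitter_stream(webpage):
    """Return (url, ext) for twitter:player:stream, or None if not present.

    ext comes from twitter:player:stream:content_type and is None when the
    page declares no content type or the type is not in MIME_TO_EXT.
    """
    url = first_meta('twitter:player:stream', webpage)
    if not url:
        return None
    content_type = first_meta('twitter:player:stream:content_type', webpage)
    ext = None
    if content_type:
        # Drop parameters such as "; codecs=..." before the lookup.
        mime = content_type.split(';')[0].strip().lower()
        ext = MIME_TO_EXT.get(mime)
    return url, ext
```

When ext comes back as None, a caller would fall back to the existing extension-based filter_video check, which is what the patch does.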

slavakurilyak commented 1 year ago

Any updates on this extractor?

slavakurilyak commented 1 year ago

Related to https://github.com/ytdl-org/youtube-dl/issues/9769