ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Include playlist description and view count when using youtube-dl with `--flat-playlist -J` on a YouTube playlist #30150

Open nose-gnome opened 2 years ago

nose-gnome commented 2 years ago

Description

I have been trying to find a quick way to get information and video URLs/titles from a YouTube playlist. So far, the closest I have found is youtube-dl <playlist URL> -J --flat-playlist, which returns a list of all videos as well as the playlist's uploader and name.
However, it would be extremely useful if it also returned more information about the playlist, such as its description and view count.
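
For context, here is a minimal sketch of how that JSON might be consumed from Python. The helper names `fetch_playlist_json` and `summarize_playlist` and the sample data are hypothetical. The keys read (`title`, `uploader`, `description`, `view_count`, `entries`) are the ones that appear in the `-J` outputs quoted in this thread, and any field the extractor does not supply simply comes back as `None`.

```python
import json
import subprocess


def fetch_playlist_json(playlist_url):
    """Run youtube-dl and return its single-JSON playlist dump as a dict."""
    out = subprocess.check_output(
        ['youtube-dl', '--flat-playlist', '-J', playlist_url])
    return json.loads(out.decode('utf-8'))


def summarize_playlist(info):
    """Pick out playlist-level fields; missing ones come back as None."""
    return {
        'title': info.get('title'),
        'uploader': info.get('uploader'),
        'description': info.get('description'),
        'view_count': info.get('view_count'),
        'videos': [(e.get('id'), e.get('title'))
                   for e in info.get('entries') or []],
    }


# Hypothetical sample shaped like the -J output, so no network is needed.
sample = json.loads('''{
  "_type": "playlist",
  "id": "PLxxxxxxxx",
  "title": "Example playlist",
  "uploader": "Example uploader",
  "entries": [
    {"_type": "url", "id": "abc123", "title": "First video"},
    {"_type": "url", "id": "def456", "title": "Second video"}
  ]
}''')
summary = summarize_playlist(sample)
print(summary['title'], len(summary['videos']), summary['description'])
```

With a real playlist, `summarize_playlist(fetch_playlist_json(url))` would be the equivalent call; `description` and `view_count` stay `None` unless the extractor emits them.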

dirkf commented 2 years ago

If this information is extracted, it's included:

$ youtube-dl --flat-playlist -J 'https://www.youtube.com/c/3blue1brown/playlists?view=50&sort=dd&shelf_id=3'

gives

{
  'webpage_url_basename': 'playlists',
  'extractor': 'youtube:tab',
  'entries': [
    {
      '_type': 'url',
      'id': 'PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab',
      'title': 'Essence of linear algebra',
      'ie_key': 'YoutubeTab',
      'url': 'https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab'
    },
    {
      '_type': 'url',
      'id': 'PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr',
      'title': 'Essence of calculus',
      'ie_key': 'YoutubeTab',
      'url': 'https://www.youtube.com/playlist?list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr'
    },
    {
      '_type': 'url',
      'id': 'PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi',
      'title': 'Neural networks',
      'ie_key': 'YoutubeTab',
      'url': 'https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi'
    },
    {
      '_type': 'url',
      'id': 'PLZHQObOWTQDNPOjrT6KVlfJuKtYTftqH6',
      'title': 'Differential equations',
      'ie_key': 'YoutubeTab',
      'url': 'https://www.youtube.com/playlist?list=PLZHQObOWTQDNPOjrT6KVlfJuKtYTftqH6'
    },
    {
      '_type': 'url',
      'id': 'PLZHQObOWTQDP5CVelJJ1bNDouqrAhVPev',
      'title': 'Lockdown math',
      'ie_key': 'YoutubeTab',
      'url': 'https://www.youtube.com/playlist?list=PLZHQObOWTQDP5CVelJJ1bNDouqrAhVPev'
    }
  ],
  'title': '3Blue1Brown - Playlists',
  'description': '3Blue1Brown, by Grant Sanderson, is some combination of math and entertainment, depending on your disposition. The goal is for explanations to be driven by animations and for difficult problems to be made simple with changes in perspective.\n\nFor more information, other projects, FAQs, and inquiries see the website: https://www.3blue1brown.com',
  '_type': 'playlist',
  'id': 'UCYO_jab_esuFRV4b17AJtAw',
  'extractor_key': 'YoutubeTab',
  'webpage_url': 'https://www.youtube.com/c/3blue1brown/playlists?view=50&sort=dd&shelf_id=3'
}
nose-gnome commented 2 years ago
'https://www.youtube.com/c/3blue1brown/playlists?view=50&sort=dd&shelf_id=3'

Hello, thank you for your response. The command you showed returns the channel description and a list of all the public playlists the channel has made; I'm asking for a way to obtain the description and view count of a playlist itself. That does not appear to be possible with the command you showed; please correct me if I am wrong.

dirkf commented 2 years ago

It seems that description is extracted for tab-type playlists (as in the example posted above) but not for playlist-type playlists.

This patch corrects that:

--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -2785,6 +2785,7 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
                 data, lambda x: x['metadata']['playlistMetadataRenderer'], dict)
             if renderer:
                 title = renderer.get('title')
+                description = renderer.get('description')
             else:
                 renderer = try_get(
                     data, lambda x: x['header']['hashtagHeaderRenderer'], dict)

Then:

youtube-dl --flat-playlist -J PLBB231211A4F62143

->

{
  'extractor': 'youtube:tab',
  '_type': 'playlist',
  'description': 'My outdated Let\'s Play series of Team Fortress 2. Showcasing each class on basic strategies and weapons (as of May 2010).',
  'title': '[OLD]Team Fortress 2 (Class-based LP)',
  'extractor_key': 'YoutubeTab',
  'uploader_id': 'UCKSpbfbl5kRQpTdL7kMc-1Q',
  'uploader_url': 'https://www.youtube.com/user/Wickydoo',
  'webpage_url': 'https://www.youtube.com/playlist?list=PLBB231211A4F62143',
  'uploader': 'Wickman',
  'entries': [
    {
      'duration': 633,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Intro',
      'url': 'RlJy2Kt8Be0',
      'view_count': null,
      'id': 'RlJy2Kt8Be0'
    },
    {
      'duration': 622,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Pyro (1/3)',
      'url': 'wRGMlG96Gwg',
      'view_count': null,
      'id': 'wRGMlG96Gwg'
    },
    {
      'duration': 621,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Pyro (2/3)',
      'url': 'vgYFsA2_r0o',
      'view_count': null,
      'id': 'vgYFsA2_r0o'
    },
    {
      'duration': 605,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Pyro (3/3)',
      'url': 'TTqLbKsQcys',
      'view_count': null,
      'id': 'TTqLbKsQcys'
    },
    {
      'duration': 610,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Medic (1/4)',
      'url': 'WQru9GDoFFA',
      'view_count': null,
      'id': 'WQru9GDoFFA'
    },
    {
      'duration': 617,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Medic (2/4)',
      'url': 'RM3_zn0It0M',
      'view_count': null,
      'id': 'RM3_zn0It0M'
    },
    {
      'duration': 597,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Medic (3/4)',
      'url': 'SvtxuD_Abf0',
      'view_count': null,
      'id': 'SvtxuD_Abf0'
    },
    {
      'duration': 578,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Medic (4/4)',
      'url': '6emlZFA1R80',
      'view_count': null,
      'id': '6emlZFA1R80'
    },
    {
      'duration': 635,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Soldier (1/3)',
      'url': 'daLpm5oxAs8',
      'view_count': null,
      'id': 'daLpm5oxAs8'
    },
    {
      'duration': 651,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Soldier (2/3)',
      'url': 'neZXUahkncE',
      'view_count': null,
      'id': 'neZXUahkncE'
    },
    {
      'duration': 650,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Soldier (3/3)',
      'url': '8qNRZZ2JBG8',
      'view_count': null,
      'id': '8qNRZZ2JBG8'
    },
    {
      'duration': 642,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Engineer (1/3)',
      'url': 'b7dDl0Ysh2s',
      'view_count': null,
      'id': 'b7dDl0Ysh2s'
    },
    {
      'duration': 624,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Engineer (2/3)',
      'url': 'dHccwlfV3DE',
      'view_count': null,
      'id': 'dHccwlfV3DE'
    },
    {
      'duration': 579,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Engineer (3/3)',
      'url': 'jSL9bBggtXw',
      'view_count': null,
      'id': 'jSL9bBggtXw'
    },
    {
      'duration': 659,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Demoman (1/3)',
      'url': 'eQ_d0kpHgUw',
      'view_count': null,
      'id': 'eQ_d0kpHgUw'
    },
    {
      'duration': 660,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Demoman (2/3)',
      'url': 'Zwy01TAIA0A',
      'view_count': null,
      'id': 'Zwy01TAIA0A'
    },
    {
      'duration': 649,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Demoman (3/3)',
      'url': 'S6zLi4t26mw',
      'view_count': null,
      'id': 'S6zLi4t26mw'
    },
    {
      'duration': 659,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Heavy (1/3)',
      'url': 'mrcsNLQZimg',
      'view_count': null,
      'id': 'mrcsNLQZimg'
    },
    {
      'duration': 658,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Heavy (2/3)',
      'url': 'lCMwV5zjDS8',
      'view_count': null,
      'id': 'lCMwV5zjDS8'
    },
    {
      'duration': 650,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Heavy (3/3)',
      'url': '9eyU49ZUHR4',
      'view_count': null,
      'id': '9eyU49ZUHR4'
    },
    {
      'duration': 659,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Scout (1/3)',
      'url': 'ctT8F3YG1oU',
      'view_count': null,
      'id': 'ctT8F3YG1oU'
    },
    {
      'duration': 658,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Scout (2/3)',
      'url': 'UUo13m2Atd8',
      'view_count': null,
      'id': 'UUo13m2Atd8'
    },
    {
      'duration': 656,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Scout (3/3)',
      'url': 'yMbcpqfCpwU',
      'view_count': null,
      'id': 'yMbcpqfCpwU'
    },
    {
      'duration': 648,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Sniper (1/3)',
      'url': 'KMp6i-StaoE',
      'view_count': null,
      'id': 'KMp6i-StaoE'
    },
    {
      'duration': 636,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Sniper (2/3)',
      'url': 'xwGCf6Dz5Es',
      'view_count': null,
      'id': 'xwGCf6Dz5Es'
    },
    {
      'duration': 567,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Sniper (3/3)',
      'url': '3RTxQL9uAV4',
      'view_count': null,
      'id': '3RTxQL9uAV4'
    },
    {
      'duration': 635,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Spy (1/3)',
      'url': 'HHSDAhfrbcY',
      'view_count': null,
      'id': 'HHSDAhfrbcY'
    },
    {
      'duration': 623,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Spy (2/3)',
      'url': 'ErkGiC2F-4o',
      'view_count': null,
      'id': 'ErkGiC2F-4o'
    },
    {
      'duration': 659,
      '_type': 'url',
      'ie_key': 'Youtube',
      'description': null,
      'uploader': 'Wickman',
      'title': '[OLD]Let\'s Play Team Fortress 2 - Spy (3/3)',
      'url': 'q1pd8A-8lqs',
      'view_count': null,
      'id': 'q1pd8A-8lqs'
    }
  ],
  'id': 'PLBB231211A4F62143',
  'webpage_url_basename': 'playlist'
}

The metadata that is being processed doesn't appear to contain any other possibly interesting fields, such as view count.

nose-gnome commented 2 years ago

Thank you for your response showing the solution for the missing playlist description.

On the matter of obtaining extra metadata like view_count, I found that they are stored inside:

 {
     'metadata': {...}, // Where the Title & Description are currently being extracted from.

     'sidebar': {
         'playlistSidebarRenderer': {
             'items': [
                 0: {
                     // As well as containing stats, items[0] also repeats information like Title, Description & thumbnail
                     'playlistSidebarPrimaryInfoRenderer': {
                         'stats': [
                             0: {...}, // 0 contains the number of videos in the playlist
                             1: {...}, // 1 contains how many views the playlist has
                             2: {...}  // 2 contains the date the playlist was last updated
                         ]
                     }
                 }
             ]
         }
     }
 }
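
A standalone sketch of walking that structure, without youtube-dl's own helpers. It assumes the stats sit under `playlistSidebarPrimaryInfoRenderer` inside `items[0]`, that the view count arrives as text such as `125,486 views` under `simpleText`, and that the date is the second `runs` entry. `safe_get`, `parse_playlist_stats` and the sample payload are hypothetical, and the date is returned as raw text rather than parsed.

```python
import re


def safe_get(src, getter, expected_type=None):
    """Return getter(src), or None if a key/index along the way is missing."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


def parse_playlist_stats(data):
    """Extract (view_count, last_updated_text) from a playlist page payload."""
    first_item = safe_get(
        data, lambda x: x['sidebar']['playlistSidebarRenderer']['items'][0],
        dict)
    stats = safe_get(
        first_item,
        lambda x: x['playlistSidebarPrimaryInfoRenderer']['stats'], list)
    # stats[1] is assumed to hold text like "125,486 views".
    view_text = safe_get(stats, lambda x: x[1]['simpleText'], str) or ''
    match = re.match(r'\d[\d,]*', view_text.strip())
    view_count = int(match.group(0).replace(',', '')) if match else None
    # stats[2] is assumed to hold runs like ["Last updated on ", "<date>"].
    last_updated_text = safe_get(
        stats, lambda x: x[2]['runs'][1]['text'], str)
    return view_count, last_updated_text


# Hypothetical payload mirroring the nesting described above.
sample = {'sidebar': {'playlistSidebarRenderer': {'items': [
    {'playlistSidebarPrimaryInfoRenderer': {'stats': [
        {'runs': [{'text': '7'}, {'text': ' videos'}]},
        {'simpleText': '125,486 views'},
        {'runs': [{'text': 'Last updated on '}, {'text': 'Nov 21, 2019'}]},
    ]}},
]}}}
result = parse_playlist_stats(sample)
empty_result = parse_playlist_stats({})
print(result, empty_result)
```

Every lookup goes through `safe_get`, so a page that lacks the sidebar yields `(None, None)` instead of raising.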

So, by making the following changes, I found how to include the view_count and the last_updated date in the output:

--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -968,7 +968,8 @@ class InfoExtractor(object):
             urls, playlist_id=playlist_id, playlist_title=playlist_title)

     @staticmethod
-    def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None):
+    def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None,
+                        playlist_view_count=None, playlist_last_update=None):
         """Returns a playlist"""
         video_info = {'_type': 'playlist',
                       'entries': entries}
@@ -978,6 +979,10 @@ class InfoExtractor(object):
             video_info['title'] = playlist_title
         if playlist_description:
             video_info['description'] = playlist_description
+        if playlist_view_count:
+            video_info['view_count'] = playlist_view_count
+        if playlist_last_update:
+            video_info['last_updated'] = playlist_last_update
         return video_info

     def _search_regex(self, pattern, string, name, default=NO_DEFAULT, fatal=True, flags=0, group=None):
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -2785,6 +2785,18 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
                 data, lambda x: x['metadata']['playlistMetadataRenderer'], dict)
             if renderer:
                 title = renderer.get('title')
+                description = renderer.get('description')
+
+                stats = try_get(
+                    data, lambda x: x['sidebar']['playlistSidebarRenderer']['items'][0]['playlistSidebarPrimaryInfoRenderer']['stats'])
+                view_count_text = try_get(
+                    stats, lambda x: x[1]['simpleText'], compat_str) or ''
+                view_count = str_to_int(self._search_regex(
+                    r'^([\d,]+)', re.sub(r'\s', '', view_count_text),
+                    'view count', default=None))
+
+                last_updated_text = try_get(stats, lambda x: x[2]['runs'][1]['text'])
+                last_updated = unified_strdate(last_updated_text)
             else:
                 renderer = try_get(
                     data, lambda x: x['header']['hashtagHeaderRenderer'], dict)
@@ -2793,7 +2805,9 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
         playlist = self.playlist_result(
             self._entries(selected_tab, item_id, webpage),
             playlist_id=playlist_id, playlist_title=title,
-            playlist_description=description)
+            playlist_description=description,
+            playlist_view_count=view_count,
+            playlist_last_update=last_updated)
         playlist.update(self._extract_uploader(data))
         return playlist

Then:

youtube-dl --flat-playlist -J PLZHQObOWTQDMXMi3bUMThGdYqos36X_lA

->

{
  "_type": "playlist",
  "entries": [
    {
      "_type": "url",
      "ie_key": "Youtube",
      "id": "GNcFjFmqEc8",
      "url": "GNcFjFmqEc8",
      "title": "But why is a sphere's surface area four times its shadow?",
      "description": null,
      "duration": 1021,
      "view_count": null,
      "uploader": "3Blue1Brown"
    },
    {
      "_type": "url",
      "ie_key": "Youtube",
      "id": "OkmNXy7er84",
      "url": "OkmNXy7er84",
      "title": "The hardest problem on the hardest test",
      "description": null,
      "duration": 675,
      "view_count": null,
      "uploader": "3Blue1Brown"
    },
    {
      "_type": "url",
      "ie_key": "Youtube",
      "id": "AmgkSdhK4K8",
      "url": "AmgkSdhK4K8",
      "title": "Who cares about topology?   (Inscribed rectangle problem)",
      "description": null,
      "duration": 1096,
      "view_count": null,
      "uploader": "3Blue1Brown"
    },
    {
      "_type": "url",
      "ie_key": "Youtube",
      "id": "pQa_tWZmlGs",
      "url": "pQa_tWZmlGs",
      "title": "Why slicing a cone gives an ellipse",
      "description": null,
      "duration": 772,
      "view_count": null,
      "uploader": "3Blue1Brown"
    },
    {
      "_type": "url",
      "ie_key": "Youtube",
      "id": "yuVqxCSsE7c",
      "url": "yuVqxCSsE7c",
      "title": "Sneaky Topology | The Borsuk-Ulam theorem and stolen necklaces",
      "description": null,
      "duration": 1190,
      "view_count": null,
      "uploader": "3Blue1Brown"
    },
    {
      "_type": "url",
      "ie_key": "Youtube",
      "id": "zwAD6dRSVyI",
      "url": "zwAD6dRSVyI",
      "title": "Thinking outside the 10-dimensional box",
      "description": null,
      "duration": 1627,
      "view_count": null,
      "uploader": "3Blue1Brown"
    },
    {
      "_type": "url",
      "ie_key": "Youtube",
      "id": "K8P8uFahAgc",
      "url": "K8P8uFahAgc",
      "title": "Circle Division Solution",
      "description": null,
      "duration": 533,
      "view_count": null,
      "uploader": "3Blue1Brown"
    }
  ],
  "id": "PLZHQObOWTQDMXMi3bUMThGdYqos36X_lA",
  "title": "Geometry",
  "view_count": 125486,
  "last_updated": "20191121",
  "uploader": "3Blue1Brown",
  "uploader_id": "UCYO_jab_esuFRV4b17AJtAw",
  "uploader_url": "https://www.youtube.com/c/3blue1brown",
  "extractor": "youtube:tab",
  "webpage_url": "https://www.youtube.com/playlist?list=PLZHQObOWTQDMXMi3bUMThGdYqos36X_lA",
  "webpage_url_basename": "playlist",
  "extractor_key": "YoutubeTab"
}
nose-gnome commented 2 years ago

I have made pull request #30161 containing both the change you showed and the changes I made.