tombulled / python-youtube-music

Python 3 YouTube Music Web API Client
GNU General Public License v3.0

Unknown AssertionError by album for certain YTM links #13

Closed (fmigneault closed this issue 2 years ago)

fmigneault commented 3 years ago

I am using this package's YouTubeMusicDL class in my project https://github.com/fmigneault/aiu, which adds extended features on top of it, to download the files of an album from a YTM link.

When I use the link https://music.youtube.com/playlist?list=OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg, I get the following error traceback:

$ aiu --link "https://music.youtube.com/watch?v=uHSl0Zpw2pw&list=OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg"
[2021-09-15 02:11:08,308] ERROR      [MainThread][aiu] ParserError('album() encountered an error: Unknown',)
Traceback (most recent call last):
  File "d:\programs\development\audio info updater\aiu\utils.py", line 23, in log_exc
    return function(*args, **kwargs)
  File "d:\programs\development\audio info updater\aiu\main.py", line 341, in main
    meta_file, meta_json = get_metadata(link) if no_fetch else fetch_files(link, output_dir, show_progress=show)
  File "d:\programs\development\audio info updater\aiu\youtube.py", line 228, in fetch_files
    meta = api.download_album(ref_id, output_dir)  # pre-applied ID3 tags
  File "C:\Anaconda\envs\aiu\lib\site-packages\ytm\apis\YouTubeMusicDL\YouTubeMusicDL.py", line 262, in download_album
    album = self._api.album(album_id)
  File "C:\Anaconda\envs\aiu\lib\site-packages\ytm\decorators\_enforce.py", line 132, in wrapper
    resp = func(*args, **kwargs)
  File "C:\Anaconda\envs\aiu\lib\site-packages\ytm\decorators\enforce_return_value.py", line 52, in wrapper
    return func(*args, **kwargs)
  File "C:\Anaconda\envs\aiu\lib\site-packages\ytm\decorators\parse.py", line 53, in wrapper
    return parser(func(*args, **kwargs))
  File "C:\Anaconda\envs\aiu\lib\site-packages\ytm\decorators\_enforce.py", line 132, in wrapper
    resp = func(*args, **kwargs)
  File "C:\Anaconda\envs\aiu\lib\site-packages\ytm\decorators\enforce_parameters.py", line 52, in wrapper
    return func(*args, **kwargs)
  File "C:\Anaconda\envs\aiu\lib\site-packages\ytm\parsers\decorators\catch.py", line 59, in wrapper
    message = error_message,
ParserError: album() encountered an error: Unknown
(aiu)

My project simply forwards the link to YouTubeMusicDL in this case. The only step is calling YouTubeMusicDL.download_album directly with the album ID extracted from the link. I have managed to capture the result of the func(*args, **kwargs) step, as presented below. For some reason, album() raises an AssertionError whose cause I cannot identify.

func(*args, **kwargs) = {dict} <class 'dict'>: {'responseContext': {'serviceTrackingParams': [{'service': 'GFEEDBACK', 'params': [{'key': 'has_unlimited_entitlement', 'value': 'False'}, {'key': 'browse_id', 'value': 'MPREb_DqesLmS2l0r'}, {'key': 'logged_in', 'value': '0'}, {'key': 'e', 
 'responseContext' (2399200229680) = {dict} <class 'dict'>: {'serviceTrackingParams': [{'service': 'GFEEDBACK', 'params': [{'key': 'has_unlimited_entitlement', 'value': 'False'}, {'key': 'browse_id', 'value': 'MPREb_DqesLmS2l0r'}, {'key': 'logged_in', 'value': '0'}, {'key': 'e', 'value': '24002022,23996830,24056274,24096481,23884386,23966208,24097494,24080738,23983296,24091075,23946420,23804281,24087532,24002025,24087223,24036948,24088877,24037794,24090482,1714247,24002922,24077266,24087269,9407155,23944779,24085757,23857949,24001373,24028143,23968386,24084224,23975058,24077127,23998056,24050503,24049820,24097671,24004644,24085797,24007246,23918597,23882503,24085811,24082661,23744176,24101685,23934970,24007790,24076879'}]}, {'service': 'CSI', 'params': [{'key': 'c', 'value': 'WEB_REMIX'}, {'key': 'cver', 'value': '1.20210906.00.00'}, {'key': 'yt_li', 'value': '0'}, {'key': 'GetBrowseAlbumDetailPage_rid', 'value': '0xd6f9f6c9913a3d32'}]}, {'service': 'ECATCHER', 'params': [{'key': 'client.version', 'value': '1.20000101'}, {'key':...
 'trackingParams' (2399200231216) = {str} 'CAAQhGciEwjUmeTqqYDzAhWT8GAKHabrCvw='
 'contents' (2399200231280) = {dict} <class 'dict'>: {'singleColumnBrowseResultsRenderer': {'tabs': [{'tabRenderer': {'content': {'sectionListRenderer': {'contents': [{'musicShelfRenderer': {'contents': [{'musicResponsiveListItemRenderer': {'trackingParams': 'CJ4BEMn0AhgAIhMI1Jnk6qmA8wIVk_BgCh2m6wr8', 'overlay': {'musicItemThumbnailOverlayRenderer': {'background': {'verticalGradient': {'gradientLayerColors': ['3422552064', '3422552064']}}, 'content': {'musicPlayButtonRenderer': {'playNavigationEndpoint': {'clickTrackingParams': 'CKwBEMjeAiITCNSZ5OqpgPMCFZPwYAodpusK_A==', 'watchEndpoint': {'videoId': 'uHSl0Zpw2pw', 'playlistId': 'OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg', 'loggingContext': {'vssLoggingContext': {'serializedContextData': 'GilPTEFLNXV5X21CTDM4T0lsMXBTbkxtb1BYUzhKdU1Ua3U5b2pMYzNZZw%3D%3D'}}, 'watchEndpointMusicSupportedConfigs': {'watchEndpointMusicConfig': {'musicVideoType': 'MUSIC_VIDEO_TYPE_OMV'}}}}, 'trackingParams': 'CKwBEMjeAiITCNSZ5OqpgPMCFZPwYAodpusK_A==', 'playIcon': {'iconType': 'PLAY_ARROW'}, 'pau...
 'header' (2399200622384) = {dict} <class 'dict'>: {'musicDetailHeaderRenderer': {'title': {'runs': [{'text': 'Lost in the Waves'}]}, 'subtitle': {'runs': [{'text': 'Album'}, {'text': ' • '}, {'text': 'LANDMVRKS', 'navigationEndpoint': {'clickTrackingParams': 'CAEQ99wCIhMI1Jnk6qmA8wIVk_BgCh2m6wr8', 'browseEndpoint': {'browseId': 'UCCSRXf2yg94pw0K4NPxkZ5w', 'browseEndpointContextSupportedConfigs': {'browseEndpointContextMusicConfig': {'pageType': 'MUSIC_PAGE_TYPE_ARTIST'}}}}}, {'text': ' • '}, {'text': '2020'}]}, 'menu': {'menuRenderer': {'items': [{'menuNavigationItemRenderer': {'text': {'runs': [{'text': 'Shuffle play'}]}, 'icon': {'iconType': 'MUSIC_SHUFFLE'}, 'navigationEndpoint': {'clickTrackingParams': 'CAUQpzsiEwjUmeTqqYDzAhWT8GAKHabrCvw=', 'watchPlaylistEndpoint': {'playlistId': 'OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg', 'params': 'wAEB8gECKAE%3D'}}, 'trackingParams': 'CAUQpzsiEwjUmeTqqYDzAhWT8GAKHabrCvw='}}, {'menuNavigationItemRenderer': {'text': {'runs': [{'text': 'Start radio'}]}, 'icon': {'iconType': 'MIX'...
 'microformat' (2399200798832) = {dict} <class 'dict'>: {'microformatDataRenderer': {'urlCanonical': 'https://music.youtube.com/playlist?list=OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg'}}

I have been able to use the exact same procedure with many other YTM links, but this one specifically causes the problem. I have not yet been able to reproduce it with other links.
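
For reference, the "album ID extracted from the link" step can be sketched with the standard library alone. This is not the actual aiu code: the function name `playlist_id_from_link` is hypothetical, and the sketch only assumes that the ID is carried by the `list` query parameter, as in both links quoted in this issue.

```python
from urllib.parse import parse_qs, urlparse


def playlist_id_from_link(link: str) -> str:
    """Return the value of the `list` query parameter of a YTM link."""
    query = parse_qs(urlparse(link).query)
    values = query.get('list')
    if not values:
        raise ValueError(f'no playlist ID found in link: {link!r}')
    return values[0]


# The two link forms that appear in this issue
watch_link = 'https://music.youtube.com/watch?v=uHSl0Zpw2pw&list=OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg'
playlist_link = 'https://music.youtube.com/playlist?list=OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg'

print(playlist_id_from_link(watch_link))
print(playlist_id_from_link(playlist_link))
```

Both calls yield the same ID, so the `watch` and `playlist` forms of the link should be interchangeable for this purpose.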

fmigneault commented 3 years ago

Stepping deeper, I reach the following part of the code: https://github.com/tombulled/python-youtube-music/blob/048fac906c8aa6ab52ec3d39c2715503226a7543/ytm/parsers/album.py#L37-L46

The raised AssertionError is due to empty raw_mutations.
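
To illustrate why the message only says "Unknown", here is a minimal sketch of the pattern as I understand it from the traceback. It is not the library's code: the `ParserError` class, the `catch` decorator body, and the `mutations` key are all stand-ins written for this example. The point is that a bare `assert` carries no message, so a wrapper that reports `str(error)` has nothing to show.

```python
import functools


class ParserError(Exception):
    """Stand-in for the library's parser exception (hypothetical definition)."""


def catch(func):
    """Wrap any failure of a parser in a generic ParserError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as error:
            # A bare `assert` carries no message, so str(error) is empty
            reason = str(error) or 'Unknown'
            raise ParserError(
                f'{func.__name__}() encountered an error: {reason}'
            ) from error
    return wrapper


@catch
def album(data: dict) -> dict:
    # Hypothetical key: the parser expects a non-empty list of mutations
    raw_mutations = data.get('mutations', [])
    assert raw_mutations
    return {'mutation_count': len(raw_mutations)}


try:
    # A response that has 'header' and 'contents' but no mutations
    album({'header': {}, 'contents': {}})
except ParserError as error:
    message = str(error)
    cause = type(error.__cause__).__name__

print(message)
print(cause)
```

In this sketch the original AssertionError is only reachable through `__cause__`, which matches the experience of having to step into the parser to find it.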

tombulled commented 3 years ago

Hi @fmigneault, thanks for raising this issue and helping to identify the cause.

It appears that YouTube have changed the API response, so a new parser will need to be written to get this working again. Unfortunately, I am no longer actively maintaining this library, as I have focused my efforts on this library's successor, innertube.

Here's a head start for either yourself, or anyone else interested in creating an updated parser (I may be able to free up some time in the future to help out):

import ytm
import pprint

base = ytm.BaseYouTubeMusic()
dl   = ytm.YouTubeMusicDL()

album_playlist_id: str = 'OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg'

page = base.page_playlist(album_playlist_id)

album_browse_id: str = ytm.utils.get \
(
    page,
    'INITIAL_ENDPOINT',
    'browseEndpoint',
    'browseId',
)

data = base.browse(album_browse_id)

album_title = ytm.utils.get \
(
    data,
    'header',
    'musicDetailHeaderRenderer',
    'title',
    'runs',
    0,
    'text',
)

music_shelf_contents = ytm.utils.get \
(
    data,
    'contents',
    'singleColumnBrowseResultsRenderer',
    'tabs',
    0,
    'tabRenderer',
    'content',
    'sectionListRenderer',
    'contents',
    0,
    'musicShelfRenderer',
    'contents',
    default = (),
)

tracks: list = []

for music_shelf_item in music_shelf_contents:
    music_shelf_item = ytm.utils.first(music_shelf_item)

    track_title = ytm.utils.get \
    (
        music_shelf_item,
        'flexColumns',
        0,
        'musicResponsiveListItemFlexColumnRenderer',
        'text',
        'runs',
        0,
        'text',
    )

    track_video_id = ytm.utils.get \
    (
        music_shelf_item,
        'playlistItemData',
        'videoId',
    )

    tracks.append \
    (
        dict \
        (
            title    = track_title,
            video_id = track_video_id,
        )
    )

album = dict \
(
    title  = album_title,
    tracks = tracks,
)

pprint.pprint(album)

Which should output:

{'title': 'Lost in the Waves',
 'tracks': [{'title': 'Lost in a Wave', 'video_id': 'uHSl0Zpw2pw'},
            {'title': 'Rainfall', 'video_id': 'HVAIr-BhEZI'},
            {'title': 'Silent', 'video_id': 'AIT1Rs7hiAg'},
            {'title': 'Visage', 'video_id': 'NS7nEtuZwbM'},
            {'title': 'Tired of It All', 'video_id': '-MqgkbxX9Dw'},
            {'title': 'Say No Word', 'video_id': 'cXy1YodCDgQ'},
            {'title': 'Always', 'video_id': '2r39hIbe4Xk'},
            {'title': 'Shoreline', 'video_id': 'PvfQ4OkjIpE'},
            {'title': 'Overrated', 'video_id': 'BfRecUhMIhM'},
            {'title': 'Paralyzed', 'video_id': 'YNLZUinqksw'}]}
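
The head start above works by following a fixed key path through the nested response and falling back to a default when a key is missing. For anyone writing the new parser without `ytm.utils.get`, a minimal standalone sketch of that kind of lookup might look like this. The `dig` helper is hypothetical and is not claimed to match the behaviour of `ytm.utils.get` exactly; the sample dict is a trimmed copy of the `header` shown earlier in this issue.

```python
from typing import Any


def dig(data: Any, *path: Any, default: Any = None) -> Any:
    """Follow `path` through nested dicts/lists, returning `default` on a miss."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Trimmed-down response containing only the album title
response = {
    'header': {
        'musicDetailHeaderRenderer': {
            'title': {'runs': [{'text': 'Lost in the Waves'}]},
        },
    },
}

title = dig(response, 'header', 'musicDetailHeaderRenderer', 'title', 'runs', 0, 'text')
missing = dig(response, 'contents', 'singleColumnBrowseResultsRenderer', 'tabs', 0, default=())

print(title)
print(missing)
```

Returning a default instead of raising keeps the parser tolerant of further response changes, at the cost of having to check for empty results explicitly.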

On a side note, downloading has become much easier using innertube without requiring libraries such as youtube-dl. Here's a little example that outputs video data and streamable URLs:

import innertube
import pprint

client = innertube.InnerTube(innertube.Client.IOS_MUSIC)

data = client.player(video_id = 'uHSl0Zpw2pw')

pprint.pprint(data.videoDetails)
pprint.pprint(data.streamingData.adaptiveFormats)

Hope that helps, and please let me know if you have any more issues :slightly_smiling_face:

fmigneault commented 3 years ago

@tombulled
Thanks for the reply. I will take a look at the proposed corrections to adjust parsing of the data in my spare time. Regarding innertube, does it provide a method similar to download_album? Would I need to implement additional operations to obtain the files from the data returned by client.player?

tombulled commented 3 years ago

Unfortunately innertube is a lower-level library so you won't find convenience functions/methods such as download_album as I consider them out of scope for the library.

Here's a quick example of how you might go about downloading a single song using innertube. I'll try to add an example later on for downloading an entire album:

import innertube # https://github.com/tombulled/innertube

# Third-party libraries
import addict # https://github.com/mewwts/addict
import requests # https://github.com/psf/requests
import slugify # https://github.com/un33k/python-slugify
import mutagen.mp4, mutagen.easymp4 # https://github.com/quodlibet/mutagen

# Standard libraries
import mimetypes
import pathlib
import shutil

# Add M4A to the `mimetypes` standard library
# as it's not included by default
mimetypes.add_type('audio/mp4', '.m4a')

# LANDMVRKS - Lost in a Wave
video_id = 'AbAuyD3S818'

# Two clients are used
#   - WEB_REMIX for video metadata
#   - IOS_MUSIC for streamable URLs
# API responses differ between clients so they can
# be easily mixed and matched to get the data you need
web = innertube.InnerTube(innertube.Client.WEB_REMIX)
ios = innertube.InnerTube(innertube.Client.IOS_MUSIC)

# Dispatch `player` requests to the InnerTube API
web_player = web.player(video_id = video_id)
ios_player = ios.player(video_id = video_id)

details = web_player.videoDetails

# For simplicity we'll pick the last format,
# however you may desire to be more selective
format = ios_player.streamingData.adaptiveFormats[-1]

# Construct file name
path = pathlib.Path \
(
    '{name}{extension}'.format \
    (
        name      = slugify.slugify(details.title),
        extension = mimetypes.guess_extension(format.mimeType.split(';')[0]),
    ),
)

# Download the audio file
with requests.get(format.url, stream = True) as response:
    with path.open('wb') as file:
        shutil.copyfileobj(response.raw, file)

# Download the album art and store its contents in memory
cover_image = requests.get(details.thumbnail.thumbnails[-1].url).content

# Create MP4 metadata tag containers
tags_easymp4 = mutagen.easymp4.EasyMP4()
tags_mp4     = mutagen.mp4.MP4()

# Add tags compatible with EasyMP4
tags_easymp4.update \
(
    dict \
    (
        title       = details.title,
        artist      = details.author,
        album       = 'Album Name Here', # Could just be omitted entirely if unknown
        albumartist = details.author,
        discnumber  = str(1),
        tracknumber = str(1),
    )
)

# Add tags incompatible with EasyMP4
tags_mp4.update \
(
    dict \
    (
        covr = \
        (
            mutagen.mp4.MP4Cover \
            (
                data        = cover_image,
                imageformat = mutagen.mp4.AtomDataType.JPEG,
            ),
        ),
    ),
)

# Write metadata tags
tags_easymp4.save(path)
tags_mp4.save(path)
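
The script above simply takes the last entry of `adaptiveFormats`. If you want to be more selective, one possible approach is to keep only the audio formats and take the one with the highest bitrate. This is a sketch under assumptions: the `mimeType` field is used in the script above, but the `bitrate` field and the sample entries (including their `itag` values) are made up for illustration, so check them against a real response.

```python
def pick_audio_format(formats: list) -> dict:
    """Return the audio-only format with the highest bitrate."""
    audio_formats = [
        fmt for fmt in formats
        if fmt.get('mimeType', '').startswith('audio/')
    ]
    if not audio_formats:
        raise ValueError('no audio-only formats available')
    # Entries without a `bitrate` field sort last
    return max(audio_formats, key=lambda fmt: fmt.get('bitrate', 0))


# Made-up sample entries shaped like `streamingData.adaptiveFormats`
sample_formats = [
    {'itag': 1, 'mimeType': 'video/mp4; codecs="avc1.4d401e"', 'bitrate': 500000},
    {'itag': 2, 'mimeType': 'audio/mp4; codecs="mp4a.40.5"', 'bitrate': 50000},
    {'itag': 3, 'mimeType': 'audio/mp4; codecs="mp4a.40.2"', 'bitrate': 130000},
]

best = pick_audio_format(sample_formats)
print(best['itag'], best['mimeType'])
```

In the script this would stand in for the `adaptiveFormats[-1]` lookup, for example `pick_audio_format(ios_player.streamingData.adaptiveFormats)`, provided the entries support `.get` the way plain dicts do.
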

tombulled commented 3 years ago

Here's an example to download an album/playlist:

import innertube # https://github.com/tombulled/innertube

# Third-party libraries
import addict # https://github.com/mewwts/addict
import furl # https://github.com/gruns/furl
import requests # https://github.com/psf/requests
import rich.console # https://github.com/willmcgugan/rich
import slugify # https://github.com/un33k/python-slugify
import mutagen.mp4, mutagen.easymp4 # https://github.com/quodlibet/mutagen

# Standard libraries
import mimetypes
import pathlib
import shutil

# Initialise a rich console for status updates
console = rich.console.Console()

# Add M4A to the `mimetypes` standard library
# as it's not included by default
mimetypes.add_type('audio/mp4', '.m4a')

# InnerTube API client imitating iOS device
client = innertube.InnerTube(innertube.Client.IOS_MUSIC)

# URL to YouTube Music playlist
url = 'https://music.youtube.com/playlist?list=OLAK5uy_mBL38OIl1pSnLmoPXS8JuMTku9ojLc3Yg'

# Extract query parameters from the URL
params = addict.Dict(** furl.furl(url).query.params)

# Get the queue of tracks from the playlist
queue = client.music_get_queue(playlist_id = params.list)

# Extract the tracks from the queue response
tracks = \
[
    item.content.playlistPanelVideoRenderer
    for item in queue.queueDatas
]

with console.status("[bold green]Downloading tracks...") as status:
    for track in tracks:
        # Dispatch a `player` request to fetch streamable URLs
        player = client.player(video_id = track.videoId)

        # Extract video details
        details = player.videoDetails

        # For simplicity we'll pick the last format,
        # however you may desire to be more selective
        format = player.streamingData.adaptiveFormats[-1]

        # Construct file name
        path = pathlib.Path \
        (
            '{name}{extension}'.format \
            (
                name      = slugify.slugify(details.title),
                extension = mimetypes.guess_extension(format.mimeType.split(';')[0]),
            ),
        )

        # Download the audio file
        with requests.get(format.url, stream = True) as response:
            with path.open('wb') as file:
                shutil.copyfileobj(response.raw, file)

        # Download the album art and store its contents in memory
        cover_image = requests.get(details.thumbnail.thumbnails[-1].url).content

        # Create MP4 metadata tag containers
        tags_easymp4 = mutagen.easymp4.EasyMP4()
        tags_mp4     = mutagen.mp4.MP4()

        # Tags compatible with EasyMP4
        tags_easymp4.update \
        (
            dict \
            (
                title       = details.title,
                artist      = details.author,
                albumartist = details.author,
                discnumber  = str(1),
                tracknumber = str(1),
            )
        )

        # Tags incompatible with EasyMP4
        tags_mp4.update \
        (
            dict \
            (
                covr = \
                (
                    mutagen.mp4.MP4Cover \
                    (
                        data        = cover_image,
                        imageformat = mutagen.mp4.AtomDataType.JPEG,
                    ),
                ),
            ),
        )

        # Write metadata tags
        tags_easymp4.save(path)
        tags_mp4.save(path)

        # Make a status update logging successful download of the file
        console.log(f'Downloaded {details.title!r} -> {path.absolute()!s}')
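
Note that the example above tags every file with track number 1. If the order of the queue matches the album order (an assumption worth verifying), per-track numbers could be derived with `enumerate`. A minimal sketch, using plain dicts with a `title` key as stand-ins for the real track objects and a hypothetical `numbered_tags` helper:

```python
def numbered_tags(tracks: list) -> list:
    """Build per-track tag dicts with a 1-based track number."""
    tagged = []
    for number, track in enumerate(tracks, start=1):
        tagged.append({
            'title': track['title'],
            # Stored as a string, like the tag values in the example above
            'tracknumber': str(number),
        })
    return tagged


# First three tracks of the album discussed in this issue
sample_tracks = [
    {'title': 'Lost in a Wave'},
    {'title': 'Rainfall'},
    {'title': 'Silent'},
]

tags = numbered_tags(sample_tracks)
for tag in tags:
    print(tag['tracknumber'], tag['title'])
```

The same `enumerate(tracks, start=1)` loop header could replace `for track in tracks:` in the download loop, with the counter passed to `tracknumber`.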