Benjamin-Loison / YouTube-operational-API

YouTube operational API works when YouTube Data API v3 fails.

`commentThreads?part=snippet&videoId=VIDEO_ID` does not work anymore #286

Open Benjamin-Loison opened 4 months ago

Benjamin-Loison commented 4 months ago

As someone asked on Discord.

diff --git a/commentThreads.php b/commentThreads.php
index 2dd2df1..2e8689f 100644
--- a/commentThreads.php
+++ b/commentThreads.php
@@ -125,9 +125,9 @@ function getAPI($videoId, $commentId, $order, $continuationToken, $simulatedCont
         $texts = $comment['contentText']['runs'];
         $replies = $commentThread['replies'];
         $commentRepliesRenderer = $replies['commentRepliesRenderer'];
-        $text = implode(array_map(fn($text) => $text['text'], $texts));
+        $text = 'TEST';//implode(array_map(fn($text) => $text['text'], $texts));
         $commentId = $comment['commentId'];
-        $isHearted = array_key_exists('creatorHeart', $comment['actionButtons']['commentActionButtonsRenderer']);
+        $isHearted = false;//array_key_exists('creatorHeart', $comment['actionButtons']['commentActionButtonsRenderer']);
         $publishedAt = $comment['publishedTimeText']['runs'][0]['text'];
         $publishedAt = str_replace(' (edited)', '', $publishedAt, $count);
         $wasEdited = $count > 0;
@@ -144,7 +144,7 @@ function getAPI($videoId, $commentId, $order, $continuationToken, $simulatedCont
             'likeCount' => getIntValue(getValue($comment, 'voteCount/simpleText', defaultValue: 0)),
             'publishedAt' => $publishedAt,
             'wasEdited' => $wasEdited,
-            'isPinned' => array_key_exists('pinnedCommentBadge', $comment),
+            'isPinned' => false,//array_key_exists('pinnedCommentBadge', $comment),
             'authorIsChannelOwner' => $comment['authorIsChannelOwner'],
             'videoCreatorHasReplied' => $commentRepliesRenderer !== null && array_key_exists('viewRepliesCreatorThumbnail', $commentRepliesRenderer),
             // Could add the video creator thumbnails.

returns a very unsatisfying result, so maybe just an upper path changed somewhere.

Benjamin-Loison commented 4 months ago

[Screenshot: 2024-07-08 14:31:08]

getJSONPathFromKey dQw4w9WgXcQ.json | grep 'Amazing, crazy'
103 /frameworkUpdates/entityBatchUpdate/mutations/1/payload/commentEntityPayload/properties/content/content 1 BILLION views for Never Gonna Give You Up!  Amazing, crazy, wonderful! Rick ♥️
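
The search above can be approximated in a few lines of Python. This is only a sketch of the idea (walk the JSON and report the path of every string containing a given text); `find_json_paths` is a hypothetical helper, not the implementation of `getJSONPathFromKey`, and the sample data is a made-up reduction of the real response.

```python
def find_json_paths(node, needle, path=''):
    """Yield (path, value) for every string leaf that contains needle."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from find_json_paths(child, needle, f'{path}/{key}')
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_json_paths(child, needle, f'{path}/{index}')
    elif isinstance(node, str) and needle in node:
        yield path, node


# Made-up sample mirroring only the shape of the path reported above.
sample = {
    'frameworkUpdates': {
        'entityBatchUpdate': {
            'mutations': [
                {'payload': {}},
                {'payload': {'commentEntityPayload': {'properties': {
                    'content': {'content': 'Amazing, crazy, wonderful!'}
                }}}},
            ]
        }
    }
}

matches = list(find_json_paths(sample, 'Amazing, crazy'))
for found_path, found_value in matches:
    print(found_path, found_value)
```

With a real response saved as JSON, the same function could be applied to the result of `json.load`.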
Benjamin-Loison commented 4 months ago

http://localhost/YouTube-operational-API/commentThreads?part=snippet,replies&videoId=373TksynowI
http://localhost/YouTube-operational-API/commentThreads?part=snippet,replies&id=UgyIHL-bHj3fgnXduOd4AaABAg&videoId=373TksynowI
http://localhost/YouTube-operational-API/commentThreads?part=snippet,replies&id=UgyIHL-bHj3fgnXduOd4AaABAg&videoId=373TksynowI&order=time

Should also verify pageToken.

Apart from possibly pageToken, this endpoint does not seem to work at all.

On my Linux Mint 21.3 Cinnamon Framework 13, I do not have any pageToken history entry for this endpoint, so let us rewrite this endpoint while trying to leave the potentially working pageToken code untouched.

Benjamin-Loison commented 4 months ago

Ideally, I should probably update the Stack Overflow question I have answered.

Benjamin-Loison commented 4 months ago

Related to Improve_websites_thanks_to_open_source/issues/756.

Benjamin-Loison commented 4 months ago
import requests
import json
import blackboxprotobuf
import base64

url = 'https://www.youtube.com/youtubei/v1/next'

def getBase64Protobuf(message, typedef):
    data = blackboxprotobuf.encode_message(message, typedef)
    return base64.b64encode(data).decode('ascii')

message = {
    '2': {
        '2': '373TksynowI'
    },
    '3': 6,
    '6': {
        '3': {
            '2': 'UgykNvQf9i9iPnhPgct4AaABAg',
            '5': 'UCt5USYpzzMCYhkirVQGHwKQ',
            '6': '373TksynowI',
        },
    }
}

typedef = {
    '2': {
        'type': 'message',
        'message_typedef': {
            '2': {
                'type': 'string'
            }
        },
        'field_order': [
            '2'
        ]
    },
    '3': {
        'type': 'int'
    },
    '6': {
        'type': 'message',
        'message_typedef': {
            '3': {
                'type': 'message',
                'message_typedef': {
                    '2': {
                        'type': 'string'
                    },
                    '5': {
                        'type': 'string'
                    },
                    '6': {
                        'type': 'string'
                    },
                },
                'field_order': [
                    '2',
                    '5',
                    '6',
                ]
            },
        },
        'field_order': [
            '3',
        ]
    }
}

continuation = getBase64Protobuf(message, typedef)

data = {
    'context': {
        'client': {
            'clientName': 'WEB',
            'clientVersion': '2.20240325.01.00'
        }
    },
    'continuation': continuation
}

data = requests.post(url, json = data).json()
#print(json.dumps(data, indent = 4))
print('Pas trop effectivement' in str(data))

is a minimized script to list the first replies.
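
The continuation is an ordinary protobuf message, so for messages this simple it can also be built with the standard library only. The following is a sketch under assumptions: `encode_varint`, `encode_message` and `to_continuation` are hypothetical helpers (not part of blackboxprotobuf), integers are assumed non-negative and encoded as varints, and strings and nested dicts as length-delimited fields. It uses `base64.b64encode` as the script above does.

```python
import base64


def encode_varint(value):
    """Encode a non-negative integer as a protobuf varint."""
    if value < 0:
        raise ValueError('this sketch only handles non-negative integers')
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_message(message):
    """Encode a dict of field number -> int, str or nested dict."""
    out = bytearray()
    for number, value in message.items():
        number = int(number)
        if isinstance(value, int):
            # Wire type 0: varint.
            out += encode_varint(number << 3) + encode_varint(value)
        else:
            # Wire type 2: length-delimited (string or nested message).
            payload = encode_message(value) if isinstance(value, dict) else value.encode('utf-8')
            out += encode_varint(number << 3 | 2) + encode_varint(len(payload)) + payload
    return bytes(out)


def to_continuation(message):
    return base64.b64encode(encode_message(message)).decode('ascii')


# Same message as in the script above.
message = {
    '2': {'2': '373TksynowI'},
    '3': 6,
    '6': {
        '3': {
            '2': 'UgykNvQf9i9iPnhPgct4AaABAg',
            '5': 'UCt5USYpzzMCYhkirVQGHwKQ',
            '6': '373TksynowI',
        },
    },
}

continuation = to_continuation(message)
```

No typedef is needed here because the Python type of each value decides the wire type; that shortcut would not hold for fixed-width or signed fields.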

Benjamin-Loison commented 4 months ago

https://www.youtube.com/watch?v=373TksynowI&lc=UgykNvQf9i9iPnhPgct4AaABAg
https://www.youtube.com/watch?v=373TksynowI&lc=UgyIHL-bHj3fgnXduOd4AaABAg

Benjamin-Loison commented 4 months ago
import requests
import json
import blackboxprotobuf
import base64

url = 'https://www.youtube.com/youtubei/v1/next'

def getBase64Protobuf(message, typedef):
    data = blackboxprotobuf.encode_message(message, typedef)
    return base64.b64encode(data).decode('ascii')

message = {
    '2': {
        '2': '373TksynowI'
    },
    '3': 6,
    '6': {
        '3': {
            '2': 'UgyIHL-bHj3fgnXduOd4AaABAg',
            '5': 'UCt5USYpzzMCYhkirVQGHwKQ',
            '6': '373TksynowI',
            '9': 50,
        },
    }
}

typedef = {
    '2': {
        'type': 'message',
        'message_typedef': {
            '2': {
                'type': 'string'
            }
        },
        'field_order': [
            '2'
        ]
    },
    '3': {
        'type': 'int'
    },
    '6': {
        'type': 'message',
        'message_typedef': {
            '3': {
                'type': 'message',
                'message_typedef': {
                    '2': {
                        'type': 'string'
                    },
                    '5': {
                        'type': 'string'
                    },
                    '6': {
                        'type': 'string'
                    },
                    '9': {
                        'type': 'int'
                    },
                },
                'field_order': [
                    '2',
                    '5',
                    '6',
                    '9',
                ]
            },
        },
        'field_order': [
            '3',
        ]
    }
}

continuation = getBase64Protobuf(message, typedef)

data = {
    'context': {
        'client': {
            'clientName': 'WEB',
            'clientVersion': '2.20240325.01.00'
        }
    },
    'continuation': continuation
}

data = requests.post(url, json = data).json()
#print(json.dumps(data, indent = 4))
print('stonks' in str(data))

is a minimized script to list the following replies.

Benjamin-Loison commented 4 months ago

Except for get_ranked_streams (my usual method at innertube-documentation/issues/1#issuecomment-1688923923 does not work out of the box, even after adding one or two =), the following script to list further comments is minimized:

import requests
import json
import blackboxprotobuf
import base64

url = 'https://www.youtube.com/youtubei/v1/next'

def getBase64Protobuf(message, typedef):
    data = blackboxprotobuf.encode_message(message, typedef)
    return base64.b64encode(data).decode('ascii')

message = {
    '2': {
        '2': '373TksynowI'
    },
    '3': 6,
    '6': {
        '1': 'get_ranked_streams--CENSORED',
        '4': {
            '4': '373TksynowI',
        },
        '8': 'comments-section'
    }
}

typedef = {
    '2': {
        'type': 'message',
        'message_typedef': {
            '2': {
                'type': 'string'
            }
        },
        'field_order': [
            '2'
        ]
    },
    '3': {
        'type': 'int'
    },
    '6': {
        'type': 'message',
        'message_typedef': {
            '1': {
                'type': 'string'
            },
            '4': {
                'type': 'message',
                'message_typedef': {
                    '4': {
                        'type': 'string'
                    },
                    '6': {
                        'type': 'int'
                    },
                    '15': {
                        'type': 'int'
                    }
                },
                'field_order': [
                    '4',
                    '6',
                    '15'
                ]
            },
            '5': {
                'type': 'int'
            },
            '8': {
                'type': 'string'
            }
        },
        'field_order': [
            '1',
            '4',
            '5',
            '8'
        ]
    }
}

continuation = getBase64Protobuf(message, typedef)

data = {
    'context': {
        'client': {
            'clientName': 'WEB',
            'clientVersion': '2.20240325.01.00'
        }
    },
    'continuation': continuation
}

data = requests.post(url, json = data).json()
#print(json.dumps(data, indent = 4))
print('UgyYQbEfvbN2hP54EV14AaABAg' in str(data))

CENSORED being:

-----BEGIN PGP MESSAGE-----

hF4DTQa9Wom5MBgSAQdAIxHJy7uGo4wCDXAE7LxYx21q8zRKvFspOgjGxFSI2lcw
IPoKglya9t42Kv6qDeiyZ2xGq8XehBO0UT+MZWISAlDGV8LQpAT6tl1CuqxYK2UJ
0ukBWocahzt5Co7IHLV7NbKM7IBxjg47DCTX8c2NOPe2dXOmaQD7A738VrQGTRMs
y6Hnb14OULMyYulFrBuCq9Pb0ZFc7z35W3wt/mOc1X/iep7fqwxzwobIzHl+l3R+
ZoMyMFI3GxoCeUQDoIdUya5sc269lg7ZxTakRCOK0ah25sDYzILLtjTiPRIdQyfc
WHjRpL+7C5qVNOFUvTQeQxxYwOmFVNNa5tw0AA7/y/UfOSrbm0JOLN55zPLMQaUT
CgV6DwGXJ8B8sh6DQ4tt+67NWdbubzFniqMp3p4/7ppinvnUErvqUD5aHWXOq5ny
FN8xO4ObHYYlPvCnJ2w4RHi6Uow9Obl8jr2oX8ru8WTTSabEgLvqZ7qzvXaLh4GZ
WvCcG7j/M1gnohCRcM1sonF+3AJlA6NwYftk/aY8QN1ZZWDJX7GGXVn6upq7OagU
WHK3aKVIB5Xkamel1dh7u9Fh5i2vkgDpowR+WNpF7mXf4bLkJEQ4jiAHhcARf4tp
4eQ42AE2y9h+ESkMm2LEROkS6mWAi+KsifCh8shQsUDbd8sSRNFsEl8UPYdTxTH9
PNx+Tza7PGyzxZWfBX0EuG8hmiHFe1C7b9WPGuFbzSFpkwUndgbQZl2jJDmyGDve
s/Ny13ftM5zBIc/QrIkan84ZG6jCmcliHfaWU6VSOz8pyheUGlzzVklzMnEYXdJR
FsX/rYmJLTSAuQ==
=+joE
-----END PGP MESSAGE-----
Benjamin-Loison commented 4 months ago
diff --git a/commentThreads.php b/commentThreads.php
index 2dd2df1..89a4ae1 100644
--- a/commentThreads.php
+++ b/commentThreads.php
@@ -107,50 +107,40 @@ function getAPI($videoId, $commentId, $order, $continuationToken, $simulatedCont
     }

     $answerItems = [];
-    $onResponseReceivedEndpoints = $result['onResponseReceivedEndpoints'];
-    $reloadContinuationItems = $onResponseReceivedEndpoints[1]['reloadContinuationItemsCommand']['continuationItems'];
-    $appendContinuationItems = $onResponseReceivedEndpoints[0]['appendContinuationItemsAction']['continuationItems'];
-    $items = array_merge($reloadContinuationItems !== null ? $reloadContinuationItems : [], $appendContinuationItems !== null ? $appendContinuationItems : []);
-    if ($items !== [] && array_key_exists('continuationItemRenderer', end($items))) {
-        $continuationItemRenderer = end($items)['continuationItemRenderer'];
-        $commonSuffix = 'continuationCommand/token';
-        $nextContinuationToken = urldecode(getValue($continuationItemRenderer, "continuationEndpoint/$commonSuffix", "button/buttonRenderer/command/$commonSuffix"));
-        $items = array_slice($items, 0, count($items) - 1);
-    }
+    $items = $result['frameworkUpdates']['entityBatchUpdate']['mutations'];
     $isTopLevelComment = true;
-    foreach ($items as $item) {
-        $commentThread = $item['commentThreadRenderer'];
-        $isTopLevelComment = array_key_exists('commentThreadRenderer', $item);
-        $comment = ($isTopLevelComment ? $commentThread['comment'] : $item)['commentRenderer'];
-        $texts = $comment['contentText']['runs'];
-        $replies = $commentThread['replies'];
-        $commentRepliesRenderer = $replies['commentRepliesRenderer'];
-        $text = implode(array_map(fn($text) => $text['text'], $texts));
-        $commentId = $comment['commentId'];
-        $isHearted = array_key_exists('creatorHeart', $comment['actionButtons']['commentActionButtonsRenderer']);
-        $publishedAt = $comment['publishedTimeText']['runs'][0]['text'];
+    foreach ($items as $item) {
+        $payload = $item['payload'];
+        if (array_key_exists('engagementToolbarStateEntityPayload', $payload)) {
+            $answerItems[$item['entityKey']]['snippet']['topLevelComment']['snippet']['creatorHeart'] = $payload['engagementToolbarStateEntityPayload']['heartState'] == 'TOOLBAR_HEART_STATE_HEARTED';
+        }
+        if (!array_key_exists('commentEntityPayload', $payload)) {
+            continue;
+        }
+        $comment = $payload['commentEntityPayload'];
+        $properties = $comment['properties'];
+        $author = $comment['author'];
+        $toolbar = $comment['toolbar'];
+        $publishedAt = $properties['publishedTime'];
         $publishedAt = str_replace(' (edited)', '', $publishedAt, $count);
-        $wasEdited = $count > 0;
-        $replyCount = $comment['replyCount'];
-        $author = $comment['authorText']['simpleText'];
-        $isAuthorAHandle = $author[0] === '@';
         $internalSnippet = [
-            'textOriginal' => $text,
-            'isHearted' => $isHearted,
-            'authorName' => $isAuthorAHandle ? null : $author,
-            'authorHandle' => $isAuthorAHandle ? $author : null,
-            'authorProfileImageUrls' => $comment['authorThumbnail']['thumbnails'],
-            'authorChannelId' => ['value' => $comment['authorEndpoint']['browseEndpoint']['browseId']],
-            'likeCount' => getIntValue(getValue($comment, 'voteCount/simpleText', defaultValue: 0)),
+            'content' => $properties['content']['content'],
             'publishedAt' => $publishedAt,
-            'wasEdited' => $wasEdited,
-            'isPinned' => array_key_exists('pinnedCommentBadge', $comment),
-            'authorIsChannelOwner' => $comment['authorIsChannelOwner'],
-            'videoCreatorHasReplied' => $commentRepliesRenderer !== null && array_key_exists('viewRepliesCreatorThumbnail', $commentRepliesRenderer),
-            // Could add the video creator thumbnails.
-            'totalReplyCount' => $replyCount !== null ? intval($replyCount) : 0,
-            'nextPageToken' => urldecode($replies['commentRepliesRenderer']['contents'][0]['continuationItemRenderer']['continuationEndpoint']['continuationCommand']['token'])
+            'wasEdited' => $count > 0,
+            'authorChannelId' => $author['channelId'],
+            'authorHandle' => $author['displayName'],
+            'authorName' => str_replace('❤ by ', '', $toolbar['heartActiveTooltip']),
+            'authorAvatar' => $comment['avatar']['image']['sources'][0],
+            'isCreator' => $author['isCreator'],
+            'isArtist' => $author['isArtist'],
+            'likeCount' => getIntValue($toolbar['likeCountLiked']),
+            'totalReplyCount' => intval($toolbar['replyCount']),
+            'videoCreatorHasReplied' => false,
+            'isPinned' => false,
         ];
+
+        //$replies = $commentThread['replies'];
+        $commentId = $properties['commentId'];
         $answerItem = [
             'kind' => 'youtube#comment' . ($isTopLevelComment ? 'Thread' : ''),
             'etag' => 'NotImplemented',
@@ -164,8 +154,21 @@ function getAPI($videoId, $commentId, $order, $continuationToken, $simulatedCont
                 ]
             ] : $internalSnippet)
         ];
-        array_push($answerItems, $answerItem);
+        $answerItems[$properties['toolbarStateKey']] = $answerItem;
     }
+    foreach ($result['onResponseReceivedEndpoints'][1]['reloadContinuationItemsCommand']['continuationItems'] as $item) {
+        $commentThreadRenderer = $item['commentThreadRenderer'];
+        $toolbarStateKey = $commentThreadRenderer['commentViewModel']['commentViewModel']['toolbarStateKey'];
+        // How to avoid repeating path?
+        if (doesPathExist($commentThreadRenderer, 'replies/commentRepliesRenderer/viewRepliesCreatorThumbnail')) {
+            $answerItems[$toolbarStateKey]['snippet']['topLevelComment']['snippet']['videoCreatorHasReplied'] = true;
+        }
+        if (doesPathExist($commentThreadRenderer, 'commentViewModel/commentViewModel/pinnedText')) {
+            $answerItems[$toolbarStateKey]['snippet']['topLevelComment']['snippet']['isPinned'] = true;
+        }
+    }
+    $answerItems = array_values($answerItems);
+
     $answer = [
         'kind' => 'youtube#comment' . ($isTopLevelComment ? 'Thread' : '') . 'ListResponse',
         'etag' => 'NotImplemented',
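
The parsing logic of this diff can be summarized with a short Python sketch. The key names (`frameworkUpdates`, `entityBatchUpdate`, `mutations`, `commentEntityPayload`, `engagementToolbarStateEntityPayload`, `toolbarStateKey`, `entityKey`, `heartState`) are taken from the diff; `extract_comments` and the sample response are hypothetical, and only a few fields are kept. Unlike the PHP code, the heart states are collected separately and merged at the end, so the result does not depend on the order of the mutations.

```python
HEARTED = 'TOOLBAR_HEART_STATE_HEARTED'


def extract_comments(response):
    """Build a list of comments from the entity mutations of a response."""
    mutations = response['frameworkUpdates']['entityBatchUpdate']['mutations']
    comments = {}  # toolbarStateKey -> comment
    hearts = {}    # entityKey -> whether the creator hearted the comment
    for mutation in mutations:
        payload = mutation['payload']
        if 'engagementToolbarStateEntityPayload' in payload:
            state = payload['engagementToolbarStateEntityPayload']
            hearts[mutation['entityKey']] = state['heartState'] == HEARTED
        if 'commentEntityPayload' not in payload:
            continue
        comment = payload['commentEntityPayload']
        properties = comment['properties']
        published_at = properties['publishedTime']
        comments[properties['toolbarStateKey']] = {
            'id': properties['commentId'],
            'content': properties['content']['content'],
            'publishedAt': published_at.replace(' (edited)', ''),
            'wasEdited': ' (edited)' in published_at,
            'authorChannelId': comment['author']['channelId'],
            'isHearted': False,
        }
    # Merge heart states into the comments they belong to.
    for key, hearted in hearts.items():
        if key in comments:
            comments[key]['isHearted'] = hearted
    return list(comments.values())


# Made-up sample response, reduced to the keys used above.
sample_response = {'frameworkUpdates': {'entityBatchUpdate': {'mutations': [
    {'entityKey': 'toolbar-1', 'payload': {'engagementToolbarStateEntityPayload': {
        'heartState': HEARTED}}},
    {'entityKey': 'comment-1', 'payload': {'commentEntityPayload': {
        'properties': {
            'commentId': 'COMMENT_ID',
            'content': {'content': 'Hello'},
            'publishedTime': '2 days ago (edited)',
            'toolbarStateKey': 'toolbar-1',
        },
        'author': {'channelId': 'CHANNEL_ID'},
    }}},
]}}}

comments = extract_comments(sample_response)
```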
Benjamin-Loison commented 2 months ago

Let us now focus on pagination to solve the Stack Overflow question 78877647.

Was there an initial Stack Overflow question that my first implementation solved? It seems that it was the Stack Overflow question 74936220, according to the commit 2b574ab1ba85513067a9bbc537edc606179dfe1e.

I verified the issue with https://yt.lemnoslife.com/noKey/commentThreads?part=snippet,replies&videoId=wLSUDSNqLgQ.

I verified that we cannot get details and replies from the comment ids with the YouTube Data API v3 CommentThreads: list and Comments: list endpoints. They do not work with https://www.youtube.com/watch?v=wLSUDSNqLgQ&lc=UgyyaVh4LgiYOTJBJ8p4AaABAg but work fine with https://www.youtube.com/watch?v=WLfF-mAKpLY&lc=UgyS762e1ZpWrKp0ibR4AaABAg.

Note that it seems that the Stack Overflow OP is not interested in time order.

curl -s 'https://yt.lemnoslife.com/commentThreads?part=replies&videoId=wLSUDSNqLgQ' | jq .items[].snippet.topLevelComment.snippet.content
curl -s 'https://yt.lemnoslife.com/commentThreads?part=replies&videoId=wLSUDSNqLgQ' | jq .items[].id
curl -s 'https://yt.lemnoslife.com/commentThreads?part=replies&videoId=wLSUDSNqLgQ&pageToken=Eg0SC3dMU1VEU05xTGdRGAYy1AIKqgJnZXRfcmFua2VkX3N0cmVhbXMtLUNxWUJDSUFFRlJlMzBUZ2Ftd0VLbGdFSTJGOFFnQVFZQnlLTEFUdGltSUlsb19qYXNIcmowbnpuZF96QUhleU5LTmtGaHkycnRNTHNtVl9TM1l1c0JWODJtUEVWd09VR0JJLTZZOWFVSEI2SXVtdkdDWTNKbVFIZzRpYWdBazlZbzg5VXkwUTg0NDd3UFpOOGFRbHJEV2FWcU9leDBzakhyRXBTRUtrZ3VvcXcwdUVNMFIwRXBxQlJDVFVwMG12eUZLb3JIdDBtQ09GMmhmTU1wUDR4TExiMC1DTGNFQm9RRkJJRkNJZ2dHQUFTQlFpb0lCZ0FFZ1VJaHlBWUFCSUZDSWtnR0FBU0J3aUZJQkFVR0FFWUFBIhEiC3dMU1VEU05xTGdRMAB4ASgUQhBjb21tZW50cy1zZWN0aW9u' | jq .items[].id

So it works fine even for videos whose comments are disabled according to YouTube Data API v3.
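
The manual pageToken chaining of the curl commands above could be automated along these lines. This is a sketch under assumptions: it supposes that each response exposes the next token as a top-level `nextPageToken` next to `items`, which is not confirmed in this thread, and `fetch_page` and `iterate_comment_threads` are hypothetical names. The fetch function is injectable so the loop can be tried without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = 'https://yt.lemnoslife.com/commentThreads'


def fetch_page(params):
    """Fetch one page of results from the instance (network access)."""
    with urlopen(f'{BASE_URL}?{urlencode(params)}') as response:
        return json.load(response)


def iterate_comment_threads(video_id, fetch=fetch_page, max_pages=10):
    """Yield items page by page, following the assumed nextPageToken key."""
    params = {'part': 'replies', 'videoId': video_id}
    for _ in range(max_pages):
        page = fetch(params)
        yield from page.get('items', [])
        token = page.get('nextPageToken')  # assumption: top-level key
        if not token:
            break
        params = {**params, 'pageToken': token}


# Offline usage example with a fake two-page result.
fake_pages = {
    None: {'items': [{'id': 'a'}, {'id': 'b'}], 'nextPageToken': 'TOKEN_1'},
    'TOKEN_1': {'items': [{'id': 'c'}]},
}


def fake_fetch(params):
    return fake_pages[params.get('pageToken')]


ids = [item['id'] for item in iterate_comment_threads('wLSUDSNqLgQ', fetch=fake_fetch)]
```

The `max_pages` bound is only there to keep an unexpected token loop from running forever.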

Benjamin-Loison commented 2 months ago

http://localhost/YouTube-operational-API/commentThreads?part=replies&videoId=mWdFMNQBcjs

http://localhost/YouTube-operational-API/commentThreads?part=replies&videoId=mWdFMNQBcjs&pageToken=Eg0SC21XZEZNTlFCY2pzGAYygwEaUBIaVWd6VDlCQTl1UWhYdzA1UTJJcDRBYUFCQWciAggAKhhVQ3ZfTHFGSS0wdk1WWWdOUjNUZUIzelEyC21XZEZNTlFCY2pzQAFICoIBAggBQi9jb21tZW50LXJlcGxpZXMtaXRlbS1VZ3pUOUJBOXVRaFh3MDVRMklwNEFhQUJBZw%3D%3D

works fine to list the replies to the first comment.
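
To check what such a pageToken carries, it can be decoded with the standard library only. The sketch below is a generic protobuf wire-format reader, not the endpoint's own code; `read_varint`, `parse_fields` and `decode_page_token` are hypothetical helpers, and nested messages are returned as raw bytes that can be passed to `parse_fields` again.

```python
import base64
from urllib.parse import unquote


def read_varint(buf, pos):
    """Read a varint starting at pos, return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_fields(buf):
    """Return the fields of one message level as (number, value) pairs."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 7
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited: string or nested message
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        elif wire_type in (1, 5):  # fixed 64-bit or fixed 32-bit
            size = 8 if wire_type == 1 else 4
            value = buf[pos:pos + size]
            pos += size
        else:
            raise ValueError(f'unsupported wire type {wire_type}')
        fields.append((number, value))
    return fields


def decode_page_token(token):
    """Percent-decode and base64-decode a pageToken, then parse it."""
    token = unquote(token)
    padded = token + '=' * (-len(token) % 4)
    return parse_fields(base64.urlsafe_b64decode(padded))


page_token = (
    'Eg0SC21XZEZNTlFCY2pzGAYygwEaUBIaVWd6VDlCQTl1UWhYdzA1UTJJcDRBYUFCQWci'
    'AggAKhhVQ3ZfTHFGSS0wdk1WWWdOUjNUZUIzelEyC21XZEZNTlFCY2pzQAFICoIBAggB'
    'Qi9jb21tZW50LXJlcGxpZXMtaXRlbS1VZ3pUOUJBOXVRaFh3MDVRMklwNEFhQUJBZw%3D%3D'
)
fields = decode_page_token(page_token)
replies_fields = parse_fields(fields[2][1])
```

For the token above, the top level has fields 2, 3 and 6: field 2 wraps the video id, field 3 is 6, and field 6 holds the reply-specific part, which matches the layout of the minimized scripts.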

Note that in fact it does not seem that the Stack Overflow OP wants this feature but here it is anyway.

Note that at the end I will have to check with https://www.youtube.com/watch?v=wLSUDSNqLgQ just to make sure.