EtorixDev closed this 2 months ago
Hi, apologies for the late reply. I'll take a look into this now.
I've been able to reproduce the ability to list the first n comments (either "top" or "newest").
Here's the (admittedly lashed together) script I used:
from innertube import InnerTube

ENGAGEMENT_SECTION_COMMENTS = "engagement-panel-comments-section"
COMMENTS_TOP = "Top comments"
COMMENTS_NEWEST = "Newest first"


def parse_text(text):
    # Join the "runs" of a text object into a single string
    return "".join(run["text"] for run in text["runs"])


def extract_engagement_panels(next_data):
    # Index the engagement panels by their targetId
    engagement_panels = {}
    raw_engagement_panels = next_data.get("engagementPanels", [])
    for raw_engagement_panel in raw_engagement_panels:
        engagement_panel = raw_engagement_panel.get(
            "engagementPanelSectionListRenderer", {}
        )
        target_id = engagement_panel.get("targetId")
        engagement_panels[target_id] = engagement_panel
    return engagement_panels


def parse_sort_filter_sub_menu(menu):
    # Index the sort menu items ("Top comments", "Newest first") by title
    menu_items = menu["sortFilterSubMenuRenderer"]["subMenuItems"]
    return {menu_item["title"]: menu_item for menu_item in menu_items}


def extract_comments(next_continuation_data):
    # The last continuation item is not a comment thread, so it is dropped
    return [
        continuation_item["commentThreadRenderer"]
        for continuation_item in next_continuation_data["onResponseReceivedEndpoints"][
            1
        ]["reloadContinuationItemsCommand"]["continuationItems"][:-1]
    ]


# YouTube Web Client
client = InnerTube("WEB", "2.20240105.01.00")

# ShortCircuit - Dell just DESTROYED the Surface Pro! - Dell XPS 13 2-in-1
video = client.next("BV1O7RR-VoA")

engagement_panels = extract_engagement_panels(video)
comments = engagement_panels[ENGAGEMENT_SECTION_COMMENTS]
comments_header = comments["header"]["engagementPanelTitleHeaderRenderer"]
comments_title = parse_text(comments_header["title"])
comments_context = parse_text(comments_header["contextualInfo"])
comments_menu_items = parse_sort_filter_sub_menu(comments_header["menu"])
comments_top = comments_menu_items[COMMENTS_TOP]
comments_top_continuation = comments_top["serviceEndpoint"]["continuationCommand"][
    "token"
]

print(f"{comments_title} ({comments_context})...")
print()

comments_continuation = client.next(continuation=comments_top_continuation)
comments = extract_comments(comments_continuation)

for comment in comments:
    comment_renderer = comment["comment"]["commentRenderer"]
    comment_author = comment_renderer["authorText"]["simpleText"]
    comment_content = parse_text(comment_renderer["contentText"])
    print(f"[{comment_author}]")
    print(comment_content)
    print()
$ python app.py
Comments (1.7K)...
[@ViXoZuDo]
I would 100% prefer the headphone jack over that camera...
[@ouilsen2]
As a Surface Pro user I have one observation...
...
(I'll add this to the examples/ directory in case it helps anyone else)
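One caveat on the script above: extract_comments hard-codes index 1 of onResponseReceivedEndpoints and drops the last continuation item, so it depends on the exact response layout. A slightly more defensive sketch, assuming only the key names already used above, scans every endpoint and keeps whichever items carry a commentThreadRenderer. The name extract_comment_threads and the sample response are made up for illustration (the non-comment keys in the sample are placeholders, not real renderer names).

```python
def extract_comment_threads(next_continuation_data):
    """Collect every commentThreadRenderer found in a /next continuation response."""
    threads = []
    for endpoint in next_continuation_data.get("onResponseReceivedEndpoints", []):
        # Each endpoint wraps a command (e.g. reloadContinuationItemsCommand);
        # look inside any dict-valued entry that carries "continuationItems".
        for command in endpoint.values():
            if not isinstance(command, dict):
                continue
            for item in command.get("continuationItems", []):
                thread = item.get("commentThreadRenderer")
                if thread is not None:
                    threads.append(thread)
    return threads


# Hand-written stand-in for a response (placeholder data, not real API output)
sample = {
    "onResponseReceivedEndpoints": [
        {
            "clickTrackingParams": "placeholder",
            "reloadContinuationItemsCommand": {
                "continuationItems": [{"placeholderHeaderRenderer": {}}]
            },
        },
        {
            "reloadContinuationItemsCommand": {
                "continuationItems": [
                    {"commentThreadRenderer": {"comment": {"commentRenderer": {}}}},
                    {"commentThreadRenderer": {"comment": {"commentRenderer": {}}}},
                    {"continuationItemRenderer": {}},
                ]
            }
        },
    ]
}

print(len(extract_comment_threads(sample)))
```

This avoids an IndexError or KeyError if the endpoints are reordered, at the cost of silently returning an empty list when the layout changes more drastically.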
I'll have a fiddle with highlighting a comment now to see if I can figure out what's going on there.
It looks like highlighting a comment sends off a request to the /next endpoint with some params and the videoId. I'll see if I can whip up a quick PoC for this now.
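For reference, a rough sketch of what such a request body might look like. The context.client layout is an assumption about the usual InnerTube payload shape rather than something confirmed in this thread, and build_next_payload is a hypothetical helper, not part of the innertube library.

```python
import json


def build_next_payload(
    video_id, params=None, client_name="WEB", client_version="2.20240105.01.00"
):
    """Build a JSON-serialisable body for a /next request (assumed layout)."""
    payload = {
        # Assumption: client details are nested under context.client
        "context": {
            "client": {"clientName": client_name, "clientVersion": client_version}
        },
        "videoId": video_id,
    }
    # params is only included when a comment is being highlighted
    if params is not None:
        payload["params"] = params
    return payload


print(json.dumps(build_next_payload("BV1O7RR-VoA", params="PLACEHOLDER"), indent=2))
```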
I think I've figured out what was happening with highlighting a comment not working. The continuation tokens for "top" and "newest" that you can extract from engagementPanels aren't influenced by the params passed to the /next endpoint; however, the continuation token for the comment-item-section does change.
The below example ignores the engagementPanels entirely and instead uses the continuation token for the comments item section:
from innertube import InnerTube

# YouTube Web Client
CLIENT = InnerTube("WEB", "2.20240105.01.00")


def parse_text(text):
    # Join the "runs" of a text object into a single string
    return "".join(run["text"] for run in text["runs"])


def flatten(items):
    # Group single-key renderer dicts by their key
    flat_items = {}
    for item in items:
        key = next(iter(item))
        val = item[key]
        flat_items.setdefault(key, []).append(val)
    return flat_items


def flatten_item_sections(item_sections):
    # Index item sections by their sectionIdentifier
    return {
        item_section["sectionIdentifier"]: item_section
        for item_section in item_sections
    }


def extract_comments(next_continuation_data):
    # The last continuation item is not a comment thread, so it is dropped
    return [
        continuation_item["commentThreadRenderer"]
        for continuation_item in next_continuation_data["onResponseReceivedEndpoints"][
            1
        ]["reloadContinuationItemsCommand"]["continuationItems"][:-1]
    ]


def extract_comments_continuation_token(next_data):
    contents = flatten(
        next_data["contents"]["twoColumnWatchNextResults"]["results"]["results"][
            "contents"
        ]
    )
    item_sections = flatten_item_sections(contents["itemSectionRenderer"])
    comment_item_section_content = item_sections["comment-item-section"]["contents"][0]
    comments_continuation_token = comment_item_section_content[
        "continuationItemRenderer"
    ]["continuationEndpoint"]["continuationCommand"]["token"]
    return comments_continuation_token


def get_comments(video_id, params=None):
    video = CLIENT.next(video_id, params=params)
    continuation_token = extract_comments_continuation_token(video)
    comments_continuation = CLIENT.next(continuation=continuation_token)
    return extract_comments(comments_continuation)


def print_comment(comment):
    comment_renderer = comment["comment"]["commentRenderer"]
    comment_author = comment_renderer["authorText"]["simpleText"]
    comment_content = parse_text(comment_renderer["contentText"])
    print(f"[{comment_author}]")
    print(comment_content)
    print()


video_id = "BV1O7RR-VoA"

# Get comments for a given video
comments = get_comments(video_id)

# Select a comment to highlight (in this case the 3rd one)
comment = comments[2]

# Print the comment we're going to highlight
print("### Highlighting Comment: ###")
print()
print_comment(comment)
print("---------------------")
print()

# Extract the 'params' to highlight this comment
params = comment["comment"]["commentRenderer"]["publishedTimeText"]["runs"][0][
    "navigationEndpoint"
]["watchEndpoint"]["params"]

# Get comments, but highlighting the selected comment
highlighted_comments = get_comments(video_id, params=params)

print("### Comments: ###")
print()
for comment in highlighted_comments:
    print_comment(comment)
$ python app.py
### Highlighting Comment: ###
[@alphacompton]
The built in mic on the 2-1 is exceptional and the camera is excellent from your video sample. Look like a better buy especially if it's cheaper than the Surface pro.
---------------------
### Comments: ###
[@alphacompton]
The built in mic on the 2-1 is exceptional and the camera is excellent from your video sample. Look like a better buy especially if it's cheaper than the Surface pro.
[@ouilsen2]
As a Surface Pro user I have one observation....
...
Hope that helps!
Please let me know if you have any further questions, or if this answers your query.
Best, Tom
Hi, thanks for the detailed reply.
The idea behind the highlighting was to store a reference (such as the comment ID) to it in a database and come back to it later. One such use case would be a system that checks for the existence of a membership badge on a user's message monthly. That's why it would have been ideal to have a way to programmatically jump straight to the comment in 1 request like in the browser (on the initial lookup, not just subsequent ones).
Unfortunately, from your response it seems "highlighting" a comment internally is done with the comment's watchEndpoint params, so the initial request for the comment will require scraping them all until the target comment is found by checking for the comment ID, and then storing the params instead of the comment ID for future immediate lookup.
Would this work, or do you suspect the params of comments change often?
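The lookup described above could be sketched roughly as follows. It assumes each commentRenderer exposes a commentId field, which the scripts earlier in the thread never read, so treat that key as an assumption; the path to the params is the one used in the highlighting example. find_highlight_params and the sample data are hypothetical.

```python
def find_highlight_params(comment_threads, comment_id):
    """Return the watchEndpoint params of the comment with the given ID, or None."""
    for thread in comment_threads:
        renderer = thread.get("comment", {}).get("commentRenderer", {})
        # Assumption: the comment ID is stored under "commentId"
        if renderer.get("commentId") != comment_id:
            continue
        for run in renderer.get("publishedTimeText", {}).get("runs", []):
            endpoint = run.get("navigationEndpoint", {}).get("watchEndpoint", {})
            if "params" in endpoint:
                return endpoint["params"]
    return None


def make_thread(comment_id, params):
    # Build a placeholder comment thread shaped like the ones used above
    return {
        "comment": {
            "commentRenderer": {
                "commentId": comment_id,
                "publishedTimeText": {
                    "runs": [
                        {"navigationEndpoint": {"watchEndpoint": {"params": params}}}
                    ]
                },
            }
        }
    }


# Placeholder IDs and params, not real YouTube values
threads = [make_thread("id-1", "params-1"), make_thread("id-2", "params-2")]
print(find_highlight_params(threads, "id-2"))
```

In practice the list of threads would come from paging through the comments until the target ID turns up, after which the returned params could be stored alongside the comment ID.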
Thanks again.
Hi @EtorixDev, apologies for the slow turnaround on a reply to your last comment. I believe the params field contains base64-encoded protobuf data (potentially also URL-encoded). You should be able to decode the contents of the params using a tool such as https://protobuf-decoder.netlify.app/. It is possible that the protobuf structure contains the comment ID, and that all other fields are static. If this is the case, you should be able to generate the correct params value knowing only the comment ID.
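A minimal sketch of that decoding step using only the standard library, in case a local alternative to the web tool is handy. decode_params and parse_fields are hypothetical helper names, the parser only walks top-level fields, and the sample string is built locally for illustration rather than taken from YouTube. Which field (if any) holds the comment ID is not something this thread establishes.

```python
import base64
from urllib.parse import quote, unquote


def decode_params(params):
    """URL-decode, then base64-decode (URL-safe alphabet) a params string."""
    raw = unquote(params)
    raw += "=" * (-len(raw) % 4)  # restore padding if it was stripped
    return base64.urlsafe_b64decode(raw)


def read_varint(data, pos):
    """Read a protobuf varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_fields(data):
    """Parse top-level protobuf fields into (field_number, wire_type, value)."""
    fields = []
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = data[pos : pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (bytes, string, nested message)
            length, pos = read_varint(data, pos)
            value, pos = data[pos : pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = data[pos : pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Locally built sample: field 1 = varint 150, field 2 = bytes b"abc"
sample_message = b"\x08\x96\x01\x12\x03abc"
sample_params = quote(base64.urlsafe_b64encode(sample_message).decode())

for field in parse_fields(decode_params(sample_params)):
    print(field)
```

Length-delimited values are left as raw bytes, since without the schema there is no way to tell a string from a nested message; calling parse_fields again on such a value is one way to probe for nesting.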
Unfortunately, when I went to test this using the examples/list-video-comments-highlighted.py example script I wrote a while back, it seems YouTube has changed their comments API around again. If I get some spare time I'll give the API another poke, but I hope this comment has at least given you a bit of a steer :slightly_smiling_face:
Hello, I notice in #17 it's stated that getting comments is not part of the InnerTube API. I'm not sure if things have changed or if I am misunderstanding what counts as part of the InnerTube API, but by doing the following I have managed to get the comments:
https://www.youtube.com/youtubei/v1/next?key={key}
with the specified video ID in the data.
Something I've yet to figure out is how to get a highlighted comment to appear at the top of the JSON list. If you click on a YouTube comment's date, it will open a link with a "&lc=" param that has the comment's ID, and the comment will then appear at the top of the comments as "Highlighted".
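As a small illustration of the "&lc=" link format described above, the comment ID can be pulled out of such a URL with the standard library. The URL below uses a placeholder comment ID, and extract_highlighted_comment_id is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlparse


def extract_highlighted_comment_id(url):
    """Return the value of the lc query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("lc")
    return values[0] if values else None


# Placeholder comment ID, not a real one
url = "https://www.youtube.com/watch?v=BV1O7RR-VoA&lc=EXAMPLE_COMMENT_ID"
print(extract_highlighted_comment_id(url))
```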
If I copy the continuation token for the second request from the browser's dev tools inspector while loading the highlighted comment link, then the second next request correctly returns the highlighted comment at the top of the JSON list.
However, if I use the continuation token retrieved programmatically from the first next request, it always returns the comments without the highlighted comment at the top. So it seems the highlighted comment is tied to the continuation token, which appears to be generated outside the scope of the next endpoint, unless I've simply not found the correct way yet.