Open Benjamin-Loison opened 3 months ago
curl -s 'https://www.youtube.com/channel/UC2ChxHEZCmK5Nj4JB649iKA/community?lb=Ugkx4zW_Z6QeKVSnRzPmnF5pAdDIJ4tBikSo' > a
getJSONPathFromKey a | grep 'a9qxjTfH'
208 /contents/twoColumnBrowseResultsRenderer/tabs/0/tabRenderer/content/sectionListRenderer/contents/0/itemSectionRenderer/contents/0/backstagePostThreadRenderer/post/backstagePostRenderer/contentText/runs/0/text a9qxjTfH
48 /microformat/microformatDataRenderer/description a9qxjTfH
jq .microformat a
In theory, the post date can be lower-bounded by the channel creation date shown in the About tab.
I believe that days_since_epoch=19939 is related to the channel thumbnail, but let us check.
https://www.epochconverter.com/seconds-days-since-y0#d1970
So it corresponds to today...
import datetime
print(datetime.datetime(1970, 1, 1, 0, 0) + datetime.timedelta(19939))
jq '.contents.twoColumnBrowseResultsRenderer.tabs[0].tabRenderer.content.sectionListRenderer.contents[0].itemSectionRenderer.contents[0]' a
echo -n 'CBMQ9LwCIhMI5riXoJvbhwMV9sJJBx3MEy4d' | base64 -d | protoc --decode_raw
date -d @1722770237
Sun Aug 4 01:17:17 PM CEST 2024
is today.
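The same conversion can be sketched in Python, which also ties the timestamp back to days_since_epoch=19939. This is only an illustration: `CEST` and `epoch_to_cest` are names introduced here, and the fixed UTC+2 offset is an assumption taken from the `date` output above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CEST is UTC+2, as shown by the `date` output above.
CEST = timezone(timedelta(hours=2), 'CEST')

def epoch_to_cest(seconds):
    """Convert a Unix timestamp in seconds to an aware datetime in CEST."""
    return datetime.fromtimestamp(seconds, CEST)

decoded = epoch_to_cest(1722770237)
print(decoded.isoformat())

# Whole days elapsed since 1970-01-01 (UTC) for the same timestamp.
days_since_epoch = 1722770237 // 86400
print(days_since_epoch)
```

Both values point to the same day, so neither says anything about when the community post was created.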
echo -n 'CBgQr9gCIhMI5riXoJvbhwMV9sJJBx3MEy4d' | base64 -d | protoc --decode_raw
so the same timestamp.
echo -n 'CBcQsNgCIhMI5riXoJvbhwMV9sJJBx3MEy4d' | base64 -d | protoc --decode_raw
same timestamp.
echo -n 'Egljb21tdW5pdHnKASeyASRVZ2t4NHpXX1o2UWVLVlNuUnpQbW5GNXBBZERJSjR0QmlrU2_qAgQQARgB' | base64 -d
community�'�$Ugkx4zW_Z6QeKVSnRzPmnF5pAdDIJ4tBikSbase64: invalid input
This does not seem interesting: even with one or two = appended, the error is not solved, hence no Protobuf decoding seems possible this way.
import blackboxprotobuf
import base64
import json
data = base64.b64decode('Egljb21tdW5pdHnKASeyASRVZ2t4NHpXX1o2UWVLVlNuUnpQbW5GNXBBZERJSjR0QmlrU2_qAgQQARgB', altchars = '-_')
message, typedef = blackboxprotobuf.decode_message(data)
print(json.dumps(message, indent = 4))
Source: menmob/innertube-documentation/issues/1#issuecomment-1688923923
echo -n 'Egljb21tdW5pdHnKASeyASRVZ2t4NHpXX1o2UWVLVlNuUnpQbW5GNXBBZERJSjR0QmlrU2_qAgQQARgB' | base64url -d | protoc --decode_raw
echo -n 'CBYQmE0iEwjmuJegm9uHAxX2wkkHHcwTLh0=' | base64 -d | protoc --decode_raw
same timestamp.
echo -n 'CBUQmE0iEwjmuJegm9uHAxX2wkkHHcwTLh0=' | base64 -d | protoc --decode_raw
same timestamp.
echo -n 'CBQQtXUiEwjmuJegm9uHAxX2wkkHHcwTLh0=' | base64 -d | protoc --decode_raw
same timestamp.
echo -n 'qgcoCAMSJFVna3g0eldfWjZRZUtWU25SelBtbkY1cEFkRElKNHRCaWtTbw==' | base64 -d | protoc --decode_raw
not interesting.
echo -n 'CBIQzL8CGAAiEwjmuJegm9uHAxX2wkkHHcwTLh0=' | base64 -d | protoc --decode_raw
same timestamp.
So all tracking parameters seem uninteresting.
https://www.youtube.com/youtubei/v1/browse continuation:
echo -n '4qmFsgL_ARIYVUMyQ2h4SEVaQ21LNU5qNEpCNjQ5aUtBGuIBRWdsamIyMXRkVzVwZEhtNEFRREtBU2V5QVNSVloydDROSHBYWDFvMlVXVkxWbE51VW5wUWJXNUdOWEJCWkVSSlNqUjBRbWxyVTJfcUFnUVFBUmdCa2dNQXFnTmJJa2N3QU5nQkFlb0JKRlZuYTNnMGVsZGZXalpSWlV0V1UyNVNlbEJ0YmtZMWNFRmtSRWxLTkhSQ2FXdFRiX0lCR0ZWRE1rTm9lRWhGV2tOdFN6Vk9halJLUWpZME9XbExRVUlRWTI5dGJXVnVkSE10YzJWamRHbHZidklHQkFvQ1NnQSUzRA==' | base64url -d | protoc --decode_raw
echo -n 'Egljb21tdW5pdHm4AQDKASeyASRVZ2t4NHpXX1o2UWVLVlNuUnpQbW5GNXBBZERJSjR0QmlrU2_qAgQQARgBkgMAqgNbIkcwANgBAeoBJFVna3g0eldfWjZRZUtWU25SelBtbkY1cEFkRElKNHRCaWtTb_IBGFVDMkNoeEhFWkNtSzVOajRKQjY0OWlLQUIQY29tbWVudHMtc2VjdGlvbvIGBAoCSgA=' | base64url -d | protoc --decode_raw
Concerning the response:
jq keys b
[
"frameworkUpdates",
"onResponseReceivedEndpoints",
"responseContext",
"trackingParams"
]
I checked frameworkUpdates and onResponseReceivedEndpoints.
/frameworkUpdates/entityBatchUpdate/timestamp:
date -d @1722772289
Sun Aug 4 01:51:29 PM CEST 2024
echo -n '4qmFsgK1ARIYVUMyQ2h4SEVaQ21LNU5qNEpCNjQ5aUtBGpgBRWdsamIyMXRkVzVwZEhtcUExOGlTVEFBZUFMSUFRRHFBU1JWWjJ0NE5IcFhYMW8yVVdWTFZsTnVVbnBRYlc1R05YQkJaRVJKU2pSMFFtbHJVMl95QVJoVlF6SkRhSGhJUlZwRGJVczFUbW8wU2tJMk5EbHBTMEU0QVVJUVkyOXRiV1Z1ZEhNdGMyVmpkR2x2YmclM0QlM0Q=' | base64url -d | protoc --decode_raw
echo -n 'Egljb21tdW5pdHmqA18iSTAAeALIAQDqASRVZ2t4NHpXX1o2UWVLVlNuUnpQbW5GNXBBZERJSjR0QmlrU2_yARhVQzJDaHhIRVpDbUs1Tmo0SkI2NDlpS0E4AUIQY29tbWVudHMtc2VjdGlvbg==' | base64url -d | protoc --decode_raw
echo -n '4qmFsgK1ARIYVUMyQ2h4SEVaQ21LNU5qNEpCNjQ5aUtBGpgBRWdsamIyMXRkVzVwZEhtcUExOGlTVEFCZUFMSUFRRHFBU1JWWjJ0NE5IcFhYMW8yVVdWTFZsTnVVbnBRYlc1R05YQkJaRVJKU2pSMFFtbHJVMl95QVJoVlF6SkRhSGhJUlZwRGJVczFUbW8wU2tJMk5EbHBTMEU0QVVJUVkyOXRiV1Z1ZEhNdGMyVmpkR2x2YmclM0QlM0Q=' | base64url -d | protoc --decode_raw
echo -n 'Egljb21tdW5pdHmqA18iSTABeALIAQDqASRVZ2t4NHpXX1o2UWVLVlNuUnpQbW5GNXBBZERJSjR0QmlrU2_yARhVQzJDaHhIRVpDbUs1Tmo0SkI2NDlpS0E4AUIQY29tbWVudHMtc2VjdGlvbg==' | base64url -d | protoc --decode_raw
Next page of community posts:
curl 'https://www.youtube.com/youtubei/v1/browse?prettyPrint=false' --compressed -X POST -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' -H 'Accept-Encoding: gzip, deflate, br, zstd' -H 'Referer: https://www.youtube.com/@Benjamin-xq/community' -H 'Content-Type: application/json' -H 'X-Goog-EOM-Visitor-Id: CgttRHpBU09DUHVHQSi_1r21BjIiCgJGUhIcEhgSFhMLFBUWFwwYGRobHB0eHw4PIBAREiEgDw%3D%3D' -H 'X-Youtube-Bootstrap-Logged-In: false' -H 'X-Youtube-Client-Name: 1' -H 'X-Youtube-Client-Version: 2.20240731.04.00' -H 'Origin: https://www.youtube.com' -H 'DNT: 1' -H 'Sec-GPC: 1' -H 'Connection: keep-alive' -H 'Cookie: SOCS=CAESNQgDEitib3FfaWRlbnRpdHlmcm9udGVuZHVpc2VydmVyXzIwMjQwNzMwLjA1X3AwGgJlbiACGgYIgIm7tQY; YSC=mCkhd8P4OAM; __Secure-YEC=CgttRHpBU09DUHVHQSi_1r21BjIiCgJGUhIcEhgSFhMLFBUWFwwYGRobHB0eHw4PIBAREiEgDw%3D%3D; VISITOR_PRIVACY_METADATA=CgJGUhIcEhgSFhMLFBUWFwwYGRobHB0eHw4PIBAREiEgDw%3D%3D; PREF=f4=4000000&f6=40000000&tz=Europe.Paris' -H 'Sec-Fetch-Dest: empty' -H 'Sec-Fetch-Mode: same-origin' -H 'Sec-Fetch-Site: same-origin' -H 'Priority: u=4' -H 'TE: trailers' --data-raw '{"context":{"client":{"hl":"en","gl":"FR","remoteHost":"2a01:cb04:609:5d00:188:4bc8:4e3d:699c","deviceMake":"","deviceModel":"","visitorData":"CgttRHpBU09DUHVHQSi_1r21BjIiCgJGUhIcEhgSFhMLFBUWFwwYGRobHB0eHw4PIBAREiEgDw%3D%3D","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 
Firefox/128.0,gzip(gfe)","clientName":"WEB","clientVersion":"2.20240731.04.00","osName":"X11","osVersion":"","originalUrl":"https://www.youtube.com/channel/UC2ChxHEZCmK5Nj4JB649iKA/community?lb=Ugkx4zW_Z6QeKVSnRzPmnF5pAdDIJ4tBikSo","screenPixelDensity":2,"platform":"DESKTOP","clientFormFactor":"UNKNOWN_FORM_FACTOR","configInfo":{"appInstallData":"CL_WvbUGENqgsQUQ9quwBRDvzbAFEK-hsQUQqLewBRCx3LAFEMn3rwUQpaWxBRDd6P4SEKXC_hIQt--vBRCDuf8SEJCSsQUQ1KGvBRCmk7EFEKrYsAUQlpWwBRDrmbEFEN_1sAUQyeawBRDviLEFEO6irwUQ1YiwBRC2sf8SEJKdsQUQ6sOvBRDBpbEFEMr5sAUQsO6wBRC9tq4FENuvrwUQ0PqwBRCinbEFEMefsQUQsKqxBRD0q7AFENCNsAUQ1-mvBRComrAFEKaasAUQj8SwBRDwnLAFEJ7QsAUQlqOxBRCI468FEMSSsQUQ_IWwBRC9mbAFEKiTsQUQmZixBRC36v4SEOPRsAUQ65OuBRDW3bAFEInorgUQ4bz_EhD_iLEFEI7asAUQooGwBRCU_rAFENqlsQUQydewBRDM364FEOLUrgUQvoqwBRCokrEFENnJrwUQzdewBRCNzLAFEI2UsQUQnaawBRDr6P4SELn4sAUQ0-GvBRDb_rciEJrwrwUQi8-wBRD68LAFEJSJsQUQ4eywBRCIh7AFEOX0sAUQppKxBRCPlLEFEJuosQUQ-oKxBSogQ0FNU0VoVUpvTDJ3RE5Ia0J2UHQ4UXVCOXdFZEJ3PT0%3D"},"screenDensityFloat":2,"userInterfaceTheme":"USER_INTERFACE_THEME_DARK","timeZone":"Europe/Paris","browserName":"Firefox","browserVersion":"128.0","acceptHeader":"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/png,image/svg+xml,*/*;q=0.8","deviceExperimentId":"ChxOek01T1RJMU1EWXpNalV6TXpZNE56WXhOZz09EL_WvbUGGL_WvbUG","screenWidthPoints":1128,"screenHeightPoints":315,"utcOffsetMinutes":120,"mainAppWebInfo":{"graftUrl":"https://www.youtube.com/@Benjamin-xq/community","pwaInstallabilityStatus":"PWA_INSTALLABILITY_STATUS_UNKNOWN","webDisplayMode":"WEB_DISPLAY_MODE_BROWSER","isWebNativeShareAvailable":false}},"user":{"lockedSafetyMode":false},"request":{"useSsl":true,"internalExperimentFlags":[],"consistencyTokenJars":[]},"clickTracking":{"clickTrackingParams":"CBoQuy8YACITCK2muby224cDFdEZ8QUdTOw0ew=="},"adSignalsInfo":{"params":[{"key":"dt","value":"1722772287758"},{"key":"flash","value":"0"},{"key":"frm","value":"0"},{"key":"u_tz","value":"120"},{"key":"u_his","value":"4"
},{"key":"u_h","value":"752"},{"key":"u_w","value":"1128"},{"key":"u_ah","value":"712"},{"key":"u_aw","value":"1128"},{"key":"u_cd","value":"24"},{"key":"bc","value":"31"},{"key":"bih","value":"315"},{"key":"biw","value":"1128"},{"key":"brdim","value":"0,0,0,0,1128,0,1128,684,1128,315"},{"key":"vis","value":"1"},{"key":"wgl","value":"true"},{"key":"ca_type","value":"image"}]}},"continuation":"4qmFsgKNARIYVUMyQ2h4SEVaQ21LNU5qNEpCNjQ5aUtBGlhFZ2xqYjIxdGRXNXBkSG1xQXlnS0pGRXlhRU5TUms1WVRsWnNUMDFJYUU5VmFrSlhWVWRHZEdWSVRtbE5WVnBGVWxWR1FpZ0s4Z1lFQ2dKS0FBJTNEJTNEmgIWYmFja3N0YWdlLWl0ZW0tc2VjdGlvbg%3D%3D"}'
echo -n '4qmFsgKNARIYVUMyQ2h4SEVaQ21LNU5qNEpCNjQ5aUtBGlhFZ2xqYjIxdGRXNXBkSG1xQXlnS0pGRXlhRU5TUms1WVRsWnNUMDFJYUU5VmFrSlhWVWRHZEdWSVRtbE5WVnBGVWxWR1FpZ0s4Z1lFQ2dKS0FBJTNEJTNEmgIWYmFja3N0YWdlLWl0ZW0tc2VjdGlvbg==' | base64url -d | protoc --decode_raw
echo -n 'Egljb21tdW5pdHmqAygKJFEyaENSRk5YTlZsT01IaE9VakJXVUdGdGVITmlNVVpFUlVGQigK8gYECgJKAA==' | base64url -d | protoc --decode_raw
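The continuation sent in the curl request ends with `%3D%3D`, whereas the command above uses `==`. A minimal Python sketch of that first unwrapping step follows; `decode_continuation` is a hypothetical helper name, and only the outer layer is decoded here (no Protobuf parsing).

```python
import base64
from urllib.parse import unquote

def decode_continuation(token):
    """Percent-decode the token, then URL-safe base64-decode it.

    Missing '=' padding is added so that unpadded tokens also decode.
    """
    token = unquote(token)
    return base64.urlsafe_b64decode(token + '=' * (-len(token) % 4))

# Continuation value copied from the curl request above.
token = '4qmFsgKNARIYVUMyQ2h4SEVaQ21LNU5qNEpCNjQ5aUtBGlhFZ2xqYjIxdGRXNXBkSG1xQXlnS0pGRXlhRU5TUms1WVRsWnNUMDFJYUU5VmFrSlhWVWRHZEdWSVRtbE5WVnBGVWxWR1FpZ0s4Z1lFQ2dKS0FBJTNEJTNEmgIWYmFja3N0YWdlLWl0ZW0tc2VjdGlvbg%3D%3D'
raw = decode_continuation(token)

# The raw Protobuf bytes embed the channel ID and the section name as plain ASCII.
print(b'UC2ChxHEZCmK5Nj4JB649iKA' in raw)
print(b'backstage-item-section' in raw)
```

The inner field is itself a percent-encoded base64 string, so the same helper can be applied again once that field has been extracted.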
echo -n 'Q2hCRFNXNVlOMHhOUjBWUGFteHNiMUZERUFB' | base64url -d
ChBDSW5YN0xNR0VPamxsb1FDEAA
echo -n 'Q2hCRFNXNVlOMHhOUjBWUGFteHNiMUZERUFB' | base64url -d | base64 -d
CInX7LMGEOjlloQCbase64: invalid input
echo -n 'ChBDSW5YN0xNR0VPamxsb1FDEAA=' | base64 -d | protoc --decode_raw
1: "CInX7LMGEOjlloQC"
2: 0
echo -n 'CInX7LMGEOjlloQC' | base64 -d | protoc --decode_raw
1: 1719348105
2: 545633000
date -d @1719348105
Tue Jun 25 10:41:45 PM CEST 2024
looks interesting.
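The chain of decodings above can be reproduced without protoc. The following is a minimal sketch, not a full Protobuf decoder: `b64url_decode`, `read_varint` and `decode_fields` are names introduced here, and only wire types 0 (varint) and 2 (length-delimited) are handled, which is all this token needs.

```python
import base64

def b64url_decode(text):
    """URL-safe base64 decoding that adds any missing '=' padding."""
    if isinstance(text, bytes):
        text = text.decode('ascii')
    return base64.urlsafe_b64decode(text + '=' * (-len(text) % 4))

def read_varint(data, pos):
    """Read one Protobuf varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def decode_fields(data):
    """Parse top-level fields of wire type 0 (varint) or 2 (length-delimited)."""
    fields = {}
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 7
        if wire_type == 0:
            value, pos = read_varint(data, pos)
        elif wire_type == 2:
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f'unsupported wire type {wire_type}')
        fields[field_number] = value
    return fields

# base64 -> base64 -> Protobuf -> base64 -> Protobuf, as in the commands above.
outer = b64url_decode('Q2hCRFNXNVlOMHhOUjBWUGFteHNiMUZERUFB')
middle = decode_fields(b64url_decode(outer))
inner = decode_fields(b64url_decode(middle[1]))
print(inner)
```

Field 1 of the innermost message is the Unix timestamp passed to `date -d` above.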
The question is: is the same timestamp always returned, and where does it come from? The same timestamp is always returned.
getJSONPathFromKey community.html token
202 /contents/twoColumnBrowseResultsRenderer/tabs/2/tabRenderer/content/sectionListRenderer/contents/0/itemSectionRenderer/contents/10/continuationItemRenderer/continuationEndpoint/continuationCommand/token 4qmFsgKVARIYVUMyQ2h4SEVaQ21LNU5qNEpCNjQ5aUtBGmBFZ2xqYjIxdGRXNXBkSG00QVFDU0F3Q3FBeWdLSkZFeWFFTlNSazVZVGxac1QwMUlhRTlWYWtKWFZVZEdkR1ZJVG1sTlZWcEZVbFZHUWlnSzhnWUpDZ2RLQUtJQkFnZ0KaAhZiYWNrc3RhZ2UtaXRlbS1zZWN0aW9u
369 /header/pageHeaderRenderer/content/pageHeaderViewModel/description/descriptionPreviewViewModel/rendererContext/commandContext/onTap/innertubeCommand/showEngagementPanelEndpoint/engagementPanel/engagementPanelSectionListRenderer/content/sectionListRenderer/contents/0/itemSectionRenderer/contents/0/continuationItemRenderer/continuationEndpoint/continuationCommand/token 4qmFsgJgEhhVQzJDaHhIRVpDbUs1Tmo0SkI2NDlpS0EaRDhnWXJHaW1hQVNZS0pEWTRORE5qTWpJd0xUQXdNREF0TW1KbE5TMDROV1JsTFdZME1ETXdORE5rTmpVMU9BJTNEJTNE
/contents/twoColumnBrowseResultsRenderer/tabs/2/tabRenderer/content/sectionListRenderer/contents/0/itemSectionRenderer/contents/10/continuationItemRenderer/continuationEndpoint/continuationCommand/token
Can we somehow use a dichotomy (binary search) to request results after a given datetime?
https://www.youtube.com/post/Ugwu1_hVsgkc3x-6V9B4AaABCQ
curl https://www.youtube.com/youtubei/v1/browse -H 'Content-Type: application/json' --data-raw '{"context": {"client": {"clientName": "WEB", "clientVersion": "2.20240731.04.00"}}, "continuation": "4qmFsgKNARIYVUNReEpzQWxxbUJQQWJSXzBzeURpOW1nGlhFZ2xqYjIxdGRXNXBkSG1xQXlnS0pGRXlZelZTUmxKelZXMXNhbEo2YkVOVmFrSlhWRzFHY2sxWVVtRk5iVTVTVVZWRlBTZ284Z1lFQ2dKS0FBJTNEJTNEmgIWYmFja3N0YWdlLWl0ZW0tc2VjdGlvbg%3D%3D"}'
import requests
import blackboxprotobuf
import base64

def getBase64Protobuf(message, typedef):
    # Encode the message as Protobuf, then as a base64 ASCII string.
    data = blackboxprotobuf.encode_message(message, typedef)
    return base64.b64encode(data).decode('ascii')

# Innermost message: field 1 holds the Unix timestamp.
message = {
    '1': 1611247060,
}
typedef = {
    '1': {
        'type': 'int'
    },
}
one = getBase64Protobuf(message, typedef)

# Wrap the base64 string in another message, then base64-encode the result again.
message = {
    '1': one,
}
typedef = {
    '1': {
        'type': 'string'
    },
}
one = base64.b64encode(getBase64Protobuf(message, typedef).encode('ascii'))

message = {
    '2': 'community',
    '53': {
        '1': one,
    },
}
typedef = {
    '2': {
        'type': 'string'
    },
    '53': {
        'type': 'message',
        'message_typedef': {
            '1': {
                'type': 'string'
            },
        },
    },
}
three = getBase64Protobuf(message, typedef)

# Outermost message: channel ID and the encoded parameters.
message = {
    '80226972': {
        '2': 'UCQxJsAlqmBPAbR_0syDi9mg',
        '3': three,
    }
}
typedef = {
    '80226972': {
        'type': 'message',
        'message_typedef': {
            '2': {
                'type': 'string'
            },
            '3': {
                'type': 'string'
            },
        },
        'field_order': [
            '2',
            '3',
        ]
    }
}
continuation = getBase64Protobuf(message, typedef)

json_data = {
    'context': {
        'client': {
            'clientName': 'WEB',
            'clientVersion': '2.20240731.04.00',
        },
    },
    'continuation': continuation,
}
response = requests.post('https://www.youtube.com/youtubei/v1/browse', json=json_data)
print('There is a lot going on behind the shiny veil of innovation and flagship phones.' in response.text)
jq keys browse.json
[
"metadata",
"microformat",
"onResponseReceivedEndpoints",
"responseContext",
"trackingParams"
]
echo -n 'Egljb21tdW5pdHnKASeyASRVZ2t4ajZZUE5qcUFkdTFnOHgyVFBSYnk2dTc2S08tTkJwQ27qAgQQARgB' | base64 -d | protoc --decode_raw
echo -n 'qgcoCAMSJFVna3hqNllQTmpxQWR1MWc4eDJUUFJieTZ1NzZLTy1OQnBDbg==' | base64 -d | protoc --decode_raw
echo -n 'Egljb21tdW5pdHnKASeyASRVZ2t4aXFvWUJ1aE5KTG9HN1Y0RDlDWWpjMC1KVGYxZE1SZVHqAgQQARgB' | base64 -d | protoc --decode_raw
echo -n 'qgcoCAMSJFVna3hpcW9ZQnVoTkpMb0c3VjREOUNZamMwLUpUZjFkTVJlUQ==' | base64 -d | protoc --decode_raw
How many entries have actually been returned?
I verified metadata, microformat and onResponseReceivedEndpoints.
jq keys community.json
[
"contents",
"header",
"metadata",
"microformat",
"responseContext",
"topbar",
"trackingParams"
]
echo -n 'EgdzdHJlYW1z8gYECgJ6AA==' | base64 -d | protoc --decode_raw
echo -n 'Egljb21tdW5pdHnyBgQKAkoA' | base64 -d | protoc --decode_raw
echo -n 'Egljb21tdW5pdHnyBgkKB0oAogECCAE=' | base64 -d | protoc --decode_raw
echo -n 'Egljb21tdW5pdHnKASeyASRVZ2t4UzMwOHprX1ZsRDJxUU9GcmxDR0xKZDVZYnRhWUJiOUbqAgQQARgB' | base64 -d | protoc --decode_raw
echo -n 'qgcoCAMSJFVna3hTMzA4emtfVmxEMnFRT0ZybENHTEpkNVlidGFZQmI5Rg==' | base64 -d | protoc --decode_raw
echo -n 'QUFFLUhqbnBRYUZGS1lCdmxzNlh1SjFJbkxCN3JxN25Zd3xBQ3Jtc0tudzZ6U0FFZWktZW1qaFg5VVZ1b0Q5ZU5pNEZaMUF3SzR3YlFwMHVhRmFhMTVuMEZXbDh1c0JsYnY1UzR1YnBMRVdvVHdrYmpXUW9CeVFNLVlkNlJUUVRxY0U3bE9SOEQtd0RGaF9EWHg3RThHOG9kWDB6Rk1XRVlOX0tLRlJOdGxubTl0QzRPLWFNNFdFVElwZmVJTGd0a1Y4WEQ0R2o2OF8tOUdiMU1ZcHp2LXZUWjE2MjAwdUtzMWlqNjkzYmtxbDUydmw=' | base64 -d
AAE-HjnpQaFFKYBvls6XuJ1InLB7rq7nYw|ACrmsKnw6zSAEei-emjhX9UVuoD9eNi4FZ1AwK4wbQp0uaFaa15n0FWl8usBlbv5S4ubpLEWoTwkbjWQoByQM-Yd6RTQTqcE7lOR8D-wDFh_DXx7E8G8odX0zFMWEYN_KKFRNtlnm9tC4O-aM4WETIpfeILgtkV8XD4Gj68_-9Gb1MYpzv-vTZ16200uKs1ij693bkql52vl
Unable to decode further with base64{,url}.
echo -n '4qmFsgJgEhhVQzJDaHhIRVpDbUs1Tmo0SkI2NDlpS0EaRDhnWXJHaW1hQVNZS0pEWTRNMk5qT1dWbUxUQXdNREF0TWpBMllpMDVORFl3TFRFMFl6RTBaV1kwTURVMll3JTNEJTNE' | base64 -d | protoc --decode_raw
echo -n '8gYrGimaASYKJDY4M2NjOWVmLTAwMDAtMjA2Yi05NDYwLTE0YzE0ZWY0MDU2Yw==' | base64 -d | protoc --decode_raw
echo -n 'EgZ0b3BiYXIg9QEoAQ==' | base64 -d | protoc --decode_raw
I checked the first returned item, contents, header (which does not seem focused on community posts), metadata and microformat.
import requests
import blackboxprotobuf
import base64

def getBase64Protobuf(message, typedef):
    # Encode the message as Protobuf, then as a base64 ASCII string.
    data = blackboxprotobuf.encode_message(message, typedef)
    return base64.b64encode(data).decode('ascii')

def getCommunity(timestamp):
    # Innermost message: field 1 holds the Unix timestamp.
    message = {
        '1': timestamp,
    }
    typedef = {
        '1': {
            'type': 'int'
        },
    }
    one = getBase64Protobuf(message, typedef)

    # Wrap the base64 string in another message, then base64-encode the result again.
    message = {
        '1': one,
    }
    typedef = {
        '1': {
            'type': 'string'
        },
    }
    one = base64.b64encode(getBase64Protobuf(message, typedef).encode('ascii'))

    message = {
        '2': 'community',
        '53': {
            '1': one,
        },
    }
    typedef = {
        '2': {
            'type': 'string'
        },
        '53': {
            'type': 'message',
            'message_typedef': {
                '1': {
                    'type': 'string'
                },
            },
        },
    }
    three = getBase64Protobuf(message, typedef)

    # Outermost message: channel ID and the encoded parameters.
    message = {
        '80226972': {
            '2': 'UCgvqvBoSHB1ctlyyhoHrGwQ',
            '3': three,
        }
    }
    typedef = {
        '80226972': {
            'type': 'message',
            'message_typedef': {
                '2': {
                    'type': 'string'
                },
                '3': {
                    'type': 'string'
                },
            },
            'field_order': [
                '2',
                '3',
            ]
        }
    }
    continuation = getBase64Protobuf(message, typedef)

    json_data = {
        'context': {
            'client': {
                'clientName': 'WEB',
                'clientVersion': '2.20240731.04.00',
            },
        },
        'continuation': continuation,
    }
    response = requests.post('https://www.youtube.com/youtubei/v1/browse', json = json_data)
    return response.json()

community = getCommunity(1722782039)
print('ce dimanche pour cause de vacances en famille' in str(community))
Since using a future timestamp such as 1732782039 still returns True, I believe that this value is an upper bound on the results returned.
We would like code that supports all community posts, not all except the most recent one, etc.
I am starting to feel that it is not deterministic.
timestamp = 1722782903
TO_REMOVE = 1
while True:
    community = getCommunity(timestamp)
    isIn = 'ce dimanche pour cause de vacances en famille' in str(community)
    print(f'{timestamp=} {isIn=}')
    if not isIn:
        break
    timestamp -= TO_REMOVE
timestamp=1722782903 isIn=True
timestamp=1722782902 isIn=True
...
timestamp=1722782896 isIn=True
timestamp=1722782895 isIn=False
but with TO_REMOVE = 10:
timestamp=1722782903 isIn=True
timestamp=1722782893 isIn=True
timestamp=1722782883 isIn=True
timestamp=1722782873 isIn=True
timestamp=1722782863 isIn=False
In fact, if we do not break, then we get isIn=True, except sometimes isIn=False.
Let us return the first community post content to maybe understand better.
communityPostMessage = community['contents']['twoColumnBrowseResultsRenderer']['tabs'][5]['tabRenderer']['content']['sectionListRenderer']['contents'][0]['itemSectionRenderer']['contents'][0]['backstagePostThreadRenderer']['post']['backstagePostRenderer']['contentText']['runs'][0]['text']
is not relevant, as it does not come from the AJAX response.
With the Python timestamp decreasing by 1: KeyError: 'backstagePostThreadRenderer'
With the Python timestamp decreasing by 100: a run of isIn=True, then a run of isIn=False
After about 10 executions I confirm that "This channel hasn't posted yet" seems deterministic.
Recently I tried to automatically create community posts, but I do not remember where I kept track of that. I remember having arrived at an algorithm that got blocked by bot verification after a few posts. I stopped investigating, mentioning that it was taking too much of my time, but I cannot quickly find it on Discord.
An approximate timestamp is the one at which the result switches from "is in" to "is no longer in", whether or not it is the most recent one.
So:
| timestamp | timestamp + 1 | possible | next timestamp |
|---|---|---|---|
| False | False | True | |
| False | True | True | |
| True | False | True | |
| True | True | True | |

As I understand it, the booleans stand for "is in".
Dichotomy between channel creation and now.
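A minimal sketch of such a dichotomy follows. It assumes a monotonic predicate (False below some timestamp, True from it onwards), which the runs above suggest does not always hold, so this is an idealised illustration only. `first_visible_timestamp` and `fake_is_in` are hypothetical names; `fake_is_in` stands in for a real check built on getCommunity.

```python
def first_visible_timestamp(low, high, is_in):
    """Return the smallest timestamp in (low, high] for which is_in is True.

    Assumes is_in(low) is False, is_in(high) is True and is_in is monotonic.
    """
    while high - low > 1:
        middle = (low + high) // 2
        if is_in(middle):
            high = middle
        else:
            low = middle
    return high

# Stand-in predicate mimicking the step-by-1 run above:
# isIn=True down to 1722782896, isIn=False at 1722782895.
def fake_is_in(timestamp):
    return timestamp >= 1722782896

boundary = first_visible_timestamp(0, 1722782903, fake_is_in)
print(boundary)
```

With a real predicate, each probe costs one request, so the number of requests grows with the logarithm of the interval between channel creation and now.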
Maybe the simplest is to guess the creation time of every community post.
Even if we start from the most recent one and decrease the time exponentially, we cannot know whether we missed one.
Let us assume multiple community posts to start.
Related to #257.
community['backstagePostThreadRenderer']['post']['backstagePostRenderer']['postId']
does not seem useful.
import json
import math
import time
from datetime import datetime

import requests

YOUTUBE_OPERATIONAL_API_INSTANCE_URL = 'http://localhost/YouTube-operational-API'

# Collect all community post ids by following the pagination.
communityPostIds = []
params = {
    'part': 'community',
    'handle': '@Amixem',
}
while True:
    data = requests.get(YOUTUBE_OPERATIONAL_API_INSTANCE_URL + '/channels', params).json()
    #print(json.dumps(data, indent = 4))
    item = data['items'][0]
    for communityPost in item['community']:
        communityPostIds += [communityPost['id']]
    if 'nextPageToken' not in item:
        break
    params['pageToken'] = item['nextPageToken']

# Go back in time with a doubling step until the first returned post changes.
currentTimestamp = time.time()
timestamp = math.ceil(currentTimestamp)
toRemove = 1
while True:
    community = getCommunity(timestamp)
    communityDatetime = datetime.fromtimestamp(timestamp)
    print(communityDatetime)
    content = community['continuationContents']['itemSectionContinuation']['contents'][0]
    if 'messageRenderer' in content and content['messageRenderer']['text']['runs'][0]['text'] == "This channel hasn't posted yet":
        timestamp -= 1
        continue
    if content['backstagePostThreadRenderer']['post']['backstagePostRenderer']['postId'] != 'UgkxOpujSABK9-1yzNZHml1PkEmExobp1s8Z':
        print(datetime.fromtimestamp(currentTimestamp) - communityDatetime)
        break
    timestamp -= toRemove
    toRemove *= 2

print(json.dumps(getCommunity(1000)['continuationContents']['itemSectionContinuation']['contents'][0], indent = 4))
shift
:https://www.youtube.com/post/UgkxyQX6cJcKVEPItVA3rjwpwMUK6YejXTsI
date -d @1722783810
Sun Aug 4 05:03:30 PM CEST 2024
shift
So the precision is 3 seconds. Given that I clicked POST manually and did not account for the time needed to send the request and for YouTube to process the community post creation, this seems nearly perfect.
True boss
Following my Discord message about me testing on macOS.
Related to Benjamin_Loison/MacOS/issues/1.
See Benjamin_Loison/blackboxprotobuf/issues/1.
TL;DR: use the PyPI package bbpb.
Note that the #257 pagination seems to provide the precise date of the first or last community post of each page. Maybe, by finding the precise date of the first community post of each page, we can get the dates of all of them thanks to pagination.
As requested on Discord.
https://www.youtube.com/post/Ugkx4zW_Z6QeKVSnRzPmnF5pAdDIJ4tBikSo:
Ugkx4zW_Z6QeKVSnRzPmnF5pAdDIJ4tBikSo
does not seem base64 encoded. There is no tooltip and the source code of the shown date does not seem interesting:
I have an idea that I have not investigated yet: investigate everything that is encoded, notably in Protobuf, and notably the continuation tokens.
The whole community post code:
AIdro_k6E7UTsF655DeERRis1BYvl8uSsF973x_Fdovm3dYObcg6BnyGOi5TiYl_QS1VWQCHTQ and AIdro_k6E7UTsF655DeERRis1BYvl8uSsF973x are not base64 encoded; otherwise, I manually checked all this code.
I have to investigate both the precise community post webpage and the community tab webpage.
What is the difference between the above URL and the https://www.youtube.com/channel/UC2ChxHEZCmK5Nj4JB649iKA/community?lb=Ugkx4zW_Z6QeKVSnRzPmnF5pAdDIJ4tBikSo one?
Output:
```
HTTP/2 200
content-type: text/html; charset=utf-8
x-content-type-options: nosniff
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Sun, 04 Aug 2024 11:47:19 GMT
content-length: 507365
strict-transport-security: max-age=31536000
x-frame-options: SAMEORIGIN
permissions-policy: ch-ua-arch=*, ch-ua-bitness=*, ch-ua-full-version=*, ch-ua-full-version-list=*, ch-ua-model=*, ch-ua-wow64=*, ch-ua-form-factors=*, ch-ua-platform=*, ch-ua-platform-version=*
cross-origin-opener-policy: same-origin-allow-popups; report-to="youtube_main"
origin-trial: AmhMBR6zCLzDDxpW+HfpP67BqwIknWnyMOXOQGfzYswFmJe+fgaI6XZgAzcxOrzNtP7hEDsOo1jdjFnVr2IdxQ4AAAB4eyJvcmlnaW4iOiJodHRwczovL3lvdXR1YmUuY29tOjQ0MyIsImZlYXR1cmUiOiJXZWJWaWV3WFJlcXVlc3RlZFdpdGhEZXByZWNhdGlvbiIsImV4cGlyeSI6MTc1ODA2NzE5OSwiaXNTdWJkb21haW4iOnRydWV9
report-to: {"group":"youtube_main","max_age":2592000,"endpoints":[{"url":"https://csp.withgoogle.com/csp/report-to/youtube_main"}]}
p3p: CP="This is not a P3P policy! See http://support.google.com/accounts/answer/151657?hl=fr for more info."
server: ESF
x-xss-protection: 0
set-cookie: YSC=86fipQsYId0; Domain=.youtube.com; Path=/; Secure; HttpOnly; SameSite=none
set-cookie: __Secure-YEC=CgtaX1lfQk5mSUlTSSjH1L21BjIiCgJGUhIcEhgSFhMLFBUWFwwYGRobHB0eHw4PIBAREiEgQQ%3D%3D; Domain=.youtube.com; Expires=Wed, 03-Sep-2025 11:47:18 GMT; Path=/; Secure; HttpOnly; SameSite=lax
set-cookie: VISITOR_PRIVACY_METADATA=CgJGUhIcEhgSFhMLFBUWFwwYGRobHB0eHw4PIBAREiEgQQ%3D%3D; Domain=.youtube.com; Expires=Wed, 03-Sep-2025 11:47:19 GMT; Path=/; Secure; HttpOnly; SameSite=none
set-cookie: VISITOR_INFO1_LIVE=; Domain=.youtube.com; Expires=Mon, 08-Nov-2021 11:47:19 GMT; Path=/; Secure; HttpOnly; SameSite=none
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
```
Output:
```
HTTP/2 200
content-type: text/html; charset=utf-8
x-content-type-options: nosniff
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Sun, 04 Aug 2024 11:47:50 GMT
content-length: 519755
strict-transport-security: max-age=31536000
x-frame-options: SAMEORIGIN
permissions-policy: ch-ua-arch=*, ch-ua-bitness=*, ch-ua-full-version=*, ch-ua-full-version-list=*, ch-ua-model=*, ch-ua-wow64=*, ch-ua-form-factors=*, ch-ua-platform=*, ch-ua-platform-version=*
origin-trial: AmhMBR6zCLzDDxpW+HfpP67BqwIknWnyMOXOQGfzYswFmJe+fgaI6XZgAzcxOrzNtP7hEDsOo1jdjFnVr2IdxQ4AAAB4eyJvcmlnaW4iOiJodHRwczovL3lvdXR1YmUuY29tOjQ0MyIsImZlYXR1cmUiOiJXZWJWaWV3WFJlcXVlc3RlZFdpdGhEZXByZWNhdGlvbiIsImV4cGlyeSI6MTc1ODA2NzE5OSwiaXNTdWJkb21haW4iOnRydWV9
cross-origin-opener-policy: same-origin-allow-popups; report-to="youtube_main"
report-to: {"group":"youtube_main","max_age":2592000,"endpoints":[{"url":"https://csp.withgoogle.com/csp/report-to/youtube_main"}]}
p3p: CP="This is not a P3P policy! See http://support.google.com/accounts/answer/151657?hl=fr for more info."
server: ESF
x-xss-protection: 0
set-cookie: YSC=FenNnZMWRv4; Domain=.youtube.com; Path=/; Secure; HttpOnly; SameSite=none
set-cookie: __Secure-YEC=Cgs4cHRrREthTVQybyjm1L21BjIiCgJGUhIcEhgSFhMLFBUWFwwYGRobHB0eHw4PIBAREiEgZQ%3D%3D; Domain=.youtube.com; Expires=Wed, 03-Sep-2025 11:47:49 GMT; Path=/; Secure; HttpOnly; SameSite=lax
set-cookie: VISITOR_PRIVACY_METADATA=CgJGUhIcEhgSFhMLFBUWFwwYGRobHB0eHw4PIBAREiEgZQ%3D%3D; Domain=.youtube.com; Expires=Wed, 03-Sep-2025 11:47:50 GMT; Path=/; Secure; HttpOnly; SameSite=none
set-cookie: VISITOR_INFO1_LIVE=; Domain=.youtube.com; Expires=Mon, 08-Nov-2021 11:47:50 GMT; Path=/; Secure; HttpOnly; SameSite=none
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
```
As there is the same UI, I guess that it is done the same way.
Maybe we could also leverage comments, but as there is no guarantee that there are any, this may not be usable.
Maybe there is some webpage metadata letting us know that the webpage has not changed since a given date, but I doubt it due to UI changes.