user234683 / youtube-local

browser-based client for watching Youtube anonymously and with greater page performance
GNU Affero General Public License v3.0

Error: Could not find player #22

Closed: Schabolon closed 3 years ago

Schabolon commented 4 years ago

Hi, I have recently been getting the following error: "Error: Could not find player" (see screenshot: youtube-local-error). There are no error messages in the console. The error is not persistent: after a couple of Tor restarts (I am routing through Tor) everything works fine again, until it doesn't. If you need any more information (like the youtube-local configuration), I can of course provide it. If you know how to fix this, that would be awesome. I really enjoy using this project :)

zrose584 commented 4 years ago

logs will be useful ..

Schabolon commented 4 years ago

Here is the log (started the server and got the error when I load the page):

$ python3 server.py 
Running in portable mode
Tor routing is ON
Started httpserver on port 8080
Retrieved comments     Latency: 0.893     Read time: 0.06
127.0.0.1 - - [2020-10-07 19:02:23] "GET /https://www.youtube.com/watch?v=l8WMGBuNaus&list=PLmo4pBukfRoN8SB5RKvfiY9CTl9pI_IFc&index=52 HTTP/1.1" 200 47251 1.052683
127.0.0.1 - - [2020-10-07 19:02:23] "GET /youtube.com/static/js/common.js HTTP/1.1" 304 187 0.000815
127.0.0.1 - - [2020-10-07 19:02:23] "GET /youtube.com/static/js/hotkeys.js HTTP/1.1" 304 187 0.000643
127.0.0.1 - - [2020-10-07 19:02:23] "GET /youtube.com/static/js/transcript-table.js HTTP/1.1" 304 186 0.000567

After that I clicked the "new identity" button in Tor, and then the log looked like the following:

$ python3 server.py 
Running in portable mode
Tor routing is ON
Started httpserver on port 8080
Retrieved comments     Latency: 0.893     Read time: 0.06
127.0.0.1 - - [2020-10-07 19:02:23] "GET /https://www.youtube.com/watch?v=l8WMGBuNaus&list=PLmo4pBukfRoN8SB5RKvfiY9CTl9pI_IFc&index=52 HTTP/1.1" 200 47251 1.052683
127.0.0.1 - - [2020-10-07 19:02:23] "GET /youtube.com/static/js/common.js HTTP/1.1" 304 187 0.000815
127.0.0.1 - - [2020-10-07 19:02:23] "GET /youtube.com/static/js/hotkeys.js HTTP/1.1" 304 187 0.000643
127.0.0.1 - - [2020-10-07 19:02:23] "GET /youtube.com/static/js/transcript-table.js HTTP/1.1" 304 186 0.000567
Retrieved comments     Latency: 0.461     Read time: 0.029
Using cached decryption function for: /s/player/1a1b48e5/player-plasma-ias-phone-en_US.vflset/base.js
127.0.0.1 - - [2020-10-07 19:02:51] "GET /https://www.youtube.com/watch?v=l8WMGBuNaus&list=PLmo4pBukfRoN8SB5RKvfiY9CTl9pI_IFc&index=52 HTTP/1.1" 200 314171 2.302552
127.0.0.1 - - [2020-10-07 19:02:52] "GET /youtube.com/static/js/common.js HTTP/1.1" 304 187 0.002464
127.0.0.1 - - [2020-10-07 19:02:52] "GET /youtube.com/static/js/hotkeys.js HTTP/1.1" 304 187 0.002053
127.0.0.1 - - [2020-10-07 19:02:52] "GET /youtube.com/static/js/transcript-table.js HTTP/1.1" 304 186 0.002081
127.0.0.1 - - [2020-10-07 19:02:53] "GET /youtube.com/static/favicon.ico HTTP/1.1" 200 285 0.001032
127.0.0.1 - - [2020-10-07 19:03:00] "GET /https://r3---sn-ntnxax8xo-cxge.googlevideo.com/videoplayback?expire=1602111770&ei=ufR9X_PsPLTR8gPYr7qYDQ&ip=185.220.101.195&id=o-ALsv8PwjF-0qtyZKABSZ02tayOEZkwTEpwj11uFpwLBy&itag=18&source=youtube&requiressl=yes&mh=Pu&mm=31%2C29&mn=sn-ntnxax8xo-cxge%2Csn-4g5ednld&ms=au%2Crdu&mv=m&mvi=3&pl=26&gcr=ru&initcwndbps=1350000&vprv=1&mime=video%2Fmp4&gir=yes&clen=8062580&ratebypass=yes&dur=170.155&lmt=1575227817137508&mt=1602090040&fvip=4&fexp=23915654&c=MWEB&txp=5431432&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cgcr%2Cvprv%2Cmime%2Cgir%2Cclen%2Cratebypass%2Cdur%2Clmt&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRAIgPlDnsjXftl_lg7tpDJLk2Y_4MsSHy_n-KCQI_jh0n1gCIF61mQ13ZuRBiTNZy_bO_qW24nWgk9i_Mfbo3mga2OMQ&sig=AOq0QJ8wRgIhAJS778bZF-U5_42GSm4wdP-79YL2wHwyYxy7gS-5oKrsAiEApn8kHqxMwqj4GxrKcs9CA-I3MEQWZv0BawngDIZJg_8= HTTP/1.1" 206 8063243 8.091050
Schabolon commented 4 years ago

And here are my settings: youtube-local-settings

zrose584 commented 4 years ago

try https://github.com/zrose584/youtube-local/tree/debug1 ?

user234683 commented 4 years ago

In settings.txt, change debugging_save_responses to True (this is a hidden setting). This will save all responses from Youtube to the disk in the data/debug directory (new ones will replace old ones though, so they won't accumulate). When the error happens, copy the file watch. Make sure the video in question is the last one you opened. The file has the raw json data returned from youtube for that video.

If this error is a recent phenomenon, I'll probably encounter it too within the next few days. I've seen occasional cases where the video urls haven't been included, which give a similar error (I haven't been able to figure out why this happens and whether it's just a bug on Youtube's end), but this looks like none of the info is present for some reason.

Schabolon commented 4 years ago

Ok, so here is the watch file content: {"redirect":"https:\/\/m.youtube.com\/watch?v=bt9585tFr0Q\u0026bpctr=9999999999"}

user234683 commented 4 years ago

Interesting. Some questions. Assuming you don't use new identity to get a new exit node, and the exit node doesn't change on its own:

If it's specific to exit nodes in general, I want to attempt to make the error reoccur within the tor browser (by forcing it to use that exit node by adding it to torrc) so I can examine what happens on the page in these instances, so I'll need to know the exit node.

To find the exit node, first run pip3 install stem then add this code to server.py near the top but after the import statements:

import stem
import stem.control

def stream_event(event):
    # Only report streams that were successfully attached to a circuit.
    if event.status == stem.StreamStatus.SUCCEEDED and event.circ_id:
        circ = controller.get_circuit(event.circ_id)

        # The last hop in the circuit path is the exit relay.
        exit_fingerprint = circ.path[-1][0]
        exit_relay = controller.get_network_status(exit_fingerprint)

        # Skip streams whose target is the exit relay's own address.
        if exit_relay.address not in event.target:
            print("Exit relay for our connection to %s" % (event.target))
            print("  address: %s:%i" % (exit_relay.address, exit_relay.or_port))
            print("  fingerprint: %s" % exit_relay.fingerprint)
            print("  nickname: %s" % exit_relay.nickname)
            print("  locale: %s" % controller.get_info("ip-to-country/%s" % exit_relay.address, 'unknown'))
            print("")

controller = stem.control.Controller.from_port()
controller.authenticate()
controller.add_event_listener(stream_event, stem.control.EventType.STREAM)

Then in ./youtube/watch.py, at line 219, replace

    polymer_json = util.fetch_url(url, headers=headers, debug_name='watch')

with

    print('-------- extract_info ----------')
    print(url)
    polymer_json = util.fetch_url(url, headers=headers, debug_name='watch')
    print('-------- finished ----------')

When it happens, copy the stuff in the log between the -------- extract_info ---------- and -------- finished ---------- lines. This will give the video url and the exit node used to request the watch page info. Also post the watch debug file for good measure, to make sure it's the same error.

Schabolon commented 4 years ago

The error reoccurs if I refresh the page, and it occurs on other videos as well. At first everything works fine, but at some point I get the error. (I don't change anything.) Can the exit node change on its own?

Here is the log:

-------- extract_info ----------
https://m.youtube.com/watch?v=f7Cpkup4g3I&pbj=1&bpctr=9999999999
Exit relay for our connection to 216.58.207.174:443
  address: 185.220.100.249:9100
  fingerprint: 887CAB501A9DB68A2C44EDF98BF50B0304EED8B6
  nickname: niftykostchtchie
  locale: de

Exit relay for our connection to 216.58.207.174:443
  address: 185.220.100.249:9100
  fingerprint: 887CAB501A9DB68A2C44EDF98BF50B0304EED8B6
  nickname: niftykostchtchie
  locale: de

-------- finished ----------

And here the watch-file content: {"redirect":"https:\/\/m.youtube.com\/watch?v=f7Cpkup4g3I\u0026bpctr=9999999999"}

zrose584 commented 4 years ago

Just noticed I get this error too, without tor.. Maybe I will debug it later.

zrose584 commented 4 years ago

This seems to be a weird "cooldown" feature from yt. When it fails, the "watch" response is exactly the same but the {"page":"watch","player":.. line is missing. There seems to be no error code/message. I suggest just retrying up to 3 times (on the server side) without delay, though a warning should be logged..
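
A minimal sketch of the retry suggested above, assuming the watch response has already been parsed into a dict that contains a "player" key when it is complete. `fetch_with_retry` and the `fetch` callable are hypothetical names used for illustration, not youtube-local's actual API.

```python
import logging

logger = logging.getLogger(__name__)


def fetch_with_retry(fetch, max_attempts=3):
    """Call fetch() until the result contains "player", at most max_attempts times.

    fetch is a hypothetical zero-argument callable that returns the parsed
    watch response as a dict. The last response is returned even if it is
    still incomplete, so the caller can report the error as before.
    """
    info = {}
    for attempt in range(1, max_attempts + 1):
        info = fetch()
        if "player" in info:
            return info
        # No delay between attempts, but leave a trace in the log.
        logger.warning(
            "watch response missing 'player' (attempt %d of %d)",
            attempt,
            max_attempts,
        )
    return info
```

Returning the last incomplete response instead of raising keeps the existing error path intact; only the number of requests changes.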

zrose584 commented 4 years ago

Sometimes I also get Error: Error decrypting url signatures: Could not find player name. The problem is the same, the response doesn't contain the "player" object. Should I open another issue?

Btw, it would be (very) helpful if the playability_error contains the full trace back..

zrose584 commented 4 years ago

Why is extract_info not cached?

user234683 commented 4 years ago

> Sometimes I also get Error: Error decrypting url signatures: Could not find player name

This likely means the base.js url is missing (the url has/is the player name). The player object itself, including the urls, was present, but there was no base.js anywhere. We can keep the issues here. In general, there are 3 failure modes I know of:

  1. No player (which has its own embedded player_response containing the urls) AND also no urls in toplevel playerResponse field (sometimes the urls are put there instead). Most common, and I see this error occasionally.
  2. No info at all, json object is just a redirect field (this issue, which is new). Title, description, everything missing.
  3. No player, urls are in playerResponse instead, but they require decryption and base.js is not given (decryption functions are extracted from base.js). Rarer
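
As a rough illustration, the three modes could be told apart like this. The layout assumed here is a simplified stand-in: the `redirect`, `player` and `playerResponse` keys come from the discussion, while `formats`, `url` and `base_js` are hypothetical names, not the real field names in Youtube's json.

```python
import json


def classify_failure(data):
    """Return 1, 2 or 3 for the failure modes listed above, or None if none apply.

    data is the parsed watch response as a dict, in a simplified layout
    (the "formats", "url" and "base_js" keys are assumptions for this sketch).
    """
    has_player = "player" in data
    has_player_response = "playerResponse" in data

    # Mode 2: nothing but a redirect field.
    if "redirect" in data and not has_player and not has_player_response:
        return 2
    if has_player:
        return None

    formats = data.get("playerResponse", {}).get("formats", [])
    # Mode 1: no player and no urls in the toplevel playerResponse either.
    if not formats:
        return 1

    # Mode 3: urls are there but need decryption, and base.js is not given.
    needs_decryption = any("url" not in fmt for fmt in formats)
    if needs_decryption and not data.get("base_js"):
        return 3
    return None


# The watch file content posted earlier in this thread:
watch_file = r'{"redirect":"https:\/\/m.youtube.com\/watch?v=bt9585tFr0Q\u0026bpctr=9999999999"}'
print(classify_failure(json.loads(watch_file)))
```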

> Just noticed I get this error too, without tor.. This seems to be a weird "cooldown" feature from yt.

By this you mean it only happens if you make lots of requests very fast? And it's the version of the error where everything is missing, not just the player (issue 2)?

> Btw, it would be (very) helpful if the playability_error contains the full trace back..

The yt_data_extract module doesn't use exceptions, it just sets an error field if a key is missing. The particular error messages it spits out could probably be more descriptive/consistent, but in general, the full json object is what you want when debugging problems (btw you'll definitely want some kind of software to get a json tree view when debugging). I'll see if I can include a dump of the json object somewhere by default in the next release during errors

> Why is extract_info not cached?

Caching is done when something is expected to be reused many times. So the channel id and number of videos are cached since those are needed on each page of the channel. Don't want to be rerequesting it every time the user goes to the next page.

Issues 1 and 3 could be worked around by using get_video_info to retrieve the player directly (this is how age restriction bypass works), though I would like to know why the issue happens here but not on the youtube webpage. It would be nice to figure out what secret sauce the webpage is using for its requests so that it doesn't receive these errors, which is why I want to figure out how to reproduce it reliably so I can try to reproduce it on the mobile webpage. But I can push a temporary fix for 1 and 3. Retrying probably won't help with issue 2 if this is a cooldown problem similar to exit nodes being blocked by 429 codes.

zrose584 commented 4 years ago

I was experiencing issues 1+3.

I tested again, with ~1s interval:

required reloads | result
3                | issue 3, wait ~2min
0                | issue 1
3                | issue 3, wait 20s
13               | issue 3, wait 20s
12               | issue 3, wait 20s
40+              | ok

It feels rather random, maybe it depends on their server usage. An idea: If yt-backend detects that loading 'player' takes too long, it sends partial data (so the app can load at least something).. The app gets a token, with which it can request the missing data from the backend.

> The yt_data_extract module doesn't use exceptions, it just sets an error field if a key is missing. The particular error messages it spits out could probably be more descriptive/consistent, but in general, the full json object is what you want when debugging problems

A traceback is useful for people new to the codebase. If you just get an error "XY not found", you have no idea who generated it, and in what context. Even pdb wouldn't help, since you don't know where to set the breakpoint. You don't have to use exceptions, just use (e.g.) ''.join(traceback.format_stack()), and put it in an extra field.
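
A small sketch of that idea, assuming errors are plain dict fields. `make_error`, `extract_player` and the `error_traceback` field are hypothetical names for illustration, not the names used in yt_data_extract.

```python
import traceback


def make_error(message):
    """Build an error record that also remembers where it was created."""
    return {
        "error": message,
        # format_stack() returns a list of strings, one per stack frame,
        # ending with the frame that called it.
        "error_traceback": "".join(traceback.format_stack()),
    }


def extract_player(info):
    # Hypothetical extractor: sets an error field instead of raising.
    if "player" not in info:
        return make_error("Could not find player")
    return {"player": info["player"]}


result = extract_player({})
print(result["error"])
print(result["error_traceback"])
```

The traceback names the function that produced the error, which is enough to know where to set a breakpoint.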

> But I can push a temporary fix for 1 and 3.

Sounds good. Maybe issue 2 can be solved together with #20.

zrose584 commented 4 years ago

> Why is extract_info not cached?

> Caching is done when something is expected to be reused many times. So the channel id and number of videos are cached since those are needed on each page of the channel. Don't want to be rerequesting it every time the user goes to the next page.

I regularly open the same video multiple times in order to compare different sections with each other. Is there any disadvantage in caching extract_info?

user234683 commented 4 years ago

@Schabolon I'm (slowly) working on fixes for variants 1 and 3 of this error, but I still haven't seen variant 2 of this error (no title, description, anything, with the redirect watch file content). Does that error reoccur often for you? And, normally if you click the "More info" dropdown, do you see an ipv4 address for the tor exit node, or an ipv6 address?

@zrose584 >Is there any disadvantage in caching extract_info?

No, but we would want to cache the parallel comments request as well. And you wouldn't want it to cache the info for failed responses, such as in errors like these where the info is missing. And you would want a hidden setting to disable it for debugging purposes.
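
A minimal sketch of a cache with those constraints, assuming failed responses carry an "error" field. `InfoCache`, `fetch` and `enabled` are hypothetical names, not part of youtube-local, and the comments request is left out.

```python
class InfoCache:
    """Cache video info per video id, skipping failed responses."""

    def __init__(self, fetch, enabled=True):
        # fetch: hypothetical callable taking a video id and returning a dict.
        # enabled: stands in for the hidden setting that disables caching.
        self.fetch = fetch
        self.enabled = enabled
        self._store = {}

    def get(self, video_id):
        if self.enabled and video_id in self._store:
            return self._store[video_id]
        info = self.fetch(video_id)
        # Never keep responses where the info is missing.
        if self.enabled and not info.get("error"):
            self._store[video_id] = info
        return info
```

The video urls in the log above carry an expire parameter, so a real cache would presumably also need a time limit on entries.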

Schabolon commented 4 years ago

@user234683 I get the error with variant 2 occasionally.

> And, normally if you click the "More info" dropdown, do you see an ipv4 address for the tor exit node, or an ipv6 address?

I get the following: youtube-local-error-more-info

Even though there is no information about the video whatsoever, the comments have been loaded successfully.

user234683 commented 4 years ago

I meant normally, without the error, is the Tor exit node an ipv6 address?

Schabolon commented 4 years ago

Without the error, I have ipv4 addresses (like 162.247.73.192)