Closed ReenigneArcher closed 9 months ago
urllib2.Request
doesn't have an interface for setting the request method. For compat purposes it needs one that matches Python 3 urllib.request.Request
. But the simple-minded code that does this might be falling foul of this situation as described at https://github.com/python/cpython/blob/2.7/Lib/urllib2.py#L247:
# XXX this is a fallback mechanism to guard against these # methods getting called in a non-standard order. this may be # too complicated and/or unnecessary.
(apparently it is necessary in the case above).
The quoted compat code wraps the Request
old-style class with a new-style class whose instances are instances of the replaced class, with a specialised get_method()
if the method
keyword was set in the constructor call (overriding the one at l.256 of urllib2.py
).
This is the context in urllib2.py:
def open(self, fullurl, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
# accept a URL or a Request object
if isinstance(fullurl, basestring):
req = Request(fullurl, data)
else:
req = fullurl
if data is not None:
req.add_data(data)
req.timeout = timeout
protocol = req.get_type() # failing line
...
As expected, it calls the get_type()
of the original class:
def get_type(self):
if self.type is None:
self.type, self.__r_type = splittype(self.__original)
if self.type is None:
raise ValueError, "unknown url type: %s" % self.__original
return self.type
Presumably the first line of the method is being resolved wrongly. The results of the commands p self, type(self)
and dir(self)
at a debugger break on the marked failing line would be useful. If self
was constructed with Request(url, ...)
it should have a type
of None
either from the __init__()
of the original Request
class being called directly, or from its invocation in the compat line r = _req(url, ...)
when constructing an instance of the the wrapper class. Similarly, the __original
attribute should have been set, but the line that uses it is never reached.
A similar hack is already done at https://github.com/squaresmile/Plex-Plug-Ins/blob/fc4ab34d4cb995668abd84b304b57c5bf13cb69d/Framework.bundle/Contents/Resources/Versions/2/Python/Framework/components/networking.py#L157, but this shouldn't be a problem. The Plex code appears to go straight into the original urllib2.Request()
code without touching the yt-dl compat.
The results of the commands p self, type(self) and dir(self) at a debugger break on the marked failing line would be useful
I'm not sure if this is possible as it's inside the Plex framework. As far as I know it's not really possible to run this on it's own outside of a Plex environment.
The Plex code appears to go straight into the original urllib2.Request() code without touching the yt-dl compat.
I'd guess that changes when I import youtube_dl?
I'm not really that familiar with the source of youtube-dl or the compat layer... but I would be happy to test any patch/branch that may address this.
Neither the code nor the backtrace from Plex shows any sign of calling yt-dl compat. As in another case recently the order of importing may be relevant.
Try adding this missing attribute hook to the _request
class in compat.py
:
def __getattr__(self, name):
if hasattr(_req, name):
return _req.name
return super(_request, self).__getattr__(name)
In case the lookup of attributes is not working, this should make the instance value default to the class value in the original urllib2.Request
.
As in another case recently the order of importing may be relevant.
I did remove the install_aliases()
method from my backported library so those conflicts should be resolved from #32544
Adding this __getattr__
method to the class produces the same traceback. It seems like it should work to me... I did confirm that commenting out the try/except block (even on the latest commit) fixes the issue though.
So there's something weirder going on. It looked familiar, and then I remembered utils.py l.2729
if sys.version_info < (2, 7): # avoid possible race where __r_type may be unset req.get_type()
(I never found out why that worked, not having a Py2.6 at hand)
So try this:
def __new__(cls, url, *args, **kwargs):
method = kwargs.pop('method', None)
r = _req(url, *args, **kwargs)
+ r.get_type()
if method:
r.get_method = types.MethodType(lambda _: method, r)
return r
Unfortunately, I get the same result.
If you can't connect to the server where the Plex framework is running to use pdb, maybe you can hack some print()
statements into the code and upload that in place of the production code?
You could also try patching the failing line in urllib2:
def get_type(self):
- if self.type is None:
+ if getattr(self, 'type', None) is None
self.type, self.__r_type = splittype(self.__original)
if self.type is None:
raise ValueError, "unknown url type: %s" % self.__original
return self.type
This would have been safer in any case, even if it's difficult to see how get_type()
could possibly be called before __init__()
has completed.
maybe you can hack some print() statements into the code and upload that in place of the production code?
I guess I could, but I have no idea where it would actually print to. There's no console or anything like that... I guess I could make it write some output to a specific file instead.
I will try to get some additional output from the framework. I'm a bit confused as to what is wanted and where though.
print(type(request))
print(dir(request))
I was thinking, just before https://github.com/python/cpython/blob/8d21aa21f2cbc6d50aab3f420bb23be1d081dac4/Lib/urllib2.py#L421. At that point the variable is req
rather than request
; also print(req)
. The backtrace doesn't show HTTP.py. So actually it's the Py2.7 urllib2 that needs to be hacked: I suppose you have authority to do that. Maybe save a copy of the urllib2.py on the server and edit the original file with either the patch above or the diagnostics?
What the BT does show is:
execute(self.init_code)
in sandbox.py:256 launches the plugincls._shared_instance._push_agent_info()
in agentkit.py seems to be getting some information by HTTP packed_result = self._core.networking.http_request(url, cacheTime=0, timeout=None, immediate=True).content
; no explicit opener
is specified, so a default created by urllib2.build_opener()` is usedimmediate
was set, so self.load()
is called and this calls the open()
of the openerOpenerDirector.open()
from urllib2, where the error occurs on calling get_type()
on its req
.As some background, urllib2 "opens" a URL by calling a series of functions within the "opener", each dealing with a specific aspect of the opening task (proxies, cookies, ...) including one (that fails here) to open the network connection itself.
Okay, I haven't got to this yet as it's not so easy to implement in my CI system.
I looked into how plugins are loaded a little bit more, and it's quite interesting. They are loaded using RestrictedPython
and xpython
. I can't find much info about xptyhon
but I think it's this https://pypi.org/project/x-python/
I wonder if this has something to do with the errors. https://github.com/squaresmile/Plex-Plug-Ins/blob/master/Framework.bundle/Contents/Resources/Versions/2/Python/Framework/code/loader.py
xpython seems irrelevant: https://github.com/squaresmile/Plex-Plug-Ins/blob/master/Framework.bundle/Contents/Resources/Platforms/Shared/Libraries/xpython.py
If it's using RestrictedPython, the version must be out of date, since v6 (2022) unsupported Python 2.7. But this is hardly the place to worry about that.
If the sandbox policy messes with getattr
that may be relevant.
exec(code) in self.__dict__
@ReenigneArcher, if you could test this replacement code in compat.py
and report back, I'll try to work a similar fix into the system:
--- old/youtube_dl/compat.py
+++ new/youtube_dl/compat.py
# Also fix up lack of method arg in old Pythons
try:
- _req = compat_urllib_request.Request
- _req('http://127.0.0.1', method='GET')
+ type(compat_urllib_request.Request('http://127.0.0.1', method='GET'))
except TypeError:
- class _request(object):
- def __new__(cls, url, *args, **kwargs):
- method = kwargs.pop('method', None)
- r = _req(url, *args, **kwargs)
- if method:
- r.get_method = types.MethodType(lambda _: method, r)
- return r
-
- compat_urllib_request.Request = _request
-
+ def _add_init_method_arg(cls):
+
+ init = cls.__init__
+
+ def wrapped_init(self, *args, **kwargs):
+ method = kwargs.pop('method', 'GET')
+ init(self, *args, **kwargs)
+ if self.has_data() and method == 'GET':
+ method = 'POST'
+ self.get_method = types.MethodType(lambda _: method, self)
+
+ cls.__init__ = wrapped_init
+
+ _add_init_method_arg(compat_urllib_request.Request)
+ del _add_init_method_arg
@dirkf this change works!
Great, I'll put it through full testing.
PR #32695 contains a similar fix, as in the above commit, and either could be tested.
This is not easy for me to test since it exists in a different repo and I am now using this as a submodule (due to no PyPi package) in my project... if you could make a branch in this repo, I could test it fairly easily.
Maybe just apply commit 0c75367?
@dirkf it looks like applying this as a patch works fine. I didn't change the commit of the submodule, but I can if needed. Currently it's on be008e657d79832642e2158557c899249c9e31cd
Fixed in #32695: thanks for the test.
Checklist
Question
WRITE QUESTION HERE
Starting with commit https://github.com/ytdl-org/youtube-dl/commit/648dc5304cb2476592ff142988b8c62675011fcc I am unable to use youtube-dl in my Plex plugin.
The traceback I get in the Plex plugin logs is not all that useful.
Given the code from the commit:
I'm assuming that the
TypeError
path will be taken in Python 2.7? And given that the overwritten_request
object has no Attributetype
which urllib2 seems to be expecting.This is the last piece of code from the Plex framework before the builtins are called: https://github.com/squaresmile/Plex-Plug-Ins/blob/fc4ab34d4cb995668abd84b304b57c5bf13cb69d/Framework.bundle/Contents/Resources/Versions/2/Python/Framework/components/networking.py#L159
It's difficult to follow the framework code, but I believe this is function from the Plex framework being called by the line in the traceback. https://github.com/squaresmile/Plex-Plug-Ins/blob/fc4ab34d4cb995668abd84b304b57c5bf13cb69d/Framework.bundle/Contents/Resources/Versions/0/Python/PMS/HTTP.py#L119-L148
I'm not 100% sure, but I'm thinking this could be considered a bug in youtube-dl IF it's overwriting built-in types.