Open mousemin opened 5 years ago
It causes the website's JS to load incompletely.
This issue actually caused Splash to get stuck and stop responding. It even ignored the restart command.
The same issue still happens.
It would be great if one of you managed to provide a set of steps to reliably reproduce this issue. If you do, it would be much easier to fix the issue.
Splash version: 3.2 Qt 5.9.1, PyQt 5.9, WebKit 602.1, sip 4.19.3, Twisted 16.1.1, Lua 5.2 Python 3.5.2 (default, Nov 23 2017, 16:37:01) [GCC 5.4.0 20160609]
I run Splash with the command docker run -p 8050:8050 scrapinghub/splash
After a number of requests have been executed, I get a lot of errors in the Splash log like:
loadFinished: RenderErrorInfo(type='Network', code=99, text='Proxy connection refused', url='https://...')
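To see how often these proxy errors occur compared with other failures, the log lines can be tallied. The following is only a rough sketch: the regular expression assumes the line format shown above, and `count_render_errors` is a hypothetical helper, not part of Splash.

```python
import re
from collections import Counter

# Matches the RenderErrorInfo(...) part of a Splash "loadFinished" log line.
# The pattern is an assumption based on the single log line quoted above.
RENDER_ERROR_RE = re.compile(
    r"RenderErrorInfo\(type='(?P<type>[^']*)', code=(?P<code>\d+), "
    r"text='(?P<text>[^']*)'"
)


def count_render_errors(lines):
    """Count log lines per (type, code, text) triple."""
    counts = Counter()
    for line in lines:
        match = RENDER_ERROR_RE.search(line)
        if match:
            key = (match.group("type"), int(match.group("code")), match.group("text"))
            counts[key] += 1
    return counts


sample = [
    "loadFinished: RenderErrorInfo(type='Network', code=99, "
    "text='Proxy connection refused', url='https://...')",
    "loadFinished: RenderErrorInfo(type='Network', code=99, "
    "text='Proxy connection refused', url='https://...')",
    "some unrelated log line",
]

for key, count in count_render_errors(sample).items():
    print(key, count)
```

Running it over the container log captured just before the 504 responses begin might show whether the proxy errors spike at that point.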
I rotate proxy servers on each request using the following Lua script:
function set_proxy(splash)
  splash:on_request(function(request)
    request:set_proxy{
      host=splash.args.proxy_host,
      port=splash.args.proxy_port,
      username=splash.args.proxy_user,
      password=splash.args.proxy_pass,
    }
    request:set_header('Proxy-Authorization', splash.args.proxy_auth)
  end)
  return 1
end
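For anyone trying to reproduce this, here is a minimal sketch of how the client side might build the request arguments that the script reads from splash.args. The proxy list, the `build_payload` helper, and the wrapping `main` function are assumptions for illustration; the thread shows only `set_proxy` itself, not how it is called.

```python
import base64
import itertools
import json

# Hypothetical proxy pool; hosts and credentials are placeholders.
PROXIES = [
    {"host": "proxy1.example.com", "port": 8080, "user": "u1", "password": "p1"},
    {"host": "proxy2.example.com", "port": 8080, "user": "u2", "password": "p2"},
]

# Assumed wrapper script: the callback body mirrors set_proxy from the thread,
# the surrounding main function is a guess at how it is used.
LUA_SOURCE = """
function main(splash)
  splash:on_request(function(request)
    request:set_proxy{
      host=splash.args.proxy_host,
      port=splash.args.proxy_port,
      username=splash.args.proxy_user,
      password=splash.args.proxy_pass,
    }
    request:set_header('Proxy-Authorization', splash.args.proxy_auth)
  end)
  assert(splash:go(splash.args.url))
  return splash:html()
end
"""


def build_payload(url, proxy):
    """Build the JSON body for one request, using one proxy from the pool."""
    credentials = "{}:{}".format(proxy["user"], proxy["password"])
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return {
        "lua_source": LUA_SOURCE,
        "url": url,
        "proxy_host": proxy["host"],
        "proxy_port": proxy["port"],
        "proxy_user": proxy["user"],
        "proxy_pass": proxy["password"],
        "proxy_auth": "Basic " + token,
    }


# Rotate through the pool, one proxy per request.
proxy_cycle = itertools.cycle(PROXIES)
payloads = [build_payload("https://example.com", next(proxy_cycle)) for _ in range(3)]
print(json.dumps(payloads[0], indent=2)[:200])
```

Each payload would then be POSTed as JSON to the execute endpoint of the running container (http://localhost:8050/execute with the docker command above).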
I also rotate the User-Agent HTTP header using the following Lua script:
function set_user_agent(splash)
  splash:on_request(function(request)
    request:set_header('User-Agent', splash.args.user_agent)
  end)
  return 1
end
I understand that these proxy errors are expected, because some of my proxies can expire. But I suspect they could be related to the main error.
After ~15-20 minutes I start to receive 504 responses, although some requests still work. A couple of minutes later I receive only 504 errors. I also tried running the Docker container with the command docker run -p 8050:8050 scrapinghub/splash --max-timeout 3600
as described in the manual. But the issue still happens.
When the 504 errors occur, I only rarely see the error from the issue title in the log.
Here are some errors from the log that I can reproduce now:
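One thing that may be worth ruling out: as far as I understand the Splash documentation, --max-timeout only raises the upper limit, and each request still uses its own timeout argument (30 seconds by default) unless one is passed explicitly. A small sketch of adding it to the request arguments follows; `with_timeout` and `MAX_TIMEOUT` are hypothetical names, and this may well not be the cause of the 504 responses here.

```python
# Value passed to --max-timeout in the docker command above.
MAX_TIMEOUT = 3600


def with_timeout(args, timeout, max_timeout=MAX_TIMEOUT):
    """Return a copy of the request args with an explicit `timeout` in seconds."""
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    if timeout > max_timeout:
        # A per-request timeout above the server limit would not be accepted.
        raise ValueError("timeout exceeds --max-timeout")
    updated = dict(args)
    updated["timeout"] = timeout
    return updated


request_args = with_timeout({"url": "https://example.com"}, 300)
print(request_args)
```

If the 504 responses continue even with a generous per-request timeout, that would point toward the hang described earlier in the thread rather than slow pages.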
Unhandled error in Deferred:
Unhandled Error
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/twisted/internet/defer.py", line 588, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/app/splash/pool.py", line 47, in _start_render
pool_d.addBoth(self._close_render, render, slot)
File "/usr/local/lib/python3.5/dist-packages/twisted/internet/defer.py", line 340, in addBoth
callbackKeywords=kw, errbackKeywords=kw)
File "/usr/local/lib/python3.5/dist-packages/twisted/internet/defer.py", line 306, in addCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "/usr/local/lib/python3.5/dist-packages/twisted/internet/defer.py", line 588, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/app/splash/pool.py", line 69, in _close_render
render.close()
File "/app/splash/qtrender_lua.py", line 2462, in close
self.splash.clear()
builtins.AttributeError: 'LuaRender' object has no attribute 'splash'
error: result is already returned
019-07-10 18:15:43.497252 [network-manager] Traceback (most recent call last):
File "/app/splash/network_manager.py", line 453, in createRequest
request = middleware.process(request, render_options, operation, outgoingData)
File "/app/splash/request_middleware.py", line 26, in process
allowed_domains = render_options.get_allowed_domains()
File "/app/splash/render_options.py", line 334, in get_allowed_domains
allowed_domains = self.get("allowed_domains", default=None)
File "/app/splash/render_options.py", line 87, in get
value = self.data.get(name)
AttributeError: 'RenderOptions' object has no attribute 'data'
internal error in _createRequest middleware
Traceback (most recent call last):
File "/app/splash/network_manager.py", line 112, in createRequest
return self._createRequest(operation, request, outgoingData=outgoingData)
File "/app/splash/network_manager.py", line 142, in _createRequest
self._handle_custom_proxies(request)
File "/app/splash/network_manager.py", line 224, in _handle_custom_proxies
proxy = splash_proxy_factory.queryProxy(proxy_query)[0]
File "/app/splash/proxy.py", line 38, in queryProxy
if self.should_use_proxy_list(protocol, url):
File "/app/splash/proxy.py", line 43, in should_use_proxy_list
if not self.proxy_list:
AttributeError: 'ProfilesSplashProxyFactory' object has no attribute 'proxy_list'
Traceback (most recent call last):
File "/app/splash/network_manager.py", line 305, in _on_reply_error
self._response_bodies.pop(self._get_request_id(), None)
AttributeError: 'SplashQNetworkAccessManager' object has no attribute '_response_bodies'
QNetworkReplyImplPrivate::error: Internal problem, this method must only be called once.