scrapinghub / splash

Lightweight, scriptable browser as a service with an HTTP API
BSD 3-Clause "New" or "Revised" License

Splash freezes with "Timing out client: IPv4Address" #1051

Open demydovb opened 4 years ago

demydovb commented 4 years ago

I am running scrapy-splash for scraping data from one website.

At random intervals, Splash freezes with the following logs:

splash-service_1        | 2020-07-16 08:49:35.119333 [-] "172.31.0.4" - - [16/Jul/2020:08:49:34 +0000] "POST /execute HTTP/1.1" 200 266018 "-" "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36"
splash-service_1        | 2020-07-16 08:50:10.012973 [-] Timing out client: IPv4Address(type='TCP', host='172.31.0.4', port=51970)
splash-service_1        | 2020-07-16 08:50:10.858080 [-] Timing out client: IPv4Address(type='TCP', host='172.31.0.4', port=51978)
splash-service_1        | 2020-07-16 08:50:16.873014 [-] Timing out client: IPv4Address(type='TCP', host='172.31.0.4', port=51974)
splash-service_1        | 2020-07-16 08:50:17.547947 [-] Timing out client: IPv4Address(type='TCP', host='172.31.0.4', port=51966)
splash-service_1        | 2020-07-16 08:50:18.037436 [-] Timing out client: IPv4Address(type='TCP', host='172.31.0.4', port=51976)
splash-service_1        | 2020-07-16 08:50:29.064655 [-] Timing out client: IPv4Address(type='TCP', host='172.31.0.4', port=51932)
splash-service_1        | 2020-07-16 08:50:35.119997 [-] Timing out client: IPv4Address(type='TCP', host='172.31.0.4', port=51968)

How can I find out the reason for this? Why might it get stuck?

P.S. I run it with args={"lua_source": self.lua_script_navigate, "timeout": 60000}
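
A side note on the `timeout` value in the P.S.: as far as I understand the Splash docs, `timeout` is given in seconds and the server rejects values above its `max-timeout` setting (the startup log quoted later in this thread reports `max-timeout=90.0`). I am not claiming this is what causes the freeze. Below is a minimal sketch of building the args with a value inside that limit. The helper name `build_splash_args` and its parameters are hypothetical, not part of scrapy-splash.

```python
def build_splash_args(lua_source, timeout_s=60.0, max_timeout_s=90.0):
    """Build the args dict for a Splash /execute request.

    Assumptions (not confirmed in this thread):
    - timeout_s is in seconds, as the Splash docs describe `timeout`
    - max_timeout_s mirrors the server's max-timeout setting (90.0 here)
    """
    if timeout_s <= 0:
        raise ValueError("timeout must be positive")
    if timeout_s > max_timeout_s:
        raise ValueError(
            f"timeout {timeout_s}s exceeds the assumed max-timeout {max_timeout_s}s"
        )
    return {"lua_source": lua_source, "timeout": timeout_s}


# Placeholder Lua script, only for illustration.
args = build_splash_args("function main(splash) return 'ok' end", timeout_s=60)
print(args["timeout"])
```

The resulting dict can be passed wherever `args=` was passed before. If a longer render time is really needed, the server-side limit would have to be raised as well.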

shivvu commented 3 years ago

@kmike @lopuhin, I would greatly appreciate it if you could have a look at this issue. Earlier everything was running fine.

Splash Freezes

Here is what I am doing.

I run this command in the PyCharm terminal:

docker run -p 8050:8050 scrapinghub/splash

This is what the logs look like:

2021-01-02 17:34:04+0000 [-] Log opened.
2021-01-02 17:34:04.936818 [-] Xvfb is started: ['Xvfb', ':1699939641', '-screen', '0', '1024x768x24', '-nolisten', 'tcp']
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-splash'
2021-01-02 17:34:05.082410 [-] Splash version: 3.5
2021-01-02 17:34:05.145423 [-] Qt 5.14.1, PyQt 5.14.2, WebKit 602.1, Chromium 77.0.3865.129, sip 4.19.22, Twisted 19.7.0, Lua 5.2
2021-01-02 17:34:05.145731 [-] Python 3.6.9 (default, Jul 17 2020, 12:50:27) [GCC 8.4.0]
2021-01-02 17:34:05.145880 [-] Open files limit: 1048576
2021-01-02 17:34:05.146005 [-] Can't bump open files limit
2021-01-02 17:34:05.173122 [-] proxy profiles support is enabled, proxy profiles path: /etc/splash/proxy-profiles
2021-01-02 17:34:05.173369 [-] memory cache: enabled, private mode: enabled, js cross-domain access: disabled
2021-01-02 17:34:05.328431 [-] verbosity=1, slots=20, argument_cache_max_entries=500, max-timeout=90.0
2021-01-02 17:34:05.328837 [-] Web UI: enabled, Lua: enabled (sandbox: enabled), Webkit: enabled, Chromium: enabled
2021-01-02 17:34:05.329278 [-] Site starting on 8050
2021-01-02 17:34:05.329561 [-] Starting factory <twisted.web.server.Site object at 0x7fc45956d5c0>
2021-01-02 17:34:05.329998 [-] Server listening on http://0.0.0.0:8050

It freezes right here. Once I click on the local address, Splash opens in a Chrome tab and then generates the logs below:

2021-01-02 17:34:06.614506 [-] "172.17.0.1" - - [02/Jan/2021:17:34:06 +0000] "GET / HTTP/1.1" 200 7675 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
2021-01-02 17:35:06.612466 [-] Timing out client: IPv4Address(type='TCP', host='172.17.0.1', port=62420)
2021-01-02 17:35:06.615382 [-] Timing out client: IPv4Address(type='TCP', host='172.17.0.1', port=62422)

Then it freezes again.

Can you suggest a solution?
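
One observation on the log excerpt above: the GET is logged at 17:34:06 and both "Timing out client" lines at 17:35:06, about 60 seconds later. That looks consistent with the HTTP server closing idle keep-alive connections rather than with a rendering job hanging, but this is only a guess: the 60-second idle timeout is my assumption about Twisted's web server default, not something stated in this thread. Here is a small sketch for measuring those gaps in any pasted log. The function names are made up for illustration.

```python
import re
from datetime import datetime

# Leading Twisted-style timestamp, optionally preceded by a docker-compose prefix.
LINE_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \[-\] (.*)$")


def parse(line):
    """Return (timestamp, message) for a Splash log line, or None if it does not match."""
    m = LINE_RE.search(line.strip())
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f"), m.group(2)


def idle_gaps(lines):
    """Seconds between the latest access-log entry and each 'Timing out client' line."""
    last_request = None
    gaps = []
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        ts, msg = parsed
        if msg.startswith("Timing out client:"):
            if last_request is not None:
                gaps.append((ts - last_request).total_seconds())
        elif msg.startswith('"'):  # access-log entries start with the quoted client IP
            last_request = ts
    return gaps


# Lines copied from the excerpt above (User-Agent shortened).
sample = [
    '2021-01-02 17:34:06.614506 [-] "172.17.0.1" - - [02/Jan/2021:17:34:06 +0000] "GET / HTTP/1.1" 200 7675 "-" "Mozilla/5.0"',
    "2021-01-02 17:35:06.612466 [-] Timing out client: IPv4Address(type='TCP', host='172.17.0.1', port=62420)",
    "2021-01-02 17:35:06.615382 [-] Timing out client: IPv4Address(type='TCP', host='172.17.0.1', port=62422)",
]
gaps = idle_gaps(sample)
print([round(g, 1) for g in gaps])  # prints [60.0, 60.0]
```

If the gaps in a real log cluster around a fixed value like this, the messages may simply mark idle connections being closed, and the actual freeze would need to be looked for elsewhere (for example, in requests that never produce an access-log line).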

shivvu commented 3 years ago

Hi @demydovb, I have found the solution to this problem. Please have a look at my problem and let me know if you have the same issue. I would be happy to help you!

ankit616e6b6974 commented 2 years ago

@shivvu I am facing the same issue. It would be a great help if you could share your solution.

bcastane commented 9 months ago

@ankit616e6b6974 and @shivvu, I am facing the same issue and I haven't found a solution online. If you solved it, I would appreciate it if you could share the answer. Thanks!