manticoresoftware / manticoresearch

Solved - FATAL: failed to create rotation thread #126

Closed sshilko closed 6 years ago

sshilko commented 6 years ago

Are you in the right place?

For general questions, post a question to Forum or Slack

If you open a GitHub issue, it must be a bug or feature request and you need to provide the information below.

Describe the environment

Amazon Linux 2

Manticore Search version: 2.7.3

OS version: Linux ip-172-30-0-217.ec2.internal 4.14.62-70.117.amzn2.x86_64 #1 SMP Fri Aug 10 20:14:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Describe the problem

When using workers = thread_pool I get this error; with workers = threads everything is fine. I can't reproduce it locally, only on AWS EC2 inside Docker. There are no other containers, and no Docker settings make any difference (ports, memory, host network).

Description of the issue:

Steps to reproduce:

Messages from log files: FATAL: failed to create rotation thread
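
For reference, the relevant part of the searchd config (a minimal sketch; the listen port is a placeholder, everything else omitted):

    searchd
    {
        listen  = 9312
        # fails on this box:
        workers = thread_pool
        # works fine:
        # workers = threads
    }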

manticoresearch commented 6 years ago

Hi. Are you using our Docker image and the default config, or something else?

sshilko commented 6 years ago

I'm using your build image for building, except it's alpine-3.7 instead of 3.6; then I take the binaries and move them into my own Alpine image together with my config. Yes, I'm in Slack now if we can chat.

sshilko commented 6 years ago
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: Manticore 2.7.3 ad9b22d0@180928 release
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: Copyright (c) 2001-2016, Andrew Aksyonoff
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: Copyright (c) 2017-2018, Manticore Software LTD (http://manticoresearch.com)
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: using config file '/etc/sphinxsearch/sphinx.conf'...
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: WARNING: TCP fast open unavailable (can't read /proc/sys/net/ipv4/tcp_fastopen, unsuppo
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: listening on all interfaces, port=9312
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: listening on all interfaces, port=9212
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: precaching index 'stream_test_0'
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: Locking the index via file /var/lib/sphinx/stream_test_0.spl
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: lock /var/lib/sphinx/stream_test_0.spl success
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: precaching index 'stream_test_1'
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: Locking the index via file /var/lib/sphinx/stream_test_1.spl
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: lock /var/lib/sphinx/stream_test_1.spl success
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: precaching index 'stream_test_2'
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: crond: crond (busybox 1.27.2) started, log level 6
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: Locking the index via file /var/lib/sphinx/stream_test_2.spl
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: lock /var/lib/sphinx/stream_test_2.spl success
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: precaching index 'stream_test_3'
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: Locking the index via file /var/lib/sphinx/stream_test_3.spl
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: lock /var/lib/sphinx/stream_test_3.spl success
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: precached 4 indexes in 0.007 sec
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: 'binlog_max_log_size' - nothing specified, using default value 268435456
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: 'read_unhinted' - nothing specified, using default value 0
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: DEBUG: 'read_buffer' - nothing specified, using default value 0
Oct 30 15:59:07 ip-172-30-0-217.ec2.internal docker[10879]: FATAL: failed to create rotation thread
sshilko commented 6 years ago

Cured by

vm.overcommit_memory=1 instead of vm.overcommit_memory=2

https://www.kernel.org/doc/Documentation/vm/overcommit-accounting

P.S. Be careful with memory usage: even if it says 90% free, sphinx apparently tries to allocate a lot on start. The error message could be improved, though.
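
A minimal sketch of applying the workaround on the Docker host (run as root; the sysctl.d file name is just an assumption, persist it however your distro does):

    # apply immediately
    sysctl -w vm.overcommit_memory=1
    # persist across reboots (assumed location)
    echo 'vm.overcommit_memory = 1' > /etc/sysctl.d/99-overcommit.conf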

tomatolog commented 6 years ago

What AWS instance type do you use? How much memory does it have? What are your config and index sizes?

sshilko commented 6 years ago

AWS t2.small - VPC

              total        used        free      shared  buff/cache   available
Mem:           1993         218        1304          69         470        1418
Swap:             0           0           0

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
1e738da8XXXX        4.32%               144.6MiB / 1.947GiB   7.25%               0B / 0B             12.5MB / 71.2MB     4107

And this is a beta instance: it uses 144 MB while having 0% real load (not production). The production instance with a million rows uses only 11.29 MiB (docker ps) even under heavy load (4 CPUs, 8 GB RAM, ~50% load).

Changing workers from thread_pool to threads reduces memory usage to previous levels

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
53948e4ca4a8        2.19%               4.727MiB / 1.947GiB   0.24%               0B / 0B             32.8kB / 12.3kB     12
sshilko commented 6 years ago

With workers = thread_pool there are 4105 PIDs; with workers = threads, ~10 PIDs.

sshilko commented 6 years ago

Reproducing the memory issue

With workers=threads and sysctl vm.overcommit_memory=2:

[root@ip-172-30-0-217 ec2-user]# free -m
              total        used        free      shared  buff/cache   available
Mem:           1993         131        1444          69         417        1554

Index files:

/var/lib/sphinx $ ls -lhS
total 248
-rw-r--r--    1 dalek    dalek       2.0K Oct 31 10:06 stream_test_0.sph
-rw-r--r--    1 dalek    dalek       2.0K Oct 31 10:06 stream_test_1.sph
-rw-r--r--    1 dalek    dalek       2.0K Oct 31 10:06 stream_test_2.sph
-rw-r--r--    1 dalek    dalek       2.0K Oct 31 10:06 stream_test_3.sph
-rw-r--r--    1 dalek    dalek       2.0K Oct 31 10:06 stream_test_delta_0.sph
-rw-r--r--    1 dalek    dalek       2.0K Oct 31 10:06 stream_test_delta_1.sph
-rw-r--r--    1 dalek    dalek       2.0K Oct 31 10:06 stream_test_delta_2.sph
-rw-r--r--    1 dalek    dalek       2.0K Oct 31 10:06 stream_test_delta_3.sph
-rw-r--r--    1 dalek    dalek       1.1K Oct 31 10:06 stream_test_0.spa
-rw-r--r--    1 dalek    dalek        968 Oct 31 10:06 stream_test_1.spa
-rw-r--r--    1 dalek    dalek        968 Oct 31 10:06 stream_test_2.spa
-rw-r--r--    1 dalek    dalek        834 Oct 31 10:06 stream_test_3.sps
-rw-r--r--    1 dalek    dalek        792 Oct 31 10:06 stream_test_3.spa
-rw-r--r--    1 dalek    dalek        773 Oct 31 10:06 stream_test_2.sps
-rw-r--r--    1 dalek    dalek        754 Oct 31 10:06 stream_test_0.sps
-rw-r--r--    1 dalek    dalek        720 Oct 31 10:06 stream_test_0.spm
-rw-r--r--    1 dalek    dalek        692 Oct 31 10:06 stream_test_2.spm
-rw-r--r--    1 dalek    dalek        616 Oct 31 10:06 stream_test_delta_2.spa
-rw-r--r--    1 dalek    dalek        588 Oct 31 10:06 stream_test_1.spm

              total        used        free      shared  buff/cache   available
Mem:           1993         129        1446          69         417        1556

Everything started OK

Now changing to workers=thread_pool (still with sysctl vm.overcommit_memory=2).

It fails immediately (while having 1.4 GB free), with the indexes taking only a few megabytes:


              total        used        free      shared  buff/cache   available
Mem:           1993         120        1454          68         418        1565

Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: Manticore 2.7.3 ad9b22d0@180928 release
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: Copyright (c) 2001-2016, Andrew Aksyonoff
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: Copyright (c) 2017-2018, Manticore Software LTD (http://manticoresearch.com)
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: using config file '/etc/sphinxsearch/sphinx.conf'...
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: listening on all interfaces, port=9312
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: listening on all interfaces, port=9212
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: precaching index 'stream_test_0'
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: Locking the index via file /var/lib/sphinx/stream_test_0.spl
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: lock /var/lib/sphinx/stream_test_0.spl success
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: precaching index 'stream_test_1'
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: Locking the index via file /var/lib/sphinx/stream_test_1.spl
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: lock /var/lib/sphinx/stream_test_1.spl success
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: precaching index 'stream_test_2'
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: Locking the index via file /var/lib/sphinx/stream_test_2.spl
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: crond: crond (busybox 1.27.2) started, log level 6
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: lock /var/lib/sphinx/stream_test_2.spl success
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: precaching index 'stream_test_3'
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: Locking the index via file /var/lib/sphinx/stream_test_3.spl
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: lock /var/lib/sphinx/stream_test_3.spl success
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: precached 4 indexes in 0.006 sec
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: 'binlog_max_log_size' - nothing specified, using default value 268435456
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: 'read_unhinted' - nothing specified, using default value 0
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: DEBUG: 'read_buffer' - nothing specified, using default value 0
Oct 31 10:12:00 ip-172-30-0-217.ec2.internal docker[41316]: FATAL: failed to create rotation thread
Oct 31 10:12:01 ip-172-30-0-217.ec2.internal systemd[1]: Failed to start Docker
sshilko commented 6 years ago

There is nothing running inside the container except searchd and cron (for reindexing). I suspect max_children=4096 might be related to the 4107 PIDs running.

tomatolog commented 6 years ago

For thread_pool workers, max_children sets the number of worker threads the daemon creates in the thread pool. You should set a value there that matches your box, or comment out that option so the daemon auto-calculates the value.

sshilko commented 6 years ago

Main docker container (4k PIDs, not connected to directly by clients):

...
    max_children        = 4096
...

    index stream_test
    {
        type = distributed
        local = stream_test_0
        local = stream_test_1
        local = stream_test_2
        local = stream_test_3

    }

Relay docker container (all clients connect via this instance; it's actually on the same 127.0.0.1 as the clients). 172.30.0.217 is the main docker container:

...
    max_children        = 4096
...
    index stream_test
    {
        type = distributed
        agent_connect_timeout = 1500
        agent_query_timeout   = 3000
        agent_persistent = 172.30.0.217:9312|172.30.0.217:9212:stream_test_0
        agent_persistent = 172.30.0.217:9312|172.30.0.217:9212:stream_test_1
        agent_persistent = 172.30.0.217:9312|172.30.0.217:9212:stream_test_2
        agent_persistent = 172.30.0.217:9312|172.30.0.217:9212:stream_test_3

    }
sshilko commented 6 years ago

The PIDs issue is related to max_children.

max_children=4096, docker stats PIDS:

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
89a37248b868        2.19%               138.4MiB / 1.947GiB   6.94%               0B / 0B             0B / 177kB          4105

max_children=1096, docker stats PIDS:

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
34ecf5fac5bf        4.38%               40.46MiB / 1.947GiB   2.03%               0B / 0B             0B / 26.1kB         1107

But when doing ps aux I don't see those PIDs running (neither on the host nor in the container).
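
I guess the PIDS column in docker stats counts kernel tasks (threads as well as processes), which would explain why ps aux shows almost nothing. A quick check inside the container (assuming searchd runs as PID 1 there):

    # thread count of searchd; should roughly match the docker stats PIDS column
    grep Threads /proc/1/status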

sshilko commented 6 years ago

Docker version 17.06.2-ce, build 3dfb8343b139d6342acfd9975d7f1068b5b1c3d3


/ $ ps aux
PID   USER     TIME   COMMAND
    1 dalek      0:03 /usr/bin/searchd --nodetach --index stream_test --index stream_test_0 --index stream_test_1 --index stream_test_2 --index st
   15 root       0:00 crond -l 6 -d 6 -b -L /dev/console
 1292 dalek      0:00 /bin/sh
 1296 dalek      0:00 ps aux
/ $ exit

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
34ecf5fac5bf        0.01%               40.63MiB / 1.947GiB   2.04%               0B / 0B             0B / 165kB          1105
sshilko commented 6 years ago

@tomatolog Not specifying max_children fixes the memory issue on start; the PIDs behavior should maybe be mentioned in the documentation. It probably tries to start all max_children threads on startup (and runs out of memory with vm.overcommit_memory=2).
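
My guess at the mechanics: with vm.overcommit_memory=2 the kernel enforces a hard CommitLimit, and each pool thread's stack counts as committed memory even if it is never touched, so thousands of threads exceed the limit long before free memory runs out. Something like this on the host shows the accounting:

    # Committed_AS vs CommitLimit; allocations fail once Committed_AS would exceed CommitLimit
    grep -E 'CommitLimit|Committed_AS' /proc/meminfo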

sshilko commented 6 years ago

The documentation could be fixed, or maybe there is an issue in the code, in that it tries to create max_children threads on start.

tomatolog commented 6 years ago

With thread_pool workers the daemon always creates N threads for the thread pool; that N value is taken from max_children when it is set, and auto-calculated otherwise.

That is why, when you explicitly set that option, the thread pool gets started with 4096 threads (max_children = 4096), i.e. the daemon creates such a thread pool for you.

That is why I see no issue here: just comment out that option, or set a value that fits your box.
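
For example (just a sketch; the explicit value is arbitrary and should be sized to your box):

    searchd
    {
        workers = thread_pool
        # either omit max_children so the daemon sizes the pool itself,
        # or set something proportional to the hardware:
        max_children = 32
    }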

You might also create a PR with a doc fix for that section https://manticoresearch.gitlab.io/dev/conf_options_reference/searchd_program_configuration_options.html#max-children to make this option clearer.

sshilko commented 6 years ago

Created PR https://github.com/manticoresoftware/manticoresearch/pull/127

Thx, closing