Max limit new Thread in 312 threads, after error not be started

krakjoe / pthreads

Threading for PHP - Share Nothing, Do Everything :)


Max limit new Thread in 312 threads, after error not be started #100

Closed hqvnet closed 11 years ago

hqvnet commented 11 years ago

I have a problem: the maximum number of new threads I can start is 312 on Linux and 184 on Windows. After that I get this error and no further threads can be started:

Warning: pthreads has detected that the socket_Thread could not be started, the system lacks the necessary resources or the system-imposed limit would be exceeded.

The maximum memory usage is 184 KB.

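while ($x < 350) {
    $thread[$x] = new socket_Thread($this->socket, $x);
    echo "\n Iniciando Threads\n";            // "Starting threads"
    $thread[$x]->start();
    echo "\n Startando Threads " . $x . "\n"; // "Started thread <n>"
    $cont = $thread[$x]->contador;            // read the thread's counter ("contador")
    $x++;
}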

chibisuke commented 11 years ago

You're not supposed to be running that many threads at any given time anyway. The problem is that if you run many more threads than you have CPU cores, you end up blocking other threads and incurring more threading overhead than you would have if you ran the work sequentially.

Depending on what you're trying to achieve, you might even run into problems such as lock convoys (Wikipedia).

As for the problem itself, I think you're pretty much running out of address space. While a process doesn't necessarily consume the space that is allocated for stack, heap, etc., it is still reserved in the address space of the application. Let's assume a default stack size of 8 MB (for most Linux implementations; I don't know about Windows): you're going to hit that limit quite quickly on x86.
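The address-space argument can be checked with some quick arithmetic. The following is only a rough sketch: the 8 MB stack size is the figure assumed in this comment, the 3 GiB of user address space is an assumption (typical for a 32-bit Linux process, not stated in the thread), and `maxThreadsByAddressSpace` is a hypothetical helper name.

```php
<?php
// Rough estimate of how many thread stacks fit into a process's address space.
// Assumptions (not confirmed by the thread): about 3 GiB (3072 MiB) of user
// address space, and 8 MiB reserved per thread stack.
function maxThreadsByAddressSpace(int $addressSpaceMiB, int $stackMiB): int
{
    return intdiv($addressSpaceMiB, $stackMiB);
}

$stackMiB = 8;
$reserved = 312 * $stackMiB;                           // stack reserved by 312 threads
$ceiling  = maxThreadsByAddressSpace(3072, $stackMiB); // upper bound from stacks alone

echo "312 threads reserve {$reserved} MiB of stack\n";
echo "Upper bound with 3072 MiB of address space: {$ceiling} threads\n";
```

Under these assumptions the stacks alone cap the process at 384 threads, and the heap, code and shared libraries also need address space, so a failure somewhere around 312 threads is plausible.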

I'd strongly advise you to reduce your total number of threads to about NUM_CPUCORES+2 (x2 if you have hyperthreading enabled), because in my experience, and in the experience of other people, that is what yields the best results when it comes to concurrency. That way you also avoid problems like Thundering Herd situations.

You might want to have a look into the Worker and Stackable examples.
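The sizing rule of thumb above can be written down as a small helper. This is a sketch, not part of pthreads: `threadBudget` is a hypothetical name, and it reads the rule as "double the core count when hyperthreading is on, then add 2", which is one possible interpretation of the comment.

```php
<?php
// Rule-of-thumb thread budget: NUM_CPUCORES + 2, with the core count doubled
// when hyperthreading is enabled. Doubling before adding 2 is an assumption
// about how the rule is meant to be read.
function threadBudget(int $physicalCores, bool $hyperthreading): int
{
    $logicalCores = $hyperthreading ? $physicalCores * 2 : $physicalCores;
    return $logicalCores + 2;
}

echo threadBudget(4, false), "\n"; // 4 cores, no hyperthreading
echo threadBudget(4, true), "\n";  // 4 cores with hyperthreading
```

The core count is passed in explicitly because PHP has no built-in function that reports it.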

krakjoe commented 11 years ago

This belongs firmly in the category of "you can shoot yourself in the foot, if you like" ....

@chibisuke is absolutely correct, you are chasing an unrealistic pattern of execution that could only be detrimental ... let's not forget that your operating system and its threads do share your hardware; to ask that much of it is unrealistic, to say the least ...

As mentioned, look at workers and stackables. There is nothing I can, or would, do to raise or in any way circumvent reasonable system-imposed limitations.

hqvnet commented 11 years ago

I'm developing a web cache server. Each new connection runs a new thread that accepts the connection and searches for the information in the cache. I estimate an average of 30,000 to 50,000 simultaneous connections. What is the correct way to do this?

chibisuke commented 11 years ago

The short answer: PHP is not capable of handling that amount of connections - neither with pthreads nor without.

The long answer: PHP's only supported event management mechanism is select(2). select can only handle a limited amount of connections and gets increasingly slowed down by a rising number of sockets. (I think it was O(n^2) if I'm not mistaken).

You'd need to use a current event triggering mechanism like epoll or kqueue, which php-sockets doesn't support (yet). So theoretically speaking you'd need to extend the php-sockets component, or develop your own PHP extension, and then use epoll/kqueue to process the data using non-blocking sockets. pthreads can then be used to assist with CPU performance by splitting the processing across multiple cores. For this you'd spawn NUM_CPUCORES(+1) workers and feed them with stackables.

For details on the problems that arise from having multiples of 10k connections, please refer to http://www.kegel.com/c10k.html

krakjoe commented 11 years ago

In any language I know of, that wouldn't be the ideal mode of execution. It is very expensive for a single connection to cost a whole thread and, as you have found, you have a finite number of threads available.

A more ideal mode of execution, though your scaling may need to be thought through again, would be the worker model. Apache uses a worker model, where a connection does not equate to a thread. You don't want a 1:1 model; you want a worker to be able to handle, say, 100 connections at a time efficiently, using non-blocking sockets/select. I think that's a reasonable limit even for PHP, maybe more; I am not really a socket programmer.

This brings high performance well within the realms of possibility, even in PHP. Let's say you can have 100 workers: suddenly you have gone from being able to accept 180 connections, or however many max threads you can have, to being able to accept 100x100 connections, that's 10,000 connections. Write the software to deploy on multiple machines and, even with relatively simple PHP, what appears impossible might actually be possible ...

I am picking these numbers out of thin air to illustrate a point; as I said, I'm not really a socket programmer, but hopefully you understand the error in your method now. Now, I am not suggesting that having 100 threads, each attempting to deal with 100 connections, will work or is a good idea. I'm just trying to illustrate how better to think about making the most of your hardware, which, remember, has very real limits. I'll be interested to hear how you get on and what solutions you come to ...
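The idea of one worker serving many connections through non-blocking sockets and select can be sketched in plain PHP with `stream_select()`. This is a minimal illustration, not a cache server and not pthreads code: socket pairs stand in for real client connections so it runs without a network, `STREAM_PF_UNIX` assumes a Unix-like system, and the request/response strings are made up for the example.

```php
<?php
// One loop multiplexing several connections with stream_select().
// Socket pairs stand in for client connections (assumes a Unix-like system).
$clients = [];
$serverSide = [];
for ($i = 0; $i < 5; $i++) {
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    if ($pair === false) {
        exit("could not create socket pair\n");
    }
    [$client, $server] = $pair;
    stream_set_blocking($server, false); // the worker never blocks on one socket
    fwrite($client, "request $i\n");     // simulate an incoming request
    $clients[$i] = $client;
    $serverSide[$i] = $server;
}

$handled = [];
while (count($handled) < count($serverSide)) {
    $read = $serverSide; // stream_select() modifies the array it is given
    $write = null;
    $except = null;
    $ready = stream_select($read, $write, $except, 1);
    if ($ready === false || $ready === 0) {
        break; // error or timeout: nothing left to serve
    }
    foreach ($read as $stream) {
        $line = fgets($stream);
        if ($line === false) {
            continue;
        }
        $id = array_search($stream, $serverSide, true);
        $handled[$id] = trim($line);
        fwrite($stream, "cached response for " . trim($line) . "\n");
    }
}
ksort($handled);

echo count($handled), " connections served by a single loop\n";
```

A real worker would obtain its sockets from `stream_socket_accept()` on a listening socket instead of socket pairs, and several such workers could then share the load across cores.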

olekukonko commented 11 years ago

You should not have issues with 50,000 simultaneous connections even with 100 threads, unless you are actually referring to concurrent connections, in which case I see 4 possibilities:

1. Follow @chibisuke's advice, which is a very long route that would eventually still run into a lot of issues when writing a PHP extension for epoll/kqueue.
2. Follow @krakjoe's advice and look at workers and stackables, then scale and load balance your server using HAProxy (http://haproxy.1wt.eu/) and commodity hardware.
3. I really doubt that any web cache developed in PHP can be faster than nginx or Varnish; just use existing tools. memcached can also come in handy.
4. Look at other languages, because higher concurrency can be achieved with even lower CPU and memory usage using Erlang, and some people say they have achieved similar results with node.js.

chibisuke commented 11 years ago

I forgot to mention one more possibility.

You could simply reduce the number of concurrent connections and turn them into a number of sequential connections.

In some operating systems it is possible to set a socket option that allows you to keep most of the HTTP overhead in the kernel instead of in your process list.

For example, FreeBSD offers accf_http, a filter that can be applied to a listening socket. Your application then only gets an "accept()" when the HTTP header is complete. That way you could start processing the request immediately in a worker and close it afterwards. There is no need to keep the connection open before or after processing it, and the listen backlog (which on FreeBSD can be set to -1 (infinite)) is your friend as well.

Of course this is only an option if you're willing to bind your application to an OS like FreeBSD and can accept that it will definitely not work that well (in terms of scalability) on Linux/Windows.