It would be nice to have a ThreadedHttpPluginServer, so the server can scale up when lots of requests come in at the same time.
Threads are the one feature that could make Tufão better than other frameworks, imho. I want to integrate threads support in Tufão, but I want to do it right.
Can you share some design ideas? Then we could iterate until we reach a final proposal.
Here are some notes:
Well, I'm still figuring out how the code works, but here are some initial ideas:
Maybe we can even make it as easy as:
int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    ThreadedHttpPluginServer plugins{"routes.json"};

    AsyncHttpServerRequestRouter router{
        {QRegularExpression{""}, plugins},
        {QRegularExpression{""}, HttpFileServer::handler("public")},
        {QRegularExpression{""}, NotFoundHandler::handler()}
    };

    HttpServer server;
    QObject::connect(&server, &HttpServer::requestReady,
                     &router, &AsyncHttpServerRequestRouter::handleRequest);
    server.listen(QHostAddress::Any, 8080);

    return a.exec();
}
I already thought about a way to let the user decide whether he wants to handle requests threaded or non-threaded. However, if he chooses threads, we may need to do some internal synchronization (for example, session data needs to be readable by multiple threads at the same time, and requests need to be enqueued and dequeued).
If the user wants the best performance, he wants to use threads and he wants only the minimum level of synchronization.
If the idea is to use transparent thread support, it doesn't even make sense to create an interface. Just put everything in HttpServer and add tuneable options in configure/build time.
A pool of already running threads should be the best choice; the user can decide how big the pool is. After a request is finished the thread goes back to sleep. Using QWaitCondition we can control and wake up the threads. I'm not sure yet whether every thread should have its own copy of a request handler or just re-enter the handler if needed. The code of the handler needs to be reentrant either way.
There is no need for a pool. We can use the same design as the Monkey HTTP Daemon project: one thread per core (a user-tunable value), one event loop per thread, multiple handlers per event loop. The user handles the design and synchronization of his own code (using mutexes, multiple copies of data, whatever...).
[code sample]
I like the simplicity, it might be a good start.
Does ThreadedHttpPluginServer imply that only plugins could be threaded?
If the user wants the best performance, he wants to use threads and he wants only the minimum level of synchronization.
Agreed.
If the idea is to use transparent thread support, it doesn't even make sense to create an interface. Just put everything in HttpServer and add tuneable options in configure/build time.
But HttpServer just emits a signal when a new request is available; I think the threading code should be outside of HttpServer and just take the incoming signal and dispatch the request to one of the threads.
There is no need for a pool. We can use the same design of Monkey HTTP Daemon project http://monkey-project.com/. One thread per core (user tunable value), one event loop per thread, multiple handlers per event loop. User handle design and synchronization of his code (using mutexes, multiple copies of data, whatever...).
I disagree on that. Imagine you have 4 cores and the user writes code that pulls data from the database or does some other heavy work where the thread blocks; after only 4 requests like that, the server will simply not handle requests until one of the threads is free again. Also, one core can easily handle more than one thread because of sleep time on system calls and the like.
Think about it, there are only a few reasons why you want code to run on the server and not on the client in JS:
- You want to store data (will block on the SQL call or a system call for file-based data storage)
- You want to restore data (will block on the SQL call or a system call for file-based data storage)
- Heavy work or calculations
I would say every thread has the same handlers as the others; that means every thread can handle all requests, so the dispatcher code does not need to think about that. However, a user-tunable pool size would make both approaches possible:
threadDispatcher->setPoolSize(QThread::idealThreadCount());
These threads are purely for handling lots of incoming requests and waiting for them to be finished without blocking the main loop; the user should not write code that communicates between those threads. If the user needs threads for whatever reason, he needs to start and handle them himself.
[code sample]
I like the simplicity, it might be a good start.
Does |ThreadedHttpPluginServer| imply that only plugins could be threaded?
No, of course not, but at the moment it seems to be the logical choice. Maybe the HttpServerRequestRouter is a better place to put the thread handling code? We need a point where we dispatch the requests to the handlers inside the threads; what would you say is the most logical place in your code?
The user of course has to be aware of using threads, so he can make the right design choices. For example, he has to be aware that the handler function needs to be reentrant and could be called multiple times from different threads (the same requirement as for Java Servlets). That means no member variables should be used inside the AbstractHttpServerRequestHandler subclass (if we use the same instance for all threads). Also, it's not guaranteed that the same thread will be used again for the same session, which means the SessionStorage is the only right place to store data between requests.
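To make the reentrancy requirement concrete, here is a minimal sketch of a handler that keeps no per-request state in member variables. The Request/Response types and the handleRequest signature are simplified stand-ins for illustration, not Tufão's actual interface:

    #include <QHash>
    #include <QMutex>
    #include <QMutexLocker>
    #include <QString>

    // Hypothetical, simplified stand-ins for the real request/response classes;
    // only the threading pattern matters here.
    struct Request  { QString path; QString sessionId; };
    struct Response { void write(const QString &) { /* send body */ } };

    class CounterHandler
    {
    public:
        // Reentrant: everything request-specific lives on the stack. The only
        // shared member is explicitly protected by a mutex; per-session data
        // would normally go through the session store instead.
        bool handleRequest(const Request &req, Response &res)
        {
            int hits = 0;
            {
                QMutexLocker locker(&m_mutex);
                hits = ++m_hitsPerSession[req.sessionId];
            }
            res.write(QString("You visited %1 %2 times").arg(req.path).arg(hits));
            return true; // request handled
        }

    private:
        QMutex m_mutex;
        QHash<QString, int> m_hitsPerSession;
    };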
I see this as a Producer-Consumer problem!
And I think to solve it we need: leave HttpServer as it is. When the signal is emitted, another class adds the request to a queue and increments the semaphore (the producer); and each thread, when free, tries to acquire one request to handle (the consumers).
And another class will handle how many threads are kept alive:
- creating one when the queue gets long;
- destroying one when it takes too much time to be used (idle); this needs to be intelligent so it doesn't destroy and create threads too often. Or create X threads forever, but that can imply more resources needed, like database connections, so I prefer the intelligent creation.
https://en.wikipedia.org/wiki/Producer–consumer_problem
I see this as a Producer-Consumer problem!
And I think to solve it we need: leave HttpServer as it is. When the signal is emitted, another class adds the request to a queue and increments the semaphore (the producer); and each thread, when free, tries to acquire one request to handle (the consumers).
Yes, exactly. Sadly QSemaphore does not work quite that way; you tell it beforehand how many resources are available:
    QSemaphore sem(5);      // sem.available() == 5
    sem.acquire(3);         // sem.available() == 2
    sem.acquire(2);         // sem.available() == 0
    sem.release(5);         // sem.available() == 5
    sem.release(5);         // sem.available() == 10
    sem.tryAcquire(1);      // sem.available() == 9, returns true
    sem.tryAcquire(250);    // sem.available() == 9, returns false
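For reference, a bounded producer/consumer queue can still be built on QSemaphore by using two counters, one for free slots and one for queued items, with release() acting as the increment. A minimal sketch with illustrative names (the Request type is just a placeholder):

    #include <QMutex>
    #include <QMutexLocker>
    #include <QQueue>
    #include <QSemaphore>

    struct Request;                   // opaque placeholder

    const int MaxPending = 128;

    QSemaphore freeSlots(MaxPending); // how many requests may still be queued
    QSemaphore queuedItems;           // starts at 0, counts requests waiting for a worker
    QQueue<Request *> pending;
    QMutex pendingMutex;

    // Producer side (e.g. called when the server signals a new request)
    void enqueue(Request *req)
    {
        freeSlots.acquire();                  // block if the queue is full
        { QMutexLocker l(&pendingMutex); pending.enqueue(req); }
        queuedItems.release();                // "increment": one more item to consume
    }

    // Consumer side (run by each worker thread)
    Request *dequeue()
    {
        queuedItems.acquire();                // block until something is queued
        Request *req = nullptr;
        { QMutexLocker l(&pendingMutex); req = pending.dequeue(); }
        freeSlots.release();
        return req;
    }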
But Producer-Consumer is the right way to go; I would do it like this (pseudocode):
Somewhere in the Main Loop:
    Request *req = httpserver->nextRequest();

    mutex.lock();
    pendingRequests.append(req);
    int requests = pendingRequests.size();
    while (requests > 0) {
        waitCond.wakeOne();
        requests--;
    }
    mutex.unlock();
In the thread's main function:

    forever {
        mutex.lock();
        // guard against spurious wakeups or an already-drained queue
        while (pendingRequests.isEmpty())
            waitCond.wait(&mutex);
        Request *req = pendingRequests.takeFirst();
        mutex.unlock();

        // handle the request
        ....

        // tell the dispatcher we are done
        qApp->postEvent(dispatcher, new RequestFinishedEvent(req));
    }
And another class will handle how many threads are kept alive:
- creating one when the queue gets long;
- destroying one when it takes too much time to be used (idle); this needs to be intelligent so it doesn't destroy and create threads too often. Or create X threads forever, but that can imply more resources needed, like database connections, so I prefer the intelligent creation.
Releasing and spawning threads can be expensive, but if you can come up with an intelligent algorithm, why not. It needs to spawn threads under high load but not free them until the high load has been gone for some time. But imho we can implement this once the basic threading code works; first we should get the basic things working.
I also thought about things like database connections; creating them can really be expensive too.
I disagree on that. Imagine you have 4 cores and the user writes code that pulls data from the database or does some other heavy work where the thread blocks; after only 4 requests like that, the server will simply not handle requests until one of the threads is free again. Also, one core can easily handle more than one thread because of sleep time on system calls and the like.
Think about it, there are only a few reasons why you want code to run on the server and not on the client in JS:
- You want to store data (will block on the SQL call or a system call for file-based data storage)
- You want to restore data (will block on the SQL call or a system call for file-based data storage)
- Heavy work or calculations
I would say every thread has the same handlers as the others; that means every thread can handle all requests, so the dispatcher code does not need to think about that.
Well, database connections are usually sockets and can be made async, but I see your point. Qt doesn't have strong support for async operations. Tufao::HttpFileServer itself blocks on system calls. Well, at least some options were discarded and we are getting close to the right design for Tufão.
These threads are purely for handling lots of incoming requests and waiting for them to be finished without blocking the main loop; the user should not write code that communicates between those threads. If the user needs threads for whatever reason, he needs to start and handle them himself.
There is no need for code to wait for incoming requests to be finished. The handlers receive the {input/request, output/response} pair and the work doesn't need to come back to some main class or anything alike.
Does |ThreadedHttpPluginServer| imply that only plugins could be threaded?
No, of course not, but at the moment it seems to be the logical choice. Maybe the HttpServerRequestRouter is a better place to put the thread handling code?
This change would propagate thread support to everything Tufão offers. It's a good option. I need to think more about it.
We need a point where we dispatch the requests to the handlers inside the threads; what would you say is the most logical place in your code?
If we dispatch the requests, we'll have a lot of problems moving objects between threads and transferring ownership and responsibility among Tufão objects. If we don't intend to transfer ownership, then we'd need to give resources back to a central place or something alike.
HttpServer has the problem of being intrusive. You call listen and it gets the full port to itself. There is no way to pass socket descriptors to it. Maybe it'd be better to create a middleware-like worker that accepts socket descriptors and provides the HttpServer-like interface. Or maybe change HttpServer to accept socket descriptors and just ignore the listen method. The socket descriptors idea is meant to kill the need for complex communication.
I'll need some time to come up with a design.
Some places to look:
- Tufão's threads example: https://github.com/vinipsmaker/tufao/tree/master/examples/threads-example
- Multithreading in Qt presentation, by David Faure: https://qtconference.kdab.com/node/23#Debugging_multithreaded
The user of course has to be aware of using threads, so he can make the right design choices. For example, he has to be aware that the handler function needs to be reentrant and could be called multiple times from different threads (the same requirement as for Java Servlets). That means no member variables should be used inside the AbstractHttpServerRequestHandler subclass (if we use the same instance for all threads).
Maybe we can use a factory of handlers, then each thread would have its own handlers.
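A rough sketch of what such a factory could look like; the dispatcher type and all names are illustrative, not an existing Tufão API. Each worker thread would invoke the factory once at startup, so the handlers it gets back are created with the right thread affinity from the start:

    #include <QObject>
    #include <functional>

    // Stand-in for whatever base class the handlers end up using; illustrative only.
    class AbstractHandler : public QObject { };

    // The factory is just a callable the dispatcher stores and invokes once per
    // worker thread, so every thread gets its own, thread-local handler tree.
    using HandlerFactory = std::function<AbstractHandler *()>;

    class ThreadedDispatcher
    {
    public:
        void setHandlerFactory(HandlerFactory factory) { m_factory = std::move(factory); }

        // Called from inside each worker thread when it starts. Because the
        // handler is constructed *in* that thread, its QObject thread affinity
        // is already correct and no moveToThread is needed for it.
        AbstractHandler *createHandlerForCurrentThread() const { return m_factory(); }

    private:
        HandlerFactory m_factory;
    };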
Also, it's not guaranteed that the same thread will be used again for the same session, which means the SessionStorage is the only right place to store data between requests.
This is a good concern and we should put it in the documentation when we are done with the implementation.
Well, database connections are usually sockets and can be made async, but I see your point. Qt doesn't have strong support for async operations. Tufao::HttpFileServer itself blocks on system calls. Well, at least some options were discarded and we are getting close to the right design for Tufão.
Yeah, sadly Qt lacks async database support. Anyway, heavy computation would still be an issue.
These threads are purely for handling lots of incoming requests and waiting for them to be finished without blocking the main loop; the user should not write code that communicates between those threads. If the user needs threads for whatever reason, he needs to start and handle them himself.
There is no need for code to wait for incoming requests to be finished. The handlers receive the {input/request, output/response} pair and the work doesn't need to come back to some main class or anything alike.
True, but what about the chain of handlers? It needs to know whether a handler really handled the request or not.
Does |ThreadedHttpPluginServer| imply that only plugins could be threaded? No, of course not but atm it seems to be the logical choice. Maybe the HttpServerRequestRouter is a better choice to put the thread handling code?
This change would propagate threads support to everything Tufão offers. It's a good option. I need to think more about it.
We need a point where we dispatch the requests to the handlers inside the threads, what would you say is the most logical place in your code?
If we dispatch the requests, we'll have a lot of problems to move objects between threads and transfer of ownership and responsibility among Tufão objects. If we don't intend to transfer ownership, then we'd need to give resources back to a central place or alike.
Or we just don't pass objects, but give the user a flat object that can contain everything we need:
setBody(); setHeader();
but with no access to the sockets, and then just write all the data to the socket in the main thread. But I think calling moveToThread for our objects may be the better way to go. The question is how many QObjects are in a Request object.
The user of course has to be aware of using threads, so he can make the right design choices. For example, he has to be aware that the handler function needs to be reentrant and could be called multiple times from different threads (the same requirement as for Java Servlets). That means no member variables should be used inside the AbstractHttpServerRequestHandler subclass (if we use the same instance for all threads).
Maybe we can use a factory of handlers, then each thread would have its own handlers.
Yeah, good idea! That makes it much easier for the user. You can then store request-local data inside the handler and don't need to worry whether another thread uses it. But I think for a plugin-based handler that means when plugins change, we have to stop all threads to reload them.
Also that its not guaranteed that a thread will be called again for the same session, which means only the SessionStorage is the right place to store data between requests..
This is a good concern and we should put it in the documentation when we are done with the implementation. Same requirement for Java Servlets btw ;).
I already created a copy of the repository; I will have some time on my hands tomorrow and could start to hack something. At the moment I think we should create a ThreadedRequestRouter. The router spawns threads and every thread creates all request handlers that are registered with the router. Not sure if that is compatible with the plugin-based server, but maybe it is with some small changes.
I'm currently looking into what needs to be changed to make it work. I found some things I just wanted to ask back about:
1. Parent ownership: currently the Request and Response objects are owned by the HttpServer. When we need to move them, a different ownership might be better: the Request itself should have no parent at all (moveToThread does not allow moving an object that has a parent), the Response object should maybe be owned by the Request (it is directly related to the request object), and all internally used QObjects should be owned by the surrounding object, e.g. the socket should be owned by the Request object instead of the HttpServer. That way they will be moved automatically, because all children of a QObject are moved with it. All internally used QObjects should also be pointers (afaik it is mentioned somewhere that QObjects should normally be pointers): the QTimer object, for example, needs to be owned by the parent to get moved properly, and when the parent goes out of scope it will try to delete all its children, which will crash if a child is not a heap-allocated pointer.
2. The plugin loader does not call unload() explicitly; according to the docs, the destructor of QPluginLoader does not automatically unload the lib.
3. I'm not sure yet whether we can have an event loop inside the threads; if we want to use QWaitCondition to handle them, it's maybe not possible. The question is whether we really need an event loop. Maybe we can enter and exit the event loop when we start or finish the thread; I have to think more about that.
4. Copy by ref: why do you use pass-by-reference for the request objects? Ok, we can get the pointer from them using & to move them to the thread, and then make them references again. Is it just for usability, or are there other reasons (just curious)?
5. The Upgrade handler is problematic: if the client sends an upgrade after the Request was moved to the thread, we have a problem, because the upgrade code runs in the main thread but the Request belongs to the worker thread... Maybe disconnecting the upgrade signal and handling it inside the worker thread could work? Maybe by passing the current upgrade handler with the request to the worker thread? Or implement a thread-safe getCurrentUpgradeHandler() function.
I don't have much time available now, but I want to contribute, currently with ideas and brainstorming.
Just for reference, I have my own HTTP server (https://gitorious.org/qthttpserver), based on Stefan Frings' work (http://stefanfrings.de/qtwebapp/index-en.html).
He used a different approach to multi-thread support:
The HTTP server receives a socket descriptor and hands it to a thread pooler, which passes it to an idle thread or creates a new one.
Maybe taking a look will help decide the best design for Tufão.
In fact, I did a benchmark between both codebases and Tufão was the winner, so I want to use Tufão, but for that multi-threading is a requirement.
Anyway, I'm here to contribute to Tufão!
@bzeller:
1. Good remark. The important characteristic to keep is: no memory leaks, and memory management should be easy for the user.
2. Yeah, sounds like a bug. I'll fix it later. Thanks for the report.
3. @israelins85 has more knowledge about event loops. He did a wizard job making an easy body parser function. I don't remember where the code is.
4. C++ cool style. I was using too much Qt style in Tufão 0.x, but Qt is old and not always right. Passing pointers when bare references are enough is more confusing. A reference carries an explicit intent (I'm giving you a reference, but I'm not transferring ownership or giving guarantees that you can use it later or whatever).
Maybe disconnecting the upgrade signal and handling it inside the worker thread could work?
Handle in thread A and call the handler from thread B? It's okay if the user is aware (documentation), but I think it should be consistent/uniform with whatever we choose to handle requests.
Maybe with passing the current upgrade handler with the request to the worker thread?
Almost the same as above. It's safer/easier.
Or implement a thread safe getCurrentUpgradeHandler() function.
I think this solution is complex and we should avoid it unless needed. Maybe I'm wrong and I'm just not seeing how it would be done.
@israelins85:
I don't have much time available now, but I want to contribute, currently with ideas and brainstorming.
Same as me. ^^
He used a different approach to multi-thread support:
The HTTP server receives a socket descriptor and hands it to a thread pooler, which passes it to an idle thread or creates a new one.
Maybe taking a look will help decide the best design for Tufão.
It may be a good idea to look at this code for inspiration.
In fact, I did a benchmark between both codebases and Tufão was the winner, so I want to use Tufão, but for that multi-threading is a requirement.
It's difficult to run an accurate, objective and useful benchmark. I was delaying this task until I had all the features I wanted.
But it's kind of interesting to have this information now. I myself liked the design of this other project, and the major problem was related to licensing.
Ok guys, I gave it some thought and played around with the code a bit. I may have a solution and want to know what you guys think about it.
I think the main thread should just accept the connection and dispatch it; everything else should be done by the thread, like @israelins85 suggested. But that needs some refactoring.
First, refactor the HTTP handling code out of HttpServer and HttpsServer and put it inside an HttpConnectionHandler and HttpsConnectionHandler. HttpServer and HttpsServer will use those classes internally, so we don't break source compatibility for existing solutions.
The threaded solution would look something like this:

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    auto initializer = []{
        // The connection handler the thread will use to pass the connections to
        HttpConnectionHandler *hdl = new HttpConnectionHandler();

        // Request handlers, created on the heap as children of the HttpConnectionHandler.
        // The thread will clean up the handler and with it everything that is a child of it.
        HttpPluginServer *plugins = new HttpPluginServer({"routes.json"}, hdl);
        HttpServerRequestRouter *router = new HttpServerRequestRouter({
            {QRegularExpression{""}, plugins},
            {QRegularExpression{""}, HttpFileServer::handler("public")},
            {QRegularExpression{""}, NotFoundHandler::handler()}
        }, hdl);

        QObject::connect(hdl, &HttpConnectionHandler::requestReady,
                         router, &HttpServerRequestRouter::handleRequest);
        return hdl;
    };

    // The threaded connection dispatcher: it schedules the threads and forwards the
    // connections to the different threads; it replaces HttpServer for the threaded approach.
    ThreadedHttpConnectionDispatcher dispatch;
    dispatch.setThreadInitializer(initializer);
    dispatch.listen(QHostAddress::Any, 8080);

    return a.exec();
}
I think this is the safest solution, because we don't need to worry about QObject thread affinity and upgrade handlers. Everything will be in the right thread for a request. The thread will run as long as the socket descriptor is open; once it is closed, the thread leaves the event loop and goes back to sleep to pick up the next request.
The threaded solution will look something like that [...]
I liked the interface.
I want to read more about the internal architecture. Will there be a thread pool like you initially proposed? If so, how will it work?
I may have more time to look into this issue on the weekend, and I might be able to propose how HttpsServer could be integrated into this design.
Sorry for the flood, the comment button went crazy. =(
Well, HttpsServer is already integrated into the design; just replace HttpConnectionHandler with HttpsConnectionHandler.
The ConnectionHandlers are like HttpServer and HttpsServer, just without the QTcpServer code. The only difference is they will have a:
AbstractHttpConnectionHandler::handleNewConnection(int socketDesc);
function.
Internally a thread pool will be used for now. Maybe we can introduce an algorithm later that grows and shrinks the pool on demand, like @israelins85 suggested, but for now I want to keep it simple and make it work first.
The ThreadedHttpConnectionDispatcher will accept the connections, handle the thread pool and forward the connection handling to the threads. Nothing more; it stays small and needs only a mutex for controlling the threads (maybe a per-thread mutex is better, I can't decide yet, I have to try it). The worker threads sleep on a QWaitCondition; when a new request is available, the dispatcher puts it in the queue and calls wakeOne() on the QWaitCondition. The thread takes the request out and starts to handle it: it calls handleRequest(), after that call it enters the event loop (which makes async solutions work), and it leaves the loop when the connection is closed. Once it is finished, it goes back to sleep.
I think that solution should even make WebSockets work; the only bad thing is that it will occupy the thread as long as the WebSocket is open. For really long-running connections we could run out of threads, and then we need the growing thread pool feature or a different solution (like giving the user a way to detach the thread from the pool, so he is in control of what happens). On the other hand, the threaded solution is maybe not the first choice for WebSocket-based programming anyway.
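A sketch of the per-connection part of that worker loop, assuming the dispatcher has already handed the socket to this thread; the handler call is left as a comment because its exact interface is still open:

    #include <QAbstractSocket>
    #include <QEventLoop>
    #include <QObject>
    #include <QTcpSocket>

    // Runs inside the worker thread once the dispatcher has handed it a connection.
    void serveConnection(QTcpSocket *socket /*, Handler *handler */)
    {
        // handler->handleNewConnection(socket);  // parse requests, emit requestReady, ...

        if (socket->state() != QAbstractSocket::ConnectedState) {
            socket->deleteLater();                // connection already gone
            return;
        }

        QEventLoop loop;
        // Asynchronous work (timers, further requests on the same keep-alive
        // connection, ...) is processed by this thread-local event loop.
        QObject::connect(socket, &QTcpSocket::disconnected, &loop, &QEventLoop::quit);
        loop.exec();

        socket->deleteLater();
        // After returning, the thread goes back to sleep on its wait condition
        // until the dispatcher hands it the next connection.
    }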
I have some time in the next few days and can maybe hack out a first solution in my own repos and create a pull request.
In my project every socket works exactly that way:
The problem I see with this is: the "keep alive" optimization is lost, because I can't have 200 or 300 threads connecting to the database.
I'd prefer to stay the way it is now: the sockets are maintained in the main thread and encapsulated in a thread-safe way to be passed to the handler threads, and they stay open for a while so the next request is faster.
@israelins85 Do you have a solution for the Upgrade handler? Or can an upgrade only occur BEFORE a request is created? The problem with the upgrade handler is that it's in the main thread, but our requests are handled in the workers. If an upgrade happens somewhere in between, while the request is being handled, that would most likely break. If it's guaranteed that the upgrade handler is only called before HttpServer emits the requestReady signal, we are fine to go.
Also if we push Requests to the threads we need to use moveToThread().
Another problem: with neither approach can you implement something like a server push, where the request is kept open to simulate a socket connection. Same problem for WebSockets, btw. We need a thread-safe way to pause and resume a request when we want to send data to the client.
Design question: one handler instance for ALL threads, or one handler instance PER thread? One for all: less memory consumption, but requires the user to make the handler thread-safe. One per thread: more memory, but it takes a lot of burden off the user to make the handler thread-safe and makes programming easier (less need for mutexes, thread-local data and the like).
Or can an upgrade only occur BEFORE a request is created?
The upgrade interface exported via HttpServer creates an HttpServerRequest, but it only exposes this object when a request is ready OR an upgrade is requested. It's kind of before. See https://github.com/vinipsmaker/tufao/blob/master/src/httpserver.cpp#L100 for more detail.
The problem with the upgrade handler is that it's in the main thread, but our requests are handled in the workers. If an upgrade happens somewhere in between, while the request is being handled, that would most likely break. If it's guaranteed that the upgrade handler is only called before HttpServer emits the requestReady signal, we are fine to go.
requestReady OR upgrade is emitted.
OK, that means upgrade and request handling are totally different things. If an upgrade happens, it's BEFORE we pass the request to the handlers, so the upgrade handler and the threaded request handlers won't step on each other.
And since only requestReady OR upgrade is emitted, that means a Request that is emitted by upgrade will never emit requestReady. Good to know!
Means a Request that is emitted by upgrade will never emit requestReady. Good to know!
One last thing: the event sequence requestReady, requestReady and upgrade for the same connection/HttpServerRequest object is possible. But it's true that after an upgrade, no more requestReady will be emitted.
Let's say we use an approach where we push the Request objects to the threads (not the socket descriptors), wait until end() is emitted, and then push the Request object back to the main thread. That means we can still use the keep-alive mechanism.
However this will change the code a bit:
int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    // The thread initializer now returns a request handler.
    auto initializer = []{
        // Request handlers, created on the heap as children of the returned request handler.
        // The thread will clean up the handler and with it everything that is a child of it.
        HttpPluginServer *plugins = new HttpPluginServer({"routes.json"});
        HttpServerRequestRouter *router = new HttpServerRequestRouter({
            {QRegularExpression{""}, plugins},
            {QRegularExpression{""}, HttpFileServer::handler("public")},
            {QRegularExpression{""}, NotFoundHandler::handler()}
        });
        plugins->setParent(router);
        return router;
    };

    HttpServer server;

    // Just a normal AbstractHttpServerRequestHandler
    ThreadedHttpRequestDispatcher dispatch;
    dispatch.setThreadInitializer(initializer);

    QObject::connect(&server, &HttpServer::requestReady,
                     &dispatch, &ThreadedHttpRequestDispatcher::handleRequest);
    server.listen(QHostAddress::Any, 8080);

    return a.exec();
}
This solution gives the user the possibility to handle some requests with threads and some without (like requests that would open a server push channel).
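One detail worth noting about the push-back step: QObject::moveToThread() can only push an object out of its current thread, it cannot pull it in, so returning the Request to the main thread has to be initiated by the worker itself. A minimal illustration, with the surrounding dispatcher plumbing assumed:

    #include <QCoreApplication>
    #include <QObject>

    // Called in the worker thread when the request has finished.
    void returnToMainThread(QObject *request)
    {
        // Only the thread that currently owns 'request' may call moveToThread()
        // on it, so this has to run in the worker, not in the dispatcher.
        request->moveToThread(qApp->thread());

        // Now notify the dispatcher (living in the main thread) that the object
        // is available again, e.g. via a queued signal or a posted event.
    }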
Do you have an example of an HTTP server push mechanism implemented with Tufão? That works a bit differently, because we have long-running requests. But I think using threads to handle those requests is the wrong approach; these requests just wait until events are available to be sent to the client. Imho the user could have his own thread for these connections and handle them differently.
I guess that is the best way!
Previously you commented on one handler for all threads versus one handler per thread.
Make an assumption: if the user is using Tufão threaded, they know how threads work!
Then one handler is better, to use less memory. A handler is not a class that should have attributes. I see 3 types of attributes in a Tufão handler: 1 - global; 2 - per request; 3 - per thread.
1 - If the user wants global attributes, they have to know how to protect them against simultaneous access from multiple threads; 2 - session data (cookies etc.) has to be handled by a SessionStore class, as I have done in my HttpServer; 3 - the user HAS to know how threads work; if they don't, they will just keep using Tufão as it is now.
To put some fire in this brainstorm: another way is the one used by the PostgreSQL daemon and Apache. They have one process instance per request. I don't know how it works at the lower level, but I guess the listener program receives the request and creates/reuses a process. That way the handler and listener processes are independent: if a handler aborts, just one request is lost.
I thought about that "one handler for all" thing: one big problem comes to mind: thread affinity of QObjects! If the user wants to do asynchronous request handling he is doomed. And lots of our handlers are already QObjects, like HttpServerRequestRouter or HttpPluginServer. Or imagine signals/slots: if the user uses signals on his objects and the QObject lives in another thread, that requires event processing and Qt::QueuedConnection. Also, the user can not have children of the handler object, because of the different thread affinity.
That sounds like a bit too much for the user to think about, so I think handler-per-thread is better. It's not that much more memory, and even on an embedded platform that should be no problem: just use fewer threads in the pool. If we WANT to do it differently, we have to remove the QObject base from the handlers.
Using processes would work too, but there are some problems with that:
- Connection to the client: on Unix you can forward a socket to another process using some tricks. No idea if that works on other platforms too, and it has to be implemented for every platform.
- You need a way to communicate with the new process and have to move all the data to it.
- Every new process costs more memory than a thread, and it's slower to spawn a new process.
- It would maybe require lots of refactoring (that we need to check).
On the other hand, it would also solve some problems for us:
- The HttpPluginServer would work correctly (unloading and loading plugins).
- The user can not access data from other requests (different process, different memory pages).
- If one process crashes, the others still work; the server does not go down.
- Maybe more.
Using processes would work too, but there are some problems with that:
Connection to the client: on Unix you can forward a socket to another process using some tricks. No idea if that works on other platforms too, and it has to be implemented for every platform.
- Maybe we can leave the socket in the main program and just write and read via QProcess to pass the data in and out.
You need a way to communicate with the new process and have to move all the data to it.
- Answered above.
Every new process costs more memory than a thread, and it's slower to spawn a new process.
- That's true; in a previous mail I talked about using less memory with only one handler, but maybe doing this will be better.
It would maybe require lots of refactoring (that we need to check).
- Yes, that's true. The problem with doing this is that we need to keep Tufão working as it is now, for users that don't need this complexity. Maybe we can have 3 ways of working: 1 - one process => one thread; 2 - one process => multiple threads; 3 - multi-process.
It would be amazing to have the 3 ways just by changing the classes used!!!
Connection to the client: on Unix you can forward a socket to another process using some tricks. No idea if that works on other platforms too, and it has to be implemented for every platform.
- Maybe we can leave the socket in the main program and just write and read via QProcess to pass the data in and out.
That would work, yes, but it will put more load on the main loop, which should really just manage the processes.
Every new process costs more memory than a thread, and it's slower to spawn a new process.
- That's true; in a previous mail I talked about using less memory with only one handler, but maybe doing this will be better.
When it comes to crash safety, the current code has the same problem: if one request crashes the server, everything is gone.
It would maybe require lots of refactoring (that we need to check).
- Yes, that's true. The problem with doing this is that we need to keep Tufão working as it is now, for users that don't need this complexity. Maybe we can have 3 ways of working: 1 - one process => one thread; 2 - one process => multiple threads; 3 - multi-process. It would be amazing to have the 3 ways just by changing the classes used!!!
Multi-process can not be done by just changing classes. The user needs to create executables for the handler processes; that's much more complexity than just changing classes. Maybe that could be solved by loading plugins, but it's still completely different from the current situation.
I think multi-process would require a complete server solution, not just a library you link into your application. One would just have to write plugins and configure the server correctly to load them and call them on the right path. I like the safety the process solution could bring to Tufão. But remember that things like shared session classes would not work then, or would also need to be handled across processes.
For multi-process we can provide a template server application ready for the Tufão user, and they would just need to write the handler processes, like plugins, configurable by a conf file.
I know that; I'm just saying the 3 approaches can live together, making this a decision for the Tufão user.
Let's say we use an approach where we push the Request objects to the threads (not the socket descriptors), wait until end() is emitted, and then push the Request object back to the main thread. That means we can still use the keep-alive mechanism.
It sounds like the simplest solution we've had so far.
Do you have an example of an HTTP server push mechanism implemented with Tufão?
What do you mean by an HTTP server push example?
I thought about that "one handler for all" thing:
One big problem comes to mind: thread affinity of QObjects!
[...]
That sounds like a bit too much for the user to think about, so I think handler-per-thread is better. It's not that much more memory, and even on an embedded platform that should be no problem: just use fewer threads in the pool. If we WANT to do it differently, we have to remove the QObject base from the handlers.
Agreed.
Seems like we have a way where we all more or less agree:
-> We push the requests to the thread; when a request is done, it is pushed back to the main thread and can be reused.
-> We provide a factory function that creates the main handler for every thread, as in the example I had before.
-> We want as much compatibility as possible for the thread stuff; it would be nice if the user just has to switch to a factory function and it works.
-> Upgrades are handled in the main thread; if the user wants WebSockets, he can also have his own thread to handle them himself. That's not the job of an HTTP request handler.
Did I forget anything?
The only problem I see with this is plugin loading, because we can only unload a plugin when ALL plugin loader instances have called unload. But first things first: once we get the plain handlers to work, we will find a solution for the plugins too.
I already have some proof-of-concept code here; it needs some refactoring, but maybe I can hack something together in the next few days, if you guys agree with my points.
Did i forget anything?
I don't think so.
I already have some proof of concept code here, it needs some refactoring, but maybe i can hack something in the next few days, if you guys agree with my points.
Everything is fine for me. =)
Agreed too!
Ok guys, I have a first version in my tufao repository; if you have some time you can check it out. There is also a tufao-thread-test repository with some test code you can use.
Sometimes the server stops handling requests; I have to investigate that when I have some more time.
Tell me what you think!
I found the problem with the server stopping to handle requests; it's fully functional now afaik. Now some other things need to be made thread-safe, like the session store. Are there any other things that use shared backends?
Ok guys, I have a first version in my tufao repository; if you have some time you can check it out.
Can you test whether HTTP Upgrade (e.g. WebSocket) works? You changed the socket's parent to HttpServerRequest, but the connection can live longer than the request. When an upgrade occurs, the HttpServer deletes the HttpServerRequest.
There is also a tufao-thread-test repository with some testcode you can use.
The interface is great and I like it.
I think these two lines could be removed.
I'll review the rest of the code after these items are resolved.
Yeah, the HTTP Upgrade would have failed because the Request object would have deleted the socket. However, I think I solved the problem by reparenting the Socket object to 0 (alternatively we could use the HttpServer object, but I think the user should be in charge of the socket after a request).
For example, in WebSocket::startServerHandshake the WebSocket should take parentship of the Socket, so the user can call moveToThread on the WebSocket. In fact, we should check that every internally used QObject is correctly parented to the wrapping QObject. Otherwise using threads could create ugly bugs that are hard to debug...
I removed the two lines in the test code; that was just for testing a bug, because HttpServerRequestRouter::map crashed if the internal QVector was empty. I already fixed it in my repository.
Yeah, the HTTP Upgrade would have failed because the Request object would have deleted the socket. However, I think I solved the problem by reparenting the Socket object to 0 (alternatively we could use the HttpServer object, but I think the user should be in charge of the socket after a request).
The previous behaviour was HttpServer being the parent. It costs us nothing to keep the behaviour consistent with the current version (e.g. reparenting to HttpServer).
I noticed that you were putting copyright in my name. Stop doing that, I don't require copyright assignment in Tufão contributions. Just use your name in the copyright notice at the beginning of the files. ;)
I'll review the rest of the code later today.
It costs us nothing to keep the behaviour consistent with the current version (e.g. reparenting to HttpServer).
True for non-threaded applications. But as I said, that may be a problem with threaded apps if the user does a moveToThread. Anyway, I now check if the request is still the parent and assign one if not.
I noticed that you were putting copyright in my name. Stop doing that, I don't require copyright assignment in Tufão contributions. Just use your name in the copyright notice at the beginning of the files. ;)
Ok, I'll change that in the next commit.
Hey guys, back from my vacation. Did you have some time to check out the code yet?
Oops, closing was an accident.
Hey guys, back from my vacation.
I hope you enjoyed your vacations. =P
Did you have some time to check out the code yet?
Okay, I'll review it tonight.
Too many changes, I'll need more time to review, but I may be able to do it this week.
No need to rush; I just found a bug but not the cause of it. After the first 5 requests come in, something blocks: the server just does not accept any more connections until the first 5 are finished, then it continues to work. It really seems QTcpServer does not handle incoming connections for some reason. I started debugging it, but that will take some time, as there is no clear reason for this (the main loop is not blocked).
I added a threaded plugin server too; all this stuff seems to work except the connection handling.
Ok, that was not a bug; in fact it was a "feature" of my browser. Firefox AND Chromium always keep at most 6 connections to a single server, no matter how many tabs you have open. Even if you fire 20 requests at the same time, they are split up into batches of 6 and handled batch by batch.
So that's not a bug; everything seems to work just fine. I still need to write some docs, and I will upload my current test project this evening so you can have a look at how it works.
Hi guys, I came here just to inform you that this is the last week of GSoC and I'll have time to review the code (and code together) after that.
Wish me luck. =)
Hey Vinícius,
good luck ;).
Just wanted to inform you that we hit an ugly bug in the Qt event loop (that's our current assumption) with sockets and moveToThread(). Thiago has been trying to debug it, but I have not heard from him since last week. No idea if he will be able to fix the bug soon; that's why I started a new branch (fd-based), where I implement the threading stuff with socket descriptors instead of QTcpSocket. I had to do a lot of refactoring, but I think I managed to keep source compatibility so far. Nothing usable in that branch yet, but I will inform you as soon as there is code to test.
Ben
Hi, guys.
I have used QTcpSocket between threads with no problems, in many projects. Can you explain the problem?
I doubt you hit the problem; it only occurs under very high load in special circumstances (the remote host needs to close the connection). I stumbled over it when I was stress testing the Tufão code with autobench/httperf.
Sometimes the main thread and the worker thread both get a close event for the same socket, even though the socket is properly moved to the worker thread. The thread that receives the event first deletes the socket (because we connect the disconnected signal to deleteLater). The other thread then tries to access freed memory, which results in a segfault.
That means you will run into that bug only when:
1) you are moving sockets between threads,
2) you connect your disconnected signal to deleteLater, or delete the socket when the signal comes in,
3) you have high load (not sure about that one), and
4) the remote host closes the connection in exactly the right second.
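A reduced sketch of the pattern those four conditions describe; this is only an illustration of where the two threads can race on the same socket, not a claim about where the fault lies inside Qt:

    #include <QObject>
    #include <QTcpSocket>
    #include <QThread>

    // Main thread: a new connection is handed over to a worker thread.
    void dispatch(QTcpSocket *socket, QThread *worker)
    {
        // The socket is supposed to delete itself as soon as the peer disconnects...
        QObject::connect(socket, &QTcpSocket::disconnected,
                         socket, &QObject::deleteLater);

        // ...and is then pushed to the worker thread.
        socket->moveToThread(worker);

        // The report above: under load, when the remote host closes the
        // connection right around the move, a close notification is apparently
        // seen by both threads, and whichever triggers deleteLater() first
        // frees memory the other one still touches.
    }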
Ok guys, brief update.
I did a rewrite to make the code use socket descriptors instead of QTcpSockets. This is how it works:
A new connection comes in and the socket descriptor is directly forwarded to a worker thread. Inside that worker thread the socket is created and handled just like before. The request now belongs to that thread until the socket is closed or times out. I implemented a mechanism that allows me to have more than one request per thread: a timer measures the workload of the thread, and if the timeout arrives too late, the workload of the thread is considered high.
The request dispatcher does round-robin dispatching, but also uses the workload to determine the next thread and prefers completely idle threads. This is how the request dispatcher works (I'm open to other ideas):
-> A new connection comes in.
-> The dispatcher looks for an idle thread (a thread with no attached request).
-> If it can not find an idle thread, it iterates over the working threads and tries to find a thread with a low workload.
-> If it can not find a thread with a low workload, it will use the first one with a medium workload.
-> If it can not find a thread with a low or medium workload, it will enqueue the connection and try again later.
-> The selected thread is then moved to the end of the thread queue, to make the request dispatching more fair.
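A pseudocode-level sketch of that selection logic; the Load classification is assumed to come from the timer-based workload measurement described above, and all names are illustrative rather than actual Tufão classes:

    #include <QList>
    #include <QQueue>
    #include <QtGlobal>

    enum class Load { Idle, Low, Medium, High };

    struct WorkerThread
    {
        Load load() const { return m_load; }               // from the timer-based measurement
        void handleConnection(qintptr fd) { Q_UNUSED(fd);  /* enqueue fd and wake the thread */ }
        Load m_load = Load::Idle;
    };

    class Dispatcher
    {
    public:
        void dispatch(qintptr socketDescriptor)
        {
            WorkerThread *chosen = pickThread();
            if (!chosen) {
                m_backlog.enqueue(socketDescriptor);        // everything is busy: try again later
                return;
            }
            // Move the chosen thread to the back of the queue so dispatching stays fair.
            m_threads.removeOne(chosen);
            m_threads.append(chosen);
            chosen->handleConnection(socketDescriptor);
        }

    private:
        WorkerThread *pickThread() const
        {
            WorkerThread *low = nullptr;
            WorkerThread *medium = nullptr;
            for (WorkerThread *t : m_threads) {
                switch (t->load()) {
                case Load::Idle:   return t;                // best case: a completely idle thread
                case Load::Low:    if (!low)    low = t;    break;
                case Load::Medium: if (!medium) medium = t; break;
                default:           break;
                }
            }
            if (low)
                return low;                                 // otherwise the first lightly loaded one
            return medium;                                  // or the first medium one, or nullptr
        }

        QList<WorkerThread *> m_threads;
        QQueue<qintptr> m_backlog;
    };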
I had to do some refactoring; the biggest change is the new AbstractConnectionHandler class, which is basically an HttpServer without the internal QTcpServer. The AbstractConnectionHandler will be used inside the threads to handle the incoming connections. However, I maintained source compatibility with HttpServer as much as possible; I think that should rarely break anything.
If someone wants to try the code, check out the fd-based branches from my tufao and tufao-thread-test repositories.
It would be nice to have a ThreadedHttpPluginServer, so the server can scale up when lots of requests come in at the same time.
A possible problem with this is the handling of the loaded plugins; each thread might need to load and unload them itself.