cms-dev / cms

Contest Management System
http://cms-dev.github.io/
GNU Affero General Public License v3.0

Rethink the whole IO model under CMS #156

Closed: giomasce closed this issue 11 years ago

giomasce commented 11 years ago

The IO model currently used under the hood in CMS is a bit problematic: all services use the asyncore loop, which is now deprecated. Moreover, the web services also use Tornado's IOLoop, with some machinery to make the two work together. Unfortunately SQLAlchemy integrates with neither of them, so at the moment all database calls are blocking (and this caused pretty big performance problems at IOI 2012).

For quite some time we have been investigating other products on which to base CMS. More or less none of them is without downsides, but the current situation can certainly be improved. This issue is to track possibilities and decisions about this topic.

Database caching, proposed by Luca in #106, may help the situation. But a general redesign is probably required anyway.

Proposal proposal

(that is, I'm proposing some elements which could, at some point, form a real proposal)

I see no way SQLAlchemy can be used in a callback-based environment, because of the very syntax it uses. Moreover, I find it uncomfortable to have to create a new callback for (nearly) every line of code that has to query the database. Instead, a coroutine-based environment may help us a lot; at the moment gevent [1] seems to be the most likely candidate, also because it is easily usable with SQLAlchemy (see [2]).

[1] http://www.gevent.org/
[2] https://bitbucket.org/zzzeek/green_sqla/src/2732bb7ea9d06b9d4a61e8cd587a95148ce2599b/green_sqla/psyco_gevent.py?at=default
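
For reference, the recipe in [2] essentially boils down to registering a gevent-aware "wait callback" with psycopg2. A minimal sketch of that technique (adapted from the linked file, not CMS code) looks like this:

```python
import psycopg2
from psycopg2 import extensions
from gevent.socket import wait_read, wait_write

def gevent_wait_callback(conn, timeout=None):
    """Cooperatively wait for the connection, yielding to the gevent hub."""
    while True:
        state = conn.poll()
        if state == extensions.POLL_OK:
            break
        elif state == extensions.POLL_READ:
            wait_read(conn.fileno(), timeout=timeout)
        elif state == extensions.POLL_WRITE:
            wait_write(conn.fileno(), timeout=timeout)
        else:
            raise psycopg2.OperationalError("Bad result from poll: %r" % state)

# Once registered, every psycopg2 query (and thus every SQLAlchemy query)
# yields to other greenlets instead of blocking the whole process.
extensions.set_wait_callback(gevent_wait_callback)
```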

Using gevent, probably the only rough edge left to fix is its integration with Tornado. So far I haven't found anything convincing (there is [3], but I tested it and it doesn't work; moreover, since gevent doesn't emulate epoll() but only select(), it would be pretty slow under high load). Now I'm starting to look at the code of Tornado's IOLoop (which, fortunately, doesn't appear to be particularly complicated) to understand how difficult it would be to integrate it with gevent.

[3] http://bmmh.wordpress.com/2012/10/07/tornado-and-gevent/

Other thoughts

In the past I also proposed to investigate the possible use of some standard queue management software, such as RabbitMQ. While I still think that proposal may be interesting, it's a less important issue and I'm not pushing for it at the moment.

lw commented 11 years ago

I totally agree with you. I add a few notes.

If we want to stick with SQLAlchemy and use it efficiently we need a model that's synchronous and blocking, at least at the syntax level (instead of asynchronous and callback-based, as we have now). That means we need to use threads: either "fake" cooperative threads (coroutines, greenlets, etc.) or real threads.

I don't see keeping compatibility with Tornado as a priority, because of my personal hatred for it. I wouldn't mind substituting it with something else, even if that would mean rewriting most of CWS and AWS.

For a webserver I consider coroutines a better fit, since one can spawn one for each HTTP request, which is an easy and clean way to handle them. Real threads, on the other hand, are quite expensive to create and destroy that frequently, so we would need to use pools, which are ugly and problematic for some of the reasons explained below.
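
As a toy illustration of the cost argument (purely illustrative, not CMS code): spawning one greenlet per request scales to counts where real threads would already be prohibitively expensive:

```python
import gevent

def handle_request(i):
    gevent.sleep(0.01)  # stand-in for I/O-bound work (e.g. a DB query)
    return i

# Ten thousand concurrent "requests" are cheap as greenlets; ten thousand
# OS threads would not be.
jobs = [gevent.spawn(handle_request, i) for i in range(10000)]
gevent.joinall(jobs)
```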

gevent is almost the only available coroutine-based framework out there, since eventlet seems a bit unmaintained. Correct me if I'm wrong. I don't know how easy it will be to port our messaging system to it (in case we don't decide to switch to something else), but from the webserver point of view it presents one large problem: it's missing any interface for streaming HTTP requests, server push or WebSockets.

At the moment we're only using these features in RWS, which could keep using Tornado, since it doesn't use SQLAlchemy. But that would create a great inconsistency in our code, and it would also prevent us from implementing something similar in CWS and AWS one day.

To create a seamless, consistent API for both "traditional" (request-response) and "streaming" HTTP requests one would have to rewrite almost the entire HTTP implementation of gevent, since the current one uses WSGI, a standard interface which, unfortunately, has been designed to support only the former. One could instead add WebSocket support "on the side", in a parallel implementation. Downside: it needs to listen on a different port, which is ugly, inconvenient, and difficult to put into practice in environments like cloud servers.
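
For context, the whole WSGI contract is a single callable that maps one request to one finite response body, which is why long-lived, bidirectional connections such as WebSockets don't fit it; a minimal app shows the shape:

```python
def simple_app(environ, start_response):
    # One request in, one iterable of body chunks out; there is no hook in
    # this interface for a persistent, bidirectional connection.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"request/response only\n"]
```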

Yet, using WSGI means it will be easier in the future to switch framework, as long as the new one supports WSGI too.

Conclusions: I don't have any. But seeing the amount of work that a proper implementation of WebSockets requires, I tend to be slightly in favor of using WSGI (and giving up WebSockets, or implementing them separately).

giomasce commented 11 years ago

I agree with all your notes. In turn, I'll comment a bit further.

giomasce commented 11 years ago

Just as a note: I tried to route all WSs (except RWS, of course) through a WSGI interface [1]. Good news: the first very brief tests indicate that everything works as before. One of the things I didn't test is the presence of an nginx proxy in front of the WSs.

[1] https://github.com/giomasce/cms/commit/6c717002a08790bb3dcbfc080c9f136b546db399

giomasce commented 11 years ago

I converted AsyncLibrary.py to use gevent and, to my surprise, it was much easier than expected to make everything work; I set up a gevent branch in my fork. So far I have converted the Checker and EvaluationService services to the new GeventLibrary, and everything seems to work perfectly.

On the "still-to-do" side we have the integration with the *WSs (but it shouldn't be too difficult, given that the Tornado WSGI interface seems to work well, see my last comment). Moreover, threaded RPC is no longer supported (and, actually, removing it is one of the goals of the gevent conversion), so some more care will be needed to adapt the Worker (also taking into account that gevent has supported subprocess only since a version more recent than the one currently in Ubuntu). There will be some changes to make in ScoringService too, in order to get rid of the awful double-thread architecture it has now.

But now I'm very happy!! I'll go eat some of the shortbread I baked the day before yesterday!

giomasce commented 11 years ago

I did some more work on my gevent branch, including supporting the *WSs and actually activating greenlet support in psycopg2. Unfortunately I just discovered that psycopg2 doesn't support large objects (that is, what is behind FileCacher) when in coroutine mode. And this appears to be rather difficult to fix, since the libpq API itself (the C library interface to PostgreSQL) doesn't offer a non-blocking API for large-object calls (I don't know why).

This is rather disappointing, actually. At the moment I don't see any easy way around this problem (patching libpq is probably a bit far-fetched for my skills and time). I have to say I find it strange that the large-object API is so neglected as to have no non-blocking counterpart; given that I may have misunderstood, I first have to double-check my information.

Anyway, if this is confirmed to be the case, at the moment the only way out I see is to stop using large objects as the FileCacher backend. Which is a bit sad, because I liked the way we managed to keep all data in a single place with a single entry point. The alternative I have in mind (and that I already wanted to implement anyway) is to back the FileStorage with an NFS share (any reasonable network file sharing protocol would do).

But I have to think a bit better on all these things before taking decisions.

giomasce commented 11 years ago

BTW, another option is to not enable coroutine support in psycopg2 at all, falling back to an application that blocks on queries. Using gevent may still be a bit better than the bad async hacks, but I think most of the current performance problems of CMS under stress are due to the blocking database API.

An intermediate approach is to find some very hacky way to run "plain" DB queries with coroutine support enabled while dispatching large-object calls through another "instance" of psycopg2 without coroutine support, with all the tricks we can invent to make this as painless as possible.

But, again, only thoughts.

giomasce commented 11 years ago

After some more thinking, I'd say that for the time being the best approach is to modify DBBackend so that all requests are offloaded to another process (created with the multiprocessing module). This second process imports and uses psycopg2 in the synchronous fashion, and so has access to the large-object functions. The two processes communicate through a pipe (we may be able to use a multiprocessing.Connection object, which is more or less a pipe with Python semantics, but I have to understand whether it interacts nicely with gevent).
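
A rough sketch of that design (process layout and message format are hypothetical, just to fix ideas):

```python
import multiprocessing

def lo_worker(pipe_end, dsn):
    # This process imports psycopg2 *without* the gevent wait callback, so
    # the blocking large-object API is usable here.
    import psycopg2
    db = psycopg2.connect(dsn)
    while True:
        request = pipe_end.recv()       # e.g. ("get", oid), or None to quit
        if request is None:
            break
        _, oid = request
        with db:                        # large objects live in a transaction
            lobj = db.lobject(oid, "rb")
            pipe_end.send(lobj.read())
            lobj.close()

parent_end, child_end = multiprocessing.Pipe()
worker = multiprocessing.Process(target=lo_worker,
                                 args=(child_end, "dbname=cmsdb"))
worker.start()
# Open question mentioned above: parent_end.recv() must not block the gevent
# hub, so it would have to be wrapped (e.g. by waiting on its fileno()).
```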

This, at least if implemented the naive way, means that only one file at a time can be accessed by a Service. It shouldn't be that difficult to make it multiplex requests, but I won't do that at first. And, of course, it remains blocking (in its subprocess), otherwise we wouldn't have a problem. Notice, though, that file accesses that hit the cache (the great majority, perhaps after the contest's first minutes) are not affected by this bottleneck.

Of course, the real long-term fix would be to patch libpq and psycopg2 to support an asynchronous, non-blocking API for large objects. Contrary to what I stated yesterday, I might be foolish enough to write to the PostgreSQL hackers mailing list to ask for details and, perhaps, start working on it.

giomasce commented 11 years ago

Oh, I was nearly forgetting: apart from large objects, there is still something to do to get a really non-blocking CMS, namely checking that filesystem IO doesn't block. In particular, I'm pretty sure we use shutil.copyfile() or shutil.copyfileobj() somewhere: such calls have to be patched to cooperatively yield every now and then (see for example FileHandler.fetch() in cms/server/__init__.py).
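
The patch would amount to something like this sketch (the name and chunk size are illustrative):

```python
import gevent

def cooperative_copyfileobj(fsrc, fdst, length=16 * 1024):
    """Like shutil.copyfileobj(), but yields to the gevent hub between
    chunks so other greenlets keep running during a long copy."""
    while True:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)
        gevent.sleep(0)  # cooperatively yield every now and then
```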

More generally, we have to find all blocking calls in CMS and replace them with cooperative equivalents. We could use monkey patching, but I don't like it: it's better to fix CMS itself, also because many blocking things are not fixed by monkey patching.

giomasce commented 11 years ago

Note for the future: see [1] for how to use a pipe the gevent way to communicate with the child process.

[1] https://github.com/zacharyvoase/gevent-selfpipe/blob/master/lib/gevent_selfpipe/__init__.py

giomasce commented 11 years ago

And of course the new GeventLibrary has to be checked against the AsyncLibrary test suite, which in turn has to be updated for the new library.

lw commented 11 years ago

It's great you have so much time to hack on CMS. At the moment I don't, so I'll just leave some random notes...

The "conversion" to WSGI isn't actually complete, because we still depend on a lot of Tornado's stuff (web, template, locale, etc.) even if we probably don't use its httpserver, ioloop, etc. anymore. A proper conversion requires rewriting AdminWebServer.py and ContestWebServer.py to use a different interface than the one offered by Tornado's RequestHandlers. Since the "pure" WSGI interface is rather cumbersome, I was thinking of using Werkzeug [1], a kind of "wrapper".
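
To give an idea of what Werkzeug buys over raw WSGI, a handler can be as small as this (illustrative only, not a proposed CMS handler):

```python
from werkzeug.wrappers import Request, Response

@Request.application
def application(request):
    # Werkzeug parses the WSGI environ into a Request object and turns the
    # returned Response back into a proper WSGI reply.
    return Response("Hello from CMS", mimetype="text/plain")
```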

As for the large object problem, I also want to keep using them, so I don't consider NFS & Co. an option. The best solution is obviously to add support inside libpq and psycopg. That seems very difficult, both because none of us is a libpq developer (yet?) and because we'd have to wait at least until the next stable Ubuntu release for our users to get the new versions before adding them as a requirement. I'd say we should nevertheless try to get this done (I mean, at least contact the libpq developers to tell them we need this feature and raise its priority in their TODO list). In the meantime we have to look for another (temporary?) solution. I see two options: offload the large-object calls to a side process, as proposed above, or temporarily disable psycopg2's coroutine support (its wait callback) around each large-object call.

The second option has the disadvantage of blocking the async loop of gevent, but it is simpler, perhaps more efficient (no pipes) and easier to convert back when native libpq support is available.

[1] http://werkzeug.pocoo.org/

giomasce commented 11 years ago

OK about using another WSGI framework. By "conversion" I meant making CMS non-blocking. The other improvements are beyond what I'm targeting with this issue.

About the temporary solution, I hadn't thought about temporarily disabling coroutines; I don't find it easy to estimate its impact, but it is definitely worth a try. Still, I think it will be better anyway to implement the side-process solution.

lw commented 11 years ago

It's actually not a problem to keep depending on Tornado for a while, given that we will still need it for RWS. This means that we can switch to gevent using Tornado as a WSGI wrapper and then, slowly, migrate to Werkzeug.
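
Concretely, the interim setup could look like this sketch (using the tornado.wsgi adapter available at the time; the handler and port are illustrative):

```python
import tornado.web
import tornado.wsgi
from gevent.pywsgi import WSGIServer

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello from CMS")

# Tornado's RequestHandlers stay as they are, but requests are served by
# gevent's WSGI server instead of Tornado's own IOLoop.
wsgi_app = tornado.wsgi.WSGIApplication([(r"/", MainHandler)])
WSGIServer(("127.0.0.1", 8888), wsgi_app).serve_forever()
```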

As for performance, I think it will clearly be strictly better than what we have now, since instead of blocking on every database query we block only on the large-object ones. That is: while waiting for a "normal" query to return, gevent will start executing other code and the program will "do something", whereas while waiting for a large-object query psycopg won't return control to the gevent loop, and therefore the process will "block", doing nothing, until the query ends. (Again: at the moment we have the second behavior for every query we run.)

The code snippet above was wrong. Here is what I meant:

```python
# Save the gevent wait callback, disable coroutine support while fetching
# the large object, then restore it.
f = psycopg2.extensions.get_wait_callback()
psycopg2.extensions.set_wait_callback(None)
# fetch file
psycopg2.extensions.set_wait_callback(f)
```

giomasce commented 11 years ago

I just pushed some more work to my branch: apart from many fixes, I implemented the strategy Luca suggested, and I also reimplemented copyfile and copyfileobj to make them gevent-friendly.

giomasce commented 11 years ago

Another problem that will have to be fixed (and it's possible that Werkzeug has tools for it): at the moment the is_proxy_used feature is broken, because gevent.pywsgi doesn't appear to support it. A TODO has been added to the code.

Moreover, again as documented in a few TODOs in the code, at the moment I'm modifying CMS as little as possible to make it work with the new gevent paradigm. Now that we have greenlet support, some pieces of code should be rewritten entirely to fit the new model, but I'm not doing that now. I think it's better to diverge as little as possible from master, get the changes accepted so that everything works, and then fix the style.

giomasce commented 11 years ago

I added gevent support to Sandbox, LS and RS. And I discovered that, apparently, the gevent.subprocess implementation still has some rough edges: calling gevent.subprocess.Popen() triggers an error from the underlying libev, and I don't know what it means.

I wrote upstream and will post the archive link here as soon as it appears (unfortunately the mailing list is hosted on Google Groups, so it behaves in a completely illogical way).

Then another TODO: after the switch to gevent, LogService writes an exception traceback when a service disconnects, complaining that the connection was reset by peer. I have to investigate further to understand the real problem (and what changed from before).

giomasce commented 11 years ago

Here is my email about gevent.subprocess. No reply so far...

https://groups.google.com/forum/?fromgroups#!topic/gevent/EJ6zoGReuUE

giomasce commented 11 years ago

About gevent.subprocess: it was my fault; I had overlooked the fact that the Worker wasn't ported to gevent yet, so gevent.subprocess ended up running in a different thread from the main libev loop.

giomasce commented 11 years ago

The first part of the gevent conversion is done: all the services have been migrated and AsyncLibrary and WebAsyncLibrary have been deleted.

Now the second part: fix everything that still requires some care. For example, ScoringService still has its awful thread and related hacks.

Moreover, two other TODOs that I write here so I don't forget: check that the documentation (both docstrings and user docs) is up to date (I generally fixed comments and docstrings as I saw them, but something may have slipped through) and update and check the IO test suite.

giomasce commented 11 years ago

Here is the thread I started on the pgsql-hackers mailing list:

http://www.postgresql.org/message-id/51AF958C.3060405@gmail.com

giomasce commented 11 years ago

As they suggest on the PostgreSQL mailing list [1], one can call large-object functions simply by wrapping them in SELECT statements. I had seen this possibility, but I had misread the documentation and thought it wasn't available for lo_read and lo_write (actually, one must know that they're spelled differently there: loread and lowrite). So, apparently, we can fix this problem without doing anything strange. I wonder why this isn't explicitly stated anywhere; it seems more or less no one else has this problem.

[1] http://www.postgresql.org/message-id/CAAfz9KOdNh9VXPYKQMf8eO4P1jJnpMOhpOrmwrcvBedCta7CNA@mail.gmail.com
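
A minimal sketch of the trick (the OID, chunk size and DSN are hypothetical): with the gevent wait callback active, these plain queries go through psycopg2's non-blocking protocol even though they manipulate large objects:

```python
import psycopg2

INV_READ = 0x40000  # libpq flag for read-only access

conn = psycopg2.connect("dbname=cmsdb")
cur = conn.cursor()
# Large-object descriptors are only valid inside a transaction, which
# psycopg2 opens implicitly with the first statement.
cur.execute("SELECT lo_open(%s, %s);", (16437, INV_READ))
fd = cur.fetchone()[0]
chunks = []
while True:
    cur.execute("SELECT loread(%s, %s);", (fd, 8192))
    chunk = cur.fetchone()[0]
    if not chunk:
        break
    chunks.append(bytes(chunk))
cur.execute("SELECT lo_close(%s);", (fd,))
conn.commit()
data = b"".join(chunks)
```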

giomasce commented 11 years ago

I started implementing this idea and it seems to mostly work. But I did very little testing, because I'm rather tired. This evening's commit will require some more proofreading and testing.

giomasce commented 11 years ago

BTW, the cms.async namespace deserves some renaming too, since our framework isn't asynchronous anymore. Any ideas?

lw commented 11 years ago

cms.io? cms.communication? cms.event? cms.net? cms.network? cms.requests?

giomasce commented 11 years ago

cms.io, because it's the shortest! :-)

giomasce commented 11 years ago

OK, in my opinion the gevent branch is getting rather close to being ready for merging into master; there will probably be other changes, but rather minor ones.

Luca, did you find some time to have a look at my patches? Do you already have ideas to share?

lw commented 11 years ago

Nope, I didn't. But I realized that I'm probably not the right person to do a proper code review, since I never looked at that code before and I don't know how it's supposed to work (and therefore whether any of your changes alter its behavior). I'll still look at it, since I want to learn it, but I think it's better if someone else does the review. @stefano-maggiolo is the only one who comes to mind...

giomasce commented 11 years ago

I don't know whether he has time, but let's hope...

stefano-maggiolo commented 11 years ago

Will take a look. I thought this was targeted for after 1.1; did you change your mind?

giomasce commented 11 years ago

Well, given that I'm pretty satisfied with how it came out, I wouldn't mind accepting it for 1.1. I'm pretty sure that, with the possible exception of some minor bugs still to be worked out, it brings far more pros than cons. Of course, let's decide together.

giomasce commented 11 years ago

One of the factors against acceptance could be the requirement of gevent 1.0, which is still in development (BTW, I just pushed a commit actually documenting this new dependency, with installation instructions).

lw commented 11 years ago

Mmh... any idea when 1.0 will be released? If it makes its way into Ubuntu 13.10 we can target that distribution (instead of 13.04) and, perhaps, also have a non-suid sandbox (I just asked about that in its thread) and use SQLAlchemy 0.8.

By releasing v1.1 around (or after) October we will have time to merge back the fixes from IOI and do quite some testing...

giomasce commented 11 years ago

No, I have no idea. It is already at RC2, but the development of gevent is rather discontinuous. Actually, RC2 was released six months ago. There has been some work (although not much) since then, but I have no clue about when 1.0 might be released. I don't think it is entering Ubuntu 13.10.

lw commented 11 years ago

In that case I'm not that much in favor of requiring it. Apart from possible instabilities (RC2 shouldn't have many, but nevertheless...), I don't find it nice to ask our users to manually install files that won't be tracked by any package manager (it's also a bit more difficult to do automatically on many machines remotely, or to ask your system administrator to do...).

Could including (a copy of) it in our repository be a temporary solution? (kind of what we do with jQuery...)

giomasce commented 11 years ago

I deeply hate convenience copies of external software (which, actually, goes for the JS libraries we ship too). And I don't see how having to install gevent from our repository is significantly more complicated than installing it from another one. About files not tracked by a package manager: we require that for CMS itself anyway.

(about bugs: consider that our current solution is surely buggy and unstable)

giomasce commented 11 years ago

I implemented many of Stefano's suggestions. I therefore think that the gevent branch is ready for merging into master.

To do the actual merge I would like to have your approval, also in consideration of the last comments in this issue, which touch on things where we didn't really reach a shared position (at least, it is not evident that we did).

lw commented 11 years ago

I suppose "you" means me. I'm actually in favor of merging it, since the benefits largely outweigh the drawbacks (hey, if not having all dependencies provided by your package manager is out of your comfort zone, then don't use CMS at all, since there aren't packages for it either...).

Yet, I'm still thinking about how to make this easier... A PPA [1] for gevent would help a lot, but there doesn't seem to be one. Could we provide it? (And, while we're at it, one for CMS itself? It wouldn't hurt...)

Non-Ubuntu users will have to stick with easy_install (provided it can handle version 1.0rc2) or with manual installation, unless somebody has better ideas...

[1] https://help.launchpad.net/Packaging/PPA

giomasce commented 11 years ago

Ok, so let's say that the merge is accepted. I'll do it later today.

As for the PPA, Bernard had already started working on it, and I had a look at it too. We'll address that issue as well (although Bernard probably has other things to think about at the moment; and, by all means, it is possible that he will want to avoid even thinking about CMS for a while after IOI 2013).

giomasce commented 11 years ago

Pushed! Wow, now let's wait for the bugs to come out!