gobezu / jbetolo

Joomla! front end performance optimization with support for CDNs

Multiple web processes executing jbetolo compression simultaneously and overloading server. #32

Open brunoald opened 11 years ago

brunoald commented 11 years ago

Hello! When I run a benchmark or activate jbetolo on a high-traffic website before jbetolo has "compiled" its assets, all the active httpd processes invoke the jbetolo scripts and the server becomes very slow.

I think the jbetolo compression scripts should run only ONCE, and only in the first request that invokes them. Subsequent requests should detect that compression is still running and not invoke it again.

gobezu commented 11 years ago

Unfortunately, checking on parallel threads from another one and acting on the result is just not possible, as it will easily create a race condition, an even worse issue to try to deal with at the PHP level.

On a high-traffic website, if you hit jbetolo that often, it means you are hitting the actual web server/DB server/PHP at least as frequently as jbetolo, which means the setup needs a tad of optimization with a proxy server and more.

But I do agree that I need to take another stab at performance, which I intend to do for the upcoming main release.

Thanks!

brunoald commented 11 years ago

I'm thinking of a solution that may not be perfect, but may reduce the problem. In my opinion, the problem is that while the jbetolo compression is running (in my app it takes about 20 ~ 30 seconds), other web processes that handle requests for the same URL (or set of assets) during this period will also trigger the jbetolo scripts. If you use a flag, these later requests can skip jbetolo. Under high concurrency, creating the flag may take longer than it takes the next request to check for it, but that is still fine, because only a few requests will trigger jbetolo again, not all of them within these 20 ~ 30 seconds. 2 or 3 requests executing jbetolo for the same URL is "acceptable", but imagine 50 doing the same. The server becomes unresponsive in my case.

My implementation suggestion is: when the compression process starts for a given set of assets, a record in the database (or a file in the temp folder) is created. When the process finishes, the record is removed. It works like a "flag". Every request checks whether the "flag" exists. If it does, another process is currently executing jbetolo, so the request skips compression and renders uncompressed assets, as if jbetolo were not activated yet (or you could "freeze" the request until compression finishes, but I don't think that is a good idea). This way, while one web process is compressing the assets, and ONLY that one is working on it, the other requests detect this and do not attempt to do the same.
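The flag idea described above can be sketched roughly as follows. This is only an illustration in Python, not jbetolo code: `build_with_flag`, `compress` and `fallback` are hypothetical names, and the flag is assumed to be a plain file in a temp folder. The check and the flag creation are deliberately left as two separate steps, so the small window mentioned above (a few requests slipping through) is still present.

```python
import os


def build_with_flag(flag_path, compress, fallback):
    """Run compress() unless another request has already raised the flag.

    flag_path: hypothetical path of the flag file for one set of assets.
    compress:  hypothetical callable doing the expensive compression.
    fallback:  hypothetical callable rendering the uncompressed assets.
    """
    if os.path.exists(flag_path):
        # Another request is compressing: behave as if jbetolo were off.
        return fallback()

    # Raise the flag. The check above and this write are not atomic,
    # so a few concurrent requests may still get past the check.
    with open(flag_path, "w") as fh:
        fh.write(str(os.getpid()))
    try:
        return compress()
    finally:
        # Lower the flag even if compression fails, so later requests
        # are not locked out forever.
        try:
            os.remove(flag_path)
        except FileNotFoundError:
            pass  # another request that slipped through removed it first
```

One design point worth noting: removing the flag in a `finally` block covers exceptions, but not a process that is killed mid-compression, so a real implementation would probably also need some way to expire stale flags.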

gobezu commented 11 years ago

Okay, 20+ seconds just defeats the purpose of having jbetolo, which after all is improved performance. Do you mind reaching me through my website's contact form so we can investigate the cause of that instead? Or maybe you have already done that and have some clue? My website is jproven.com.

Sophist-UK commented 10 years ago

I think this might be quite difficult - I am not sure that web servers (like apache) have the proper guaranteed locking that is needed.

The algorithm that I imagine would work like this:

  1. The first thread would calculate the hash and start to create hash.jstmp or hash.csstmp, renaming it to hash.js or hash.css when it was complete.
  2. Threads would check for hash.js or hash.css first to use as a cached file, and if it didn't exist it would check for the tmp file, and if that existed then it would leave the base code as is for this particular call.

This would probably work in most situations. However, if the web calls came in at exactly the same time on a multi-core server, both threads could check for the file's existence / non-existence at the same time and still both end up trying to write the file.
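One way the simultaneous-check problem might be avoided is to let the operating system decide the winner: create the tmp file with an exclusive-create flag, so that creation itself fails for every thread but one. The sketch below illustrates this in Python under some assumptions: a local filesystem, tmp and final files in the same directory, and hypothetical names `build_once` and `produce`. It is not how jbetolo works. PHP offers a comparable primitive in `fopen()` with mode `'x'`, plus `rename()`.

```python
import os


def build_once(final_path, tmp_path, produce):
    """Return 'cached', 'built' or 'skipped' for the current request.

    final_path: e.g. hash.js; tmp_path: e.g. hash.jstmp (both hypothetical).
    produce:    hypothetical callable returning the combined file content.
    """
    if os.path.exists(final_path):
        return "cached"
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # caller can win, even when several check at the same moment.
        fd = os.open(tmp_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return "skipped"  # someone else is building: leave the page as is
    try:
        with os.fdopen(fd, "w") as fh:
            if os.path.exists(final_path):
                # A previous winner finished between our check and create.
                result = "cached"
            else:
                fh.write(produce())
                result = "built"
        if result == "built":
            # Rename is atomic on the same filesystem: readers see either
            # no final file or a complete one, never a partial write.
            os.replace(tmp_path, final_path)
        else:
            os.remove(tmp_path)
        return result
    except BaseException:
        # Clean up so a later request can retry the build.
        try:
            os.remove(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this shape, the "check, then create" race turns into a single create call that either succeeds or fails, which is the guarantee the two-step existence check lacks.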