Closed phindmarsh closed 12 years ago
Actually, I suspect that this has to do with the way that Redmine handles change-sets for git. It is very labor intensive, writing two entries to the database for every git-level commit in the log (one new entry to the changes
table and a modifying of the repository_extras
entry of the repository
table). This huge amount of traffic is why the generic version of Eric's plugin doesn't work well, since it is triggered for every write to the repository table. It led to my "performance" branch of this plugin (last year), which changed things to trigger the plugin once per fetch_changesets
, rather than per write to the repository
table.
Thus, my guess is that this is simply a Redmine issue. It depends on the number of new git-level commits introduced by fetch_changesets
, rather than by the number of repositories. Perhaps the plugin makes it slightly worse, since it (the plugin) registers an observer on the repository
table, leading to slightly more code per change to that table. I'm assuming that things work fine now that you are past the setup time?
Incidentally, the "performance" branch is fully integrated into the master branch, so you have all of that code currently working for you...
Actually, we were noticing the lag even days after the migration occurred. We haven't had it happen again for a day or so, but I'll keep an eye on it and see if it pops up again. The times we noticed it we certainly weren't making changes (committing etc) to any repository, which first lead me to check we didn't have any cron jobs running that could be causing it (which we didn't).
Thanks for the insight, I'll keep an eye on it.
What is a bit weird is the three threads with 100%. How frequently are you running fetch_changesets
? If you run it too frequently, perhaps you have more than one running at a time...? (I wouldn't do so more frequently than 15 minutes or so). You should be able to see the total time for fetch_changesets operations in the log -- look for the fetch_changesets
header on a log entry, then look at the end of the "paragraph" to see how long it took...
You should probably upgrade to version 0.4.5x of this plugin, incidentally...
There was three threads totaling 100% (roughly 33% each).
AFAIK we do not run fetch_changesets
manually (or via cron etc), so I imagine it only runs when instigated via the UI?
Ah. You should try running fetch_changesets
regularly (via a cron job). Is it possible that there were changes that had built up? One other benefit of running fetch_changsets
regularly is that it tends to correct errors that might have built up (although that doesn't seem to be giving you a problem).
The plugin does attempt to "cache" its interactions with git. I'm not sure how many repos can be cached at a time. Perhaps you have folks looking at multiple repositories in rapid succession, causing a bunch of cache traffic to the database (cache entries are stored in the DB). Perhaps you could see if you could stimulate the problem by looking at a bunch of different repos in rapid succession (from different browsers?).
If you remember what date/time this problem was in, you should look in the log (i.e.
REDMINE_ROOT/log/production.log
Can you figure out what was happening then?
Ok I'll get a cron job started.
There is only two of us using the repositories at any one time, and on a few instances we were coming back to Redmine from a period of inactivity. When we loaded the site it would take ages to load, if I SSH'd into the box (which would be slower than normal) and use top
I could see the processes having a good time.
I did check the log when it was happening and couldn't see any activity from when it was being slow, but I'll take another look.
Some clarification: when you say that the site took "ages to load", do you mean the whole Redmine site, or just the Repository page? (Different causes).
Are you using Apache+FCGID? That will timeout old threads after long periods of inactivity; they take time to restart (can be 10 seconds or more). The fetch_changesets
running through the web (need to set a key) will keep at least one thread alive.
Apologies, the entire site.
BUT I think it was any service that was running on that server, not just redmine (i.e. SSH took a approx 30 seconds to log in as opposed to the typical 2 seconds). This leads me to thing its just out of resources to cater for the additional requests once whatever process it is starts hogging the CPU.
We have Apache+Passenger to Redmine 1.3-stable with the latest copy of this plugin.
So -- perhaps it needs a reboot? Are you thinking that you can close this issue? Or, do you want to keep it open?
Yeah I rebooted it a few times when it was misbehaving with no effect.
I'll close the issue, I was just looking for some insight into how the plugin does things when it comes to large/many repositories.
I'll keep and eye on it and if it happens again I'll re-open the issue.
Thanks for your help!
Ok. I would try to run fetch_changesets
from a cron job (say through the web) and look in the log to see how long it takes to process. That would be an interesting insight to how the plugin is doing (as opposed to Redmine in general).
You would have the cron job execute something like:
*/15 * * * * apache curl -k -o/dev/null -s "https://yourserver.com/redmine_root/sys/fetch_changesets?key=xxxxxx"
Here, the key is the API key under general redmine settings (with the repository tab). It is there to avoid a denial of service attack.... (the above runs the command as user apache, executes a http request to the given URL....). You can also execute such a command from a browser to experiment with fetch_changesets
as well (I do that a lot with a tail -f on the redmine log file).
Is that the same as doing it via a rake
task? I have a cron running every 15 minutes that does:
rake redmine_git_hosting:fetch_changesets RAILS_ENV=production
Should be the same. With the rake task, it is absolutely important that you run as the same user that redmine normally uses!
The rake task will be labeled in the log file with the fact that you are executing "from the command line". The one thing that the rake task won't give you is timing in the log file... (Redmine puts a line at the end of each request in the log to say how long it took).
Certainly, the cron specifies www-data as the operating user and I've checked it does in fact run in that manner.
Thats a good point though, I'll change it to use curl.
I'm not sure if this is related to this plugin particularly or not, I hope you will be able to provide some insight as to whether it could be the cause or not.
Recently we moved all our sites (we are a small web development shop) into git repositories while migrating to a new production hosting environment. There were a few sites (notably old ones) with lots of images and PDF documents that make the repository pretty large (excess of 400mb).
Since those repos have been added into our Redmine environment (now operating with Redmine 1.3-stable from SVN) occasionally the server becomes unresponsive. I have noticed that there are (up to 3) ruby threads consuming 100% of the CPU for minutes or more. Then just afterwards mysqld spikes for a ~10-20 seconds, then everything drops back to idle.
I assume this is the
fetch_changesets
task going through each repository looking for changes to each repo, then adding them to the database. It appearsmysqld
only spikes right at the end and occasionallygit
spikes but only for a second or so.We have around 35 repositories, only a few of them are >100mb, and the
changes
table has approx90,000
rows in it. Does this sound like something that could happen with the plugin? It only seemed to manifest itself after adding all the new repos, although the large ones were added at the same time as the others so I can't separate if its a 'number of repositories' over a 'repository size' problem.Thanks for taking a look, we love this plugin and it certainly does make our workflow and server management significantly easier, as always your work is much appreciated.