abh opened this issue 12 years ago

Running with --workers=100 mogbak is using about 20GB of memory. It also seems to stall at 100% CPU for a while now and then.
I can't say I've tried a worker count that high -- I can't imagine you'd gain anything on most systems. The memory usage likely has to do with Ruby MRI and threading. I think it's better on Ruby Enterprise, but I haven't tried; I can play with that when I have some time.
The stall likely has something to do with the queuing. It has to mark all the files in sqlite that it's going to back up, then back them up, then go back and re-mark all those files as updated. That happens in a single thread because of locking issues with sqlite. With 100 workers it has to do 50,000 inserts into sqlite (no way to bulk insert that I know of) before it can start backing up. The whole queuing system really could use some love.
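For what it's worth, SQLite handles this much better when the inserts are wrapped in a single transaction with a prepared statement, since it syncs to disk once per transaction instead of once per row. A minimal sketch with the sqlite3 gem; the `backup_queue` table and its columns are invented for illustration and aren't mogbak's actual schema:

```ruby
require 'sqlite3'

# Invented schema/data for illustration only (not mogbak's actual tables).
db = SQLite3::Database.new(':memory:')
db.execute('CREATE TABLE backup_queue (fid INTEGER, dkey TEXT)')
files = (1..50_000).map { |i| [i, "file-#{i}"] }

# One commit (and one fsync) covering all 50,000 rows: inside a single
# transaction these inserts run orders of magnitude faster than
# 50,000 autocommitted ones.
db.transaction do
  stmt = db.prepare('INSERT INTO backup_queue (fid, dkey) VALUES (?, ?)')
  files.each { |fid, dkey| stmt.execute(fid, dkey) }
  stmt.close
end
```

That wouldn't fix the single-writer locking, but it should shrink the stall before the workers can start.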
Not sure how important it is. I just tried bumping up the number of workers to see if I could get it to go faster.
On the initial backup I had hoped to get close to 100MB/second so it would finish in a reasonable amount of time, but we don't get that many new files (5-10 GB an hour spread over a few hundred files or so, which works out to under 3 MB/second), so long-term it doesn't really matter.
Just to add to this: I guess the bigger point is that it seems a little crazy that a process talking to a database and to mogilefs needs 200MB of memory. Maybe that's just the Ruby way? :-)
Well, some of the gems do indeed eat memory -- although I still suspect the high memory usage has something to do with the threading. Using REE instead of Ruby MRI with copy_on_write enabled would probably reduce it. When I get a chance I'll play with things and see where all that memory is going.
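If the threading does turn out to be the culprit, one cheap experiment is forked workers on REE with its copy-on-write friendly GC, so the parent's loaded gems stay in shared pages instead of being duplicated per worker. A rough sketch of the idea only; the worker body is a placeholder, not mogbak's actual loop:

```ruby
# REE's GC can avoid dirtying shared pages during collection, so
# forked children keep sharing the parent's heap (plain MRI 1.8
# lacks this setting, hence the respond_to? guard).
GC.copy_on_write_friendly = true if GC.respond_to?(:copy_on_write_friendly=)

# Load gems once in the parent; children inherit them via COW.
require 'sqlite3'

pids = 10.times.map do
  fork do
    sleep 1 # placeholder for the per-worker backup loop
  end
end
pids.each { |pid| Process.wait(pid) }
```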