Add dynamic checkers for excessive memory consumption and pending task backlog

pdziepak commented 6 years ago

Seastar provides a dynamic reactor stall detector and a large allocation detector, both of which have proven to be immensely useful tools.

There are also GDB scripts that allow analysing core dumps to get an idea of what is causing excessive memory allocations (especially when the number of allocations is huge but the size of an individual one is small) or what kind of tasks are waiting in overly large pending task queues. There are, however, two problems with this approach:

1. A developer needs to check the metrics to see if allocations or pending tasks are the problem and then run the appropriate command in GDB, none of which actually requires a human brain to be involved.
2. A core dump capturing the problematic situation may not be available, e.g. because the server eventually managed to recover.

A possible solution may be to provide additional dynamic checkers in Seastar. For example, if a queue of pending tasks grows above a certain threshold, Seastar will log pointers to the vtables of the most common ones. Excessive memory consumption may be trickier: the main user will probably be Scylla with its LSA allocator, so the trigger point should depend on the ratio of LSA to non-LSA memory usage. I guess that to make it a general Seastar solution it would have to know more about how the application is using memory.
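A minimal sketch of the pending-task checker described above. It is not Seastar code: `task`, `read_task`, `write_task` and `check_backlog` are hypothetical names, and `std::type_index` is used as a portable stand-in for the raw vtable pointer that a real implementation would probably log.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reactor task; not Seastar's actual task class.
struct task {
    virtual ~task() = default;
    virtual void run() = 0;
};

// Two made-up task types, used only to exercise the checker.
struct read_task final : task { void run() override {} };
struct write_task final : task { void run() override {} };

using task_queue = std::deque<std::unique_ptr<task>>;
using type_count = std::pair<std::type_index, std::size_t>;

// Returns up to `top_n` of the most common dynamic task types in `q`,
// most frequent first. Returns nothing while the queue holds at most
// `threshold` tasks, so the per-task scan only runs once the trigger fires.
std::vector<type_count> check_backlog(const task_queue& q,
                                      std::size_t threshold,
                                      std::size_t top_n) {
    if (q.size() <= threshold) {
        return {};
    }
    std::unordered_map<std::type_index, std::size_t> counts;
    for (const auto& t : q) {
        const task& ref = *t;
        // Assumption: the dynamic type plays the role of the vtable pointer.
        ++counts[std::type_index(typeid(ref))];
    }
    std::vector<type_count> result(counts.begin(), counts.end());
    const auto n = static_cast<std::ptrdiff_t>(std::min(top_n, result.size()));
    std::partial_sort(result.begin(), result.begin() + n, result.end(),
                      [](const type_count& a, const type_count& b) {
                          return a.second > b.second;
                      });
    result.erase(result.begin() + n, result.end());
    return result;
}
```

The early return keeps the steady-state cost to a single size comparison; the expensive scan only happens above the threshold, which is exactly when the reactor is already busy.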

Checkers like this could potentially be added to any sort of queue that can become excessively large and for which there is some useful information to report (IO queues with large delays could perhaps tell us what kind of requests are being submitted).

The biggest problem is definitely the cost of such dynamic checkers (not so much checking for the trigger as the actual analysis of the application state once the situation is bad). Even if reporting is limited to at most once every N minutes, preparing a single report may be an expensive operation, and it will happen when the server is already very stressed. This doesn't matter if the program is not going to recover anyway, but there is no way of telling that. We could make the triggering thresholds configurable (and use much lower ones in our own tests) or limit the accuracy of the reports (e.g. present information based on a random sample instead of the full data).
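The two cost-limiting ideas above (at most one report per interval, and reports built from a random sample) could look roughly like this. Again a sketch under assumptions: `report_limiter` and `sample_for_report` are hypothetical names, and the caller supplies the current time so the limiter does not depend on any particular clock source.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <iterator>
#include <optional>
#include <random>
#include <vector>

// Allows at most one report per `min_interval`.
class report_limiter {
    using clock = std::chrono::steady_clock;
    clock::duration _min_interval;
    std::optional<clock::time_point> _last;
public:
    explicit report_limiter(clock::duration min_interval)
        : _min_interval(min_interval) {}

    // Returns true if a report may be produced at time `now`,
    // and records `now` as the time of the latest report.
    bool try_acquire(clock::time_point now) {
        if (_last && now - *_last < _min_interval) {
            return false;
        }
        _last = now;
        return true;
    }
};

// Picks at most `max_samples` elements uniformly at random, so that the
// report is built from a bounded subset instead of the full data.
template <typename T>
std::vector<T> sample_for_report(const std::vector<T>& population,
                                 std::size_t max_samples,
                                 std::mt19937& rng) {
    std::vector<T> out;
    out.reserve(std::min(max_samples, population.size()));
    std::sample(population.begin(), population.end(),
                std::back_inserter(out), max_samples, rng);
    return out;
}
```

Sampling bounds the size of the data the report is built from, at the price of possibly missing rare object types; `std::sample` itself still walks the population once, so a real checker would want a cheaper way to pick candidates.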

avikivity commented 6 years ago

Tasks might not be in task queues. We can, however, keep an active count of live tasks in the task constructor/destructor and check if it's "too high", but then we have no idea where they are.
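The live-task count mentioned above might be sketched as follows. `counted_task` is a hypothetical base class, not Seastar's task type, and the `thread_local` counter is an assumption standing in for per-shard state.

```cpp
#include <cstddef>

// Hypothetical base class that counts live instances on the current thread.
class counted_task {
    // One counter per thread, so no atomics are needed.
    static inline thread_local std::size_t _live = 0;
public:
    counted_task() { ++_live; }
    counted_task(const counted_task&) = delete;
    counted_task& operator=(const counted_task&) = delete;
    virtual ~counted_task() { --_live; }

    // Number of tasks currently alive on this thread, queued or not.
    static std::size_t live_count() { return _live; }

    // The "too high" check; the threshold is left to the caller.
    static bool too_many(std::size_t threshold) { return _live > threshold; }
};
```

This is cheap (one increment and one decrement per task), but the counter only says how many tasks exist, not which queue or continuation chain is holding them.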

pdziepak commented 6 years ago

If a task is not in a task queue, then it is not pending and not a problem for a tool that is supposed to analyse situations in which the reactor is overloaded with pending tasks. When the number of non-pending tasks grows too large, we will end up with excessive memory consumption, and that checker will get us the vtables of the small objects that are the problem.
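For the memory side, the trigger could be as simple as comparing non-LSA usage against total memory, following the LSA vs. non-LSA idea from the original proposal. This is only an illustration: `memory_snapshot` and `non_lsa_usage_excessive` are hypothetical names, and how the numbers are obtained from the allocator is left open.

```cpp
#include <cstddef>

// Hypothetical snapshot of one shard's memory, in bytes.
struct memory_snapshot {
    std::size_t total;  // memory available to the shard
    std::size_t used;   // all allocated memory, LSA included
    std::size_t lsa;    // portion of `used` owned by the LSA allocator
};

// Returns true when memory allocated outside LSA exceeds
// `max_non_lsa_fraction` of the total (e.g. 0.25 for 25%).
bool non_lsa_usage_excessive(const memory_snapshot& m,
                             double max_non_lsa_fraction) {
    if (m.total == 0 || m.used < m.lsa) {
        return false;  // empty or inconsistent snapshot: do not trigger
    }
    const double non_lsa = static_cast<double>(m.used - m.lsa);
    return non_lsa / static_cast<double>(m.total) > max_non_lsa_fraction;
}
```

Once this fires, the expensive part (collecting vtables of small objects) would run, ideally behind the same kind of rate limiting as the task checker.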

Basically, what I am proposing is to turn Scylla's GDB commands scylla task_stats and scylla task_histogram[1] into dynamic checkers implemented in Seastar and triggered under certain conditions. Once we encounter more problems whose debugging could be at least partially automated, we could add more dynamic verification.

[1] The name of this one seems to be inaccurate: it prints vtables of small objects, which just happen to be tasks most of the time.

gleb-cloudius commented 6 years ago

On Wed, Oct 18, 2017 at 03:58:16AM -0700, Paweł Dziepak wrote:

[1] The name of this one seems to be inaccurate, it prints vtables of small objects, which just happen to be tasks most of the time.

Disclaimer: Avi came up with the name. I happen to agree with you.

-- Gleb.

tgrabiec commented 6 years ago

2017-10-18 12:58 GMT+02:00 Paweł Dziepak:

Basically, what I am proposing is to make Scylla's GDB commands scylla task_stats and scylla task_histogram[1] dynamic checkers implemented in seastar and triggered in certain condition.

For pending tasks, the command is "scylla task-stats".

scylladb / seastar

Add dynamic checkers for excessive memory consumption and pending task backlog #348