Open ohamel-softwaresecure opened 4 years ago
Hi,
I really advocate clause-level caching. If I understood you correctly, this would involve specifying a cached rule or relation (perhaps with a "cached" directive) and writing it somewhere (say disk, as a first go) once it is fully computed, so that it can be read back in as an EDB in the next computation if none of its dependencies (in the SCC graph) have changed.
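Just to make the idea concrete, a minimal sketch of how the reuse check could work. The fingerprinting scheme and names here are purely illustrative, not an existing Souffle facility: hash the clauses the cached relation depends on, and reload the on-disk facts only when that fingerprint matches the one recorded by the previous run.

```cpp
// Illustrative only: decide whether cached facts can be reused by
// fingerprinting the clauses in the relation's SCC-level dependencies.
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

// FNV-1a hash over the textual form of every clause the cached relation depends on.
std::uint64_t fingerprintDeps(const std::vector<std::string>& dependencyClauses) {
    std::uint64_t h = 1469598103934665603ull;        // FNV offset basis
    for (const auto& clause : dependencyClauses) {
        for (char c : clause) {
            h ^= static_cast<unsigned char>(c);
            h *= 1099511628211ull;                   // FNV prime
        }
    }
    return h;
}

// Reuse the on-disk facts only if the dependency fingerprint is unchanged.
bool canReuseCache(const std::filesystem::path& stampFile, std::uint64_t current) {
    std::ifstream in(stampFile);
    std::uint64_t stored = 0;
    return static_cast<bool>(in >> stored) && stored == current;
}
```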
I know David Zhao is working on more sophisticated incrementalism but the clause level would be a big help and something I've been advocating for. Thanks for bringing it up.
Cheers,
Paul
On Thu, Apr 9, 2020 at 11:17 PM ohamel-softwaresecure wrote:
Currently there are a number of performance bottlenecks in Souffle. The following is a partial list of places where we have unacceptable run-time complexity. These can be triggered when processing sufficiently large programs (e.g. 7k+ clauses).
General Issues:
Analyses aren't incremental. Right now any transform change invalidates all analyses. This is painful, since a tiny local change invalidates expensive program-wide analyses (e.g. type-checking). As a compromise perhaps it would be possible to have clause-level caching of some analyses?
Specific Issue Instances:
RecursiveClauses: The implementation is absurdly naive and expensive (a brute-force recursion check). Proposed fix: Compute a proper SCC. Generalise RecursiveRelation into an AnalysisSCC and share the implementation. PR is in the works.
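For reference, a sketch of the general direction (illustrative names, not the actual PR): run Tarjan's algorithm once over the relation precedence graph, then answer "is this clause recursive?" with an SCC lookup rather than a per-clause reachability search.

```cpp
// Sketch: one SCC computation over relation ids, reused for all clauses.
#include <algorithm>
#include <map>
#include <vector>

struct SCCAnalysis {
    std::map<int, std::vector<int>> succ;   // relation id -> relation ids used in its bodies
    std::map<int, int> sccOf;               // relation id -> SCC id

    std::map<int, int> index, low;
    std::map<int, bool> onStack;
    std::vector<int> stack;
    int counter = 0, nextScc = 0;

    void strongConnect(int v) {
        index[v] = low[v] = counter++;
        stack.push_back(v);
        onStack[v] = true;
        auto it = succ.find(v);
        if (it != succ.end()) {
            for (int w : it->second) {
                if (!index.count(w)) {
                    strongConnect(w);
                    low[v] = std::min(low[v], low[w]);
                } else if (onStack[w]) {
                    low[v] = std::min(low[v], index[w]);
                }
            }
        }
        if (low[v] == index[v]) {            // v roots a strongly connected component
            int w;
            do {
                w = stack.back();
                stack.pop_back();
                onStack[w] = false;
                sccOf[w] = nextScc;
            } while (w != v);
            ++nextScc;
        }
    }

    void run() {
        for (const auto& entry : succ) {
            if (!index.count(entry.first)) strongConnect(entry.first);
        }
    }
};

// A clause is recursive iff its head relation shares an SCC with some body relation.
bool isRecursive(const SCCAnalysis& g, int head, const std::vector<int>& body) {
    for (int b : body) {
        if (g.sccOf.count(head) && g.sccOf.count(b) && g.sccOf.at(head) == g.sccOf.at(b)) return true;
    }
    return false;
}
```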
getLeastCommonSupertypes: Iterates over every type in the environment on each invocation. This is called on every type system constraint solver iteration (if a constraint makes use of it). Hacky fix: This can be dramatically improved by adding a proper subtype DAG and walking upwards (O(min(m, n)) instead of O(|T|), where m and n are the subtype depths; in practice m and n tend to be very small). A PR can be made available. Proper fix: The type system is awful, this is known. Perhaps we can implement an HM inference solver rather than use a more general fixed point constraint solver?
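A sketch of the upward walk, under the simplifying assumption that every type has a single parent (a full subtype DAG needs a bit more care); the types here are illustrative, not Souffle's:

```cpp
// Least common supertype by walking parent chains, bounded by the chain
// depths instead of scanning every type in the environment.
#include <string>
#include <unordered_set>

struct TypeNode {
    std::string name;
    const TypeNode* parent = nullptr;   // nullptr for a root type
};

const TypeNode* leastCommonSupertype(const TypeNode* a, const TypeNode* b) {
    std::unordered_set<const TypeNode*> ancestorsOfA;
    for (const TypeNode* t = a; t != nullptr; t = t->parent) ancestorsOfA.insert(t);
    for (const TypeNode* t = b; t != nullptr; t = t->parent)
        if (ancestorsOfA.count(t)) return t;   // first shared ancestor seen from b
    return nullptr;                            // no common supertype
}
```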
AstQualifiedName::operator<: This is a moderately hot function since AstQualifiedName is used as a key in lookup maps. In these cases we don't need lexical ordering, any ordering will do (or a hash, if unordered_map). Proposed fix: Add qualified name interning and use the interned ID instead for O(1) mapping. I have a patch that does global interning; each AstQualifiedName memoizes its interned ID on request. PR for the proposed fix can be made available.
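The gist of interning, as a standalone sketch (an illustrative class, not the actual patch):

```cpp
// Each distinct qualified name gets a small integer id exactly once, so maps
// can be keyed on the id and comparisons become integer comparisons.
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

class NameInterner {
public:
    std::size_t intern(const std::string& qualifiedName) {
        auto [it, inserted] = ids_.try_emplace(qualifiedName, names_.size());
        if (inserted) names_.push_back(qualifiedName);
        return it->second;                                // O(1) amortized
    }
    const std::string& resolve(std::size_t id) const { return names_.at(id); }
private:
    std::unordered_map<std::string, std::size_t> ids_;
    std::vector<std::string> names_;
};
```

The string only needs to be resolved again when printing, e.g. in error reports that still want a human-readable, lexicographic order.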
visitDepthFirst: Nothing wrong with it, but in several cases it's used for exists-like computations and could bail out early.
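Something along these lines (a sketch with a stand-in node type, not the real visitor infrastructure):

```cpp
// "Exists"-style traversal: the predicate returning true cuts off the walk,
// so the remaining subtrees are never visited.
#include <functional>
#include <memory>
#include <vector>

struct Node {
    std::vector<std::unique_ptr<Node>> children;
};

bool existsDepthFirst(const Node& node, const std::function<bool(const Node&)>& pred) {
    if (pred(node)) return true;                 // early exit on first match
    for (const auto& child : node.children)
        if (existsDepthFirst(*child, pred)) return true;
    return false;
}
```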
ephemeral large vector copies: In the AST there are a number of places where instead of returning vec<ptr<T>> const& we produce and return vec<T*>. AFAIK the motivation for this is to minimize coupling between who owns what in the AST. This isn't too bad for small vectors accessed infrequently, but does show up for us in cases such as AstProgram::getRelations and AstProgram::getClauses, and their clients (e.g. the O(n) one-shot helpers in AstUtils.h, ProvenanceTransformer, etc.). Proposal: 1) Return vec<ptr<T>> const& for AstProgram. Generally speaking I'm not convinced the AST shuffles around that much, and it certainly won't for AstProgram since that's the root. 2a) Kill off the helpers in AstUtils; the RelationDetailCache analysis assembles lookup tables, and other transforms should use it instead. 2b) Kill off the helper transform and store AstRelation and AstClause in a by-name lookup multimap (effectively inlining RelationDetailCache into AstProgram). I'm leaning towards 2b. Can provide PR. Any other opinions?
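A sketch of what option 2b could look like, with illustrative names rather than the real AstProgram API: the program owns clauses in a by-name multimap, so per-relation lookups no longer require copying or scanning the whole clause list.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Clause { std::string headName; /* body, etc. */ };

class Program {
public:
    void addClause(std::unique_ptr<Clause> clause) {
        clausesByName_.emplace(clause->headName, std::move(clause));
    }
    // All clauses for one relation, without rebuilding a program-wide vector.
    std::vector<const Clause*> getClauses(const std::string& relName) const {
        std::vector<const Clause*> result;
        auto [lo, hi] = clausesByName_.equal_range(relName);
        for (auto it = lo; it != hi; ++it) result.push_back(it->second.get());
        return result;
    }
private:
    std::multimap<std::string, std::unique_ptr<Clause>> clausesByName_;  // owning storage
};
```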
These are performance bugs in Souffle that show up without even executing the code (parsing / checking / analysing, etc.). We have not had to deal with seriously large codebases so far. We recently saw some issues related to performance bugs in the semantic checker with GrammaTech's ddisasm project.
We had issues with RecursiveClause in the semantic checker in the past, and they crept up on us again with the latest refactoring. The AST was seriously polluted and we made it more or less PODS. However, this had some implications for the way clauses of relations are found. I definitely agree that we need to fix this asap.
The type system is not very efficiently implemented / we use a very naive algorithm for finding a solution in the constraint-based solver used for deducing the types of AST nodes in rules. Alex Aiken (from Stanford) has very nice papers about this topic.
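Not Souffle's actual solver, but as an illustration of the difference, a generic worklist-style fixed point over an integer lattice: only constraints whose inputs actually changed are re-evaluated, instead of re-running every constraint each round until nothing moves.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

struct Constraint {
    std::vector<std::size_t> inputs;                     // variable ids this constraint reads
    std::size_t output;                                  // variable id it may raise
    std::function<int(const std::vector<int>&)> apply;   // must be monotone in its inputs
};

std::vector<int> solve(std::size_t numVars, const std::vector<Constraint>& cs) {
    std::vector<int> value(numVars, 0);                  // bottom of the lattice
    std::vector<std::vector<std::size_t>> dependents(numVars);
    for (std::size_t i = 0; i < cs.size(); ++i)
        for (std::size_t v : cs[i].inputs) dependents[v].push_back(i);

    std::queue<std::size_t> work;
    for (std::size_t i = 0; i < cs.size(); ++i) work.push(i);

    while (!work.empty()) {
        const Constraint& c = cs[work.front()];
        work.pop();
        std::vector<int> args;
        for (std::size_t v : c.inputs) args.push_back(value[v]);
        int next = std::max(value[c.output], c.apply(args));   // join with the old value
        if (next != value[c.output]) {
            value[c.output] = next;
            for (std::size_t dep : dependents[c.output]) work.push(dep);  // re-queue affected constraints only
        }
    }
    return value;
}
```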
We required AstQualifiedName::operator< so that error messages are recorded in order and we obtain the same error messages deterministically. If we don't have deterministic output, we cannot pass our regression tests. I think we need to be more careful to distinguish between reporting containers/sets where we require a strict lexicographical order and those where we don't need it, for performance reasons.
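One way to keep both properties, sketched with made-up diagnostic types: internal lookups can use unordered containers (or interned IDs), as long as the diagnostics are sorted once just before they are reported, so the regression output stays deterministic.

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <tuple>
#include <vector>

struct Diagnostic {
    std::string location;   // e.g. "file.dl:12:3"
    std::string message;
};

void report(std::vector<Diagnostic> diags) {
    // Sort once at the reporting boundary; unordered containers upstream are fine.
    std::sort(diags.begin(), diags.end(), [](const Diagnostic& a, const Diagnostic& b) {
        return std::tie(a.location, a.message) < std::tie(b.location, b.message);
    });
    for (const auto& d : diags) std::cerr << d.location << ": " << d.message << "\n";
}
```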
The AST tree walks are very crude at the moment. It would be great to have a notion of CUT operations, i.e., we stop searching as soon as we have the first hit. This requires some work. Also, we could improve the naming of the tree traversals, and it would be great to refactor some of this code that is not used at all; it is in some cases overly complicated.
The method getChildNodes() retrieves the pointers of the sub-trees, and getters translate smart-pointer vectors to const vectors, etc. In our design there is always this tension between smart pointers and const references to objects. If we made a change here, it would require a more general solution. The question is whether we continue using smart pointers (NB: they are great because we have hardly any memory issues, e.g., memory leaks) or whether we implement some notion of automated garbage collection for the AST / RAM. This was inflicted by the switch to smart pointers.
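For what it's worth, a small sketch of the middle ground (a stand-in node type, not the actual AST classes): the node keeps ownership via unique_ptr, while getChildNodes() hands out non-owning raw pointers, so callers get a cheap read-only view without ownership changing hands.

```cpp
#include <memory>
#include <vector>

struct AstNodeLike {
    std::vector<std::unique_ptr<AstNodeLike>> children;   // owning storage stays here

    std::vector<const AstNodeLike*> getChildNodes() const {
        std::vector<const AstNodeLike*> view;
        view.reserve(children.size());
        for (const auto& child : children) view.push_back(child.get());
        return view;                                       // small, pointer-only copy
    }
};
```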