YosysHQ / yosys

Yosys Open SYnthesis Suite
https://yosyshq.net/yosys/
ISC License
3.37k stars 874 forks source link

ASTNode::simplify has supralinear performance with deep nesting of expressions #4562

Open whitequark opened 3 weeks ago

whitequark commented 3 weeks ago

Version

Yosys 0.39+1 (git sha1 f7153573c, ccache clang++ 14.0.6 -O0 -fPIC)

On which OS did this happen?

Linux

Reproduction Steps

Run yosys stackoverflow.v -p hierarchy while commenting out some of the 512 levels of nesting (with stackoverflow.v being the example from https://github.com/YoWASP/yosys/issues/37) and observe time plus stack usasge, for example with valgrind --tool=drd --main-stacksize=16777216 --show-stack-usage=yes.

Expected Behavior

Time is linear in input size.

Actual Behavior

Time is quadratic in input size:

Screenshot_20240822_190839

(link to the spreadsheet)

LibreOffice Calc provides the following polynomial fit:

Screenshot_20240822_190905

Also, using 24.5K per nesting level seems excessive and results in issues such as https://github.com/YoWASP/yosys/issues/37.

widlarizer commented 3 weeks ago

using 24.5K per nesting level

I ran massif on stackoverflow.v. In the peak snapshot, I can see

92.53% (3,345,110B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->32.73% (1,183,104B) 0x720AB4: Yosys::AST::AstNode::clone() const (frontends/ast/ast.cc:253)
| ->32.69% (1,181,664B) 0x720BB7: Yosys::AST::AstNode::clone() const (frontends/ast/ast.cc:256)
| | ->32.63% (1,179,648B) 0x720BB7: Yosys::AST::AstNode::clone() const (frontends/ast/ast.cc:256)

and so on, there's a lot of deep copies of nodes. There's also a lot of memory spent on parser-constructed AstNodes. Edited massif.log. The sizeof(AstNode) is 288B according to a dump I did some weeks back. This means we need 85 AstNodes per ternary nesting level.

whitequark commented 3 weeks ago

That's 24.5K of stack memory, not heap memory. (I'm sure heap use isn't great either, but that's not what I'm talking about.)

You only get 1M of stack on Windows and macOS, so beyond 80 levels or so Yosys crashes.

widlarizer commented 3 weeks ago

...right, of course. valgrind --tool=massif --heap=no --stacks=yes ./yosys garbage/stackoverflow.v -p "hierarchy" says we get up to 1,335,808 bytes of stack, but can't attribute it to its sources. perf shows that AstNode::detectSignWidthWorker takes up 90% of the runtime.

widlarizer commented 3 days ago

Just noticed that we're raising the stack size in a linux ifdef in kernel/rtlil.cc