YosysHQ / yosys

Yosys Open SYnthesis Suite
https://yosyshq.net/yosys/
ISC License

Area and timing constraints for ASIC synthesis #4257

Open wyt6139365 opened 4 months ago

wyt6139365 commented 4 months ago

Feature Description

I checked the Yosys manual and found little info about the following issues:

phsauter commented 3 months ago

Yosys itself is not timing-driven and does not take timing into account. Most passes aim at reducing logic (which can even hurt timing, e.g. when it merges important paths) or have fixed implementations that try to strike some balance (e.g. the booth pass for multipliers).

However, at the end of the script you will run the abc command, which in turn calls the integrated ABC logic optimization tool. ABC also uses a script to guide its optimizations (see the help of the abc command for the defaults).
In ABC, some commands accept delay goals, which you can set via the -D flag (it just replaces the {D} placeholder in the script).
So when trying to improve timing and area, you need to try (or write) different ABC scripts. You can find some in OpenROAD-flow-scripts and OpenLane, and you can achieve reasonable results with them.
On a larger design (1-1.5 million NAND-gate equivalents) we get pretty good performance by using Lazy Mans Synthesis, but I would recommend you contact me directly if you are working on a more serious project; then we can talk about it in detail.
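
As a rough illustration of the -D mechanism described above, the fragment below passes a delay target to ABC together with an inline script. Everything specific here is an assumption made for the example: the file names `design.v` and `cells.lib`, the top module name `top`, the 2000 ps target, and the ABC command sequence itself, which is not a tuned recipe. In the `+` form of `-script`, commas stand in for spaces, and `{D}` is filled in from `-D`.

```yosys
# Hypothetical inputs: design.v, cells.lib, top module "top".
read_verilog design.v
synth -top top
dfflibmap -liberty cells.lib

# -D is given in picoseconds. Each {D} in the inline script is replaced
# by the delay target, so "map,{D}" reaches ABC as "map -D 2000".
abc -liberty cells.lib -D 2000 -script +strash;dch,-f;map,{D};stime,-p

opt_clean
# Report cell count and area using the library's area data.
stat -liberty cells.lib
```

Dropping `-script` makes abc fall back to its default script for the given library options, which is a reasonable baseline to compare custom scripts against.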

Another important note: ABC will only ever get one module at once (and if you use -dff for sequential optimization, only one unique combination of clock, enable and reset). This makes it rather important to consider to what extent you should flatten your hierarchy. For smaller designs you will likely get the best performance if you flatten everything; for larger designs this isn't always the case.
If you really need to use -dff (I would recommend you don't), you will likely want to use dfflibmap, dfflegalize and maybe even dffunmap to remove the built-in enables and synchronous resets from the FFs and map them into logic (MUXes). This reduces the number of clock domains and increases the average size of the netlists given to ABC, which widens the scope it can optimize and (in general) improves results.
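
A minimal sketch of the flattening and -dff points above, under the same assumptions as any small example: `design.v`, `cells.lib`, the top module name `top` and the 2000 ps target are placeholders, and the pass order shown is one plausible arrangement, not a recommended flow. dfflegalize is left out because its `-cell` arguments depend on which FF types the target library provides.

```yosys
# Hypothetical inputs: design.v, cells.lib, top module "top".
read_verilog design.v

# -flatten removes the hierarchy so ABC is handed one large module.
# For larger designs, compare against a run without -flatten.
synth -top top -flatten

# Turn FFs with clock enable or synchronous reset into plain FFs
# plus MUX logic on the data input, so fewer distinct
# clock/enable/reset combinations remain.
dffunmap

# Sequential optimization: FFs are passed through ABC as well,
# one clock domain at a time.
abc -dff -liberty cells.lib -D 2000

# Map the remaining generic FFs to library cells.
dfflibmap -liberty cells.lib
opt_clean
stat -liberty cells.lib
```

Running the same script with and without `dffunmap` and comparing the `stat` output is a simple way to check whether the extra MUX logic pays off for a given design.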