ucb-bar / chipyard

An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
https://chipyard.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Genus segmentation fault in ChipTop synthesis #722

Closed baichen318 closed 3 years ago

baichen318 commented 4 years ago

Hi all,

I have a question about the VLSI flow in Chipyard.

Synthesis always fails when I use Genus to synthesize the ChipTop module, the top module of an SoC (e.g., the BOOM SoC with SmallBoomConfig). Every time, Genus reports a segmentation fault and exits abnormally after running for a long time. My Genus version is 17.14-s037_1.

If you have encountered this problem, I would like to know how you solved it. Thank you very much.

Best regards, Chen

lsteveol commented 4 years ago

How much RAM does the server have? On a 4-core BigRocket with 2MB of L2 I use around 14GB of memory and the run takes around 9 hours. This is with the memories replaced. Letting the memories synthesize as flops took about 50% longer, and I believe I was approaching 25-30GB of RAM usage.

I also don't use the ChipTop but the DigitalTop as the synthesis top level.

baichen318 commented 4 years ago

Hi @lsteveol ,

Thank you very much for your attention.

My server has more than 500GB of available memory, so I don't believe memory is the limit. After the super-thread servers finish loading libraries, Genus spends a long time distributing super-thread jobs and eventually exits abnormally.

I was wondering if you use the command make buildfile MACROCOMPILER_MODE='--mode synflops' CONFIG=Sha3RocketConfig VLSI_TOP=ChipTop under the vlsi folder to synthesize ChipTop?

Thank you very much!

Best regards, Chen

colinschmidt commented 4 years ago

In my experience a tool segfault is most commonly solved by updating tool versions, especially when you are using a well tested flow that is known to work such as the Hammer Cadence flow.

Things to try: Do you have access to a newer version of the tool? Can you synthesize other non-BOOM designs?

baichen318 commented 3 years ago

Hi @colinschmidt

Thank you very much for your attention. Regarding your questions:

  1. Which version of Genus do you use to synthesize ChipTop? I would like to try that version. Thanks.
  2. I can synthesize other non-BOOM designs, such as the SHA3 accelerator, with my current Genus. It is the example provided in the documentation (i.e., https://chipyard.readthedocs.io/en/latest/VLSI/Tutorial.html).

Thank you very much!

Best regards, Chen

colinschmidt commented 3 years ago

I have been using 18.13 recently, but I'm not sure how up to date my Chipyard commit was when I was doing that, and I haven't synthesized BOOM recently.

If you can synthesize other things, I would assume you might be hitting a tool bug, so trying a newer version of Genus is a good idea. You could also try changing the BOOM config to see if that helps you avoid this unusual bug.

baichen318 commented 3 years ago

Thank you very much for the information.

In the vlsi folder, I use the command make syn MACROCOMPILER_MODE='--mode synflops' CONFIG=SmallBoomConfig to run synthesis. It has been suggested to me that, because ASAP7 doesn't provide vendor memories, Genus replaces them with flip-flops, and synthesizing a large L2 cache this way may take so much effort that it leads to the crash. How can I incorporate vendor memories with ASAP7 to help synthesis?

Thank you very much!

colinschmidt commented 3 years ago

If you think it is the L2 cache causing issues, you can build a configuration without it to confirm that it is the cause. ASAP7 doesn't have vendor memories available, so you will either have to figure out some way to build them yourself (very difficult but technically possible; see this paper: https://ieeexplore.ieee.org/abstract/document/8050316) or switch to another technology that has SRAMs already provided. Commercial technologies will have SRAMs, but if commercial technologies are unavailable to you, you could also see whether any of the other open-source PDKs provide them.

harrisonliew commented 3 years ago

Actually, if you are just doing design space exploration, you are welcome to try using the dummy SRAMs we made for ASAP7 for the purposes of teaching a class. They have no layout in them and have wonky timing models, so in our experience, we found that gate-level simulation will often disagree with synthesis and P&R.

There is a description of them here: https://github.com/ucb-bar/hammer/tree/master/src/hammer-vlsi/technology/asap7 and to use them, you will need to target the sram-cache.json file for MacroCompiler and the sram_compiler folder in the Hammer IR key vlsi.core.sram_generator_tool_path.
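As a rough sketch of the configuration described above, the snippet below emits a Hammer input file that points the SRAM generator at the ASAP7 plugin. Only the key vlsi.core.sram_generator_tool_path comes from this thread; the other two keys, the tool name sram_compiler as a value, and the plugin path are assumptions that should be checked against your Hammer checkout.

```python
import json

# Assumption: location of the ASAP7 technology plugin relative to the vlsi
# folder. Adjust this to wherever Hammer is checked out in your tree.
ASAP7_PLUGIN_DIR = "hammer/src/hammer-vlsi/technology/asap7"


def sram_generator_config(plugin_dir: str) -> dict:
    """Return Hammer IR keys that select the dummy SRAM generator.

    Only 'vlsi.core.sram_generator_tool_path' is named in the thread; the
    other two keys are assumptions about how Hammer selects the tool.
    """
    return {
        "vlsi.core.sram_generator_tool": "sram_compiler",
        "vlsi.core.sram_generator_tool_path": [plugin_dir],
        "vlsi.core.sram_generator_tool_path_meta": "append",
    }


if __name__ == "__main__":
    # Print the fragment as JSON so it can be saved and passed to Hammer
    # alongside the other input files.
    print(json.dumps(sram_generator_config(ASAP7_PLUGIN_DIR), indent=2))
```

The MacroCompiler side (targeting sram-cache.json) is not covered here, since the exact invocation depends on the Chipyard version.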

lsteveol commented 3 years ago

I did not think about this earlier since in my case it wasn't a segfault, but I had an issue where the TLMonitors were causing the Elaboration Stage to hang (15+ hours). I ripped them out through the Config and synthesis proceeded as normal. It was strange as I had included them in a prior run without any issue.

baichen318 commented 3 years ago

Thank you very much for your advice. I removed the L2 cache from the configuration and synthesized again. This time, Genus takes around 3 hours to synthesize ChipTop, and I finally get the corresponding *.mapped.v file. My understanding is that, since I don't have vendor memories, Genus spends a very long time synthesizing the L2 cache as flip-flops and eventually crashes with the segmentation fault. To synthesize ChipTop with the L2 cache on ASAP7, I would need vendor memories or self-built ones, and building them myself is too difficult.
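A back-of-the-envelope estimate shows why a cache synthesized as flops is so heavy: every storage bit becomes one flip-flop. The sketch below counts data-array bits only and ignores tags, metadata, and any ECC. The 512 KiB size is a hypothetical example, not a confirmed property of SmallBoomConfig; the 2 MiB size is the L2 capacity mentioned earlier in the thread.

```python
def flops_for_cache(capacity_kib: int) -> int:
    """Number of storage bits (one flip-flop each) in a cache data array."""
    return capacity_kib * 1024 * 8


# Hypothetical 512 KiB L2 versus the 2 MiB L2 mentioned earlier in the thread.
for label, kib in [("512 KiB (hypothetical)", 512), ("2 MiB", 2048)]:
    print(f"{label}: {flops_for_cache(kib):,} flip-flops")
```

Even the smaller case is over four million flip-flops before any read/write muxing logic is counted, which is consistent with synthesis finishing in about 3 hours once the L2 cache is removed.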

baichen318 commented 3 years ago

Hi @harrisonliew

Thank you very much for your attention.

I have tried those dummy SRAMs but failed to synthesize from the ChipTop module. I suspect my procedure may be at fault.

First, I used sram-cache-gen.py to generate the corresponding Hammer IR (i.e., chipyard.TestHarness.SmallBoomConfig.mems.hammer.json in the log file) with srams.txt as input (following https://github.com/ucb-bar/hammer/tree/master/src/hammer-vlsi/technology/asap7). Then I used MacroCompiler to compile; however, it reports errors that cause synthesis to fail. Sorry, I am new to Chipyard, so my steps may be wrong.

Please find my log file attached; it says that some memories are not supported.

Thank you very much!

macro-compiler.log

baichen318 commented 3 years ago

Hi @lsteveol, may I ask whether you use ASAP7 to synthesize from the ChipTop module? Thank you.

lsteveol commented 3 years ago

I synthesize the DigitalTop level (actually I instantiate the DigitalTop with additional logic around it). My setup is a 4-core LargeRocket with 2MB of L2, set to 1.5GHz, with most of the buses set to 750MHz, the PBUS set to 100MHz, and IO delays set to around 70% of the clock period. It takes Genus about 10 hours to synthesize and uses ~14GB of RAM on an i9-9980XE overclocked to 4GHz+.
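For reference, the clock targets above translate into the following periods and IO delay. This is plain arithmetic, assuming the 70% IO delay is measured against the 1.5GHz core clock, which the comment does not state explicitly.

```python
def period_ps(freq_mhz: float) -> float:
    """Clock period in picoseconds for a frequency given in MHz."""
    return 1e6 / freq_mhz


def io_delay_ps(freq_mhz: float, fraction: float) -> float:
    """IO delay expressed as a fraction of the clock period, in picoseconds."""
    return fraction * period_ps(freq_mhz)


# Clock targets quoted in the comment above.
for name, mhz in [("core", 1500.0), ("buses", 750.0), ("PBUS", 100.0)]:
    print(f"{name}: {period_ps(mhz):.1f} ps")

# Assumption: the 70% figure is relative to the 1.5 GHz core clock.
print(f"IO delay: {io_delay_ps(1500.0, 0.7):.1f} ps")
```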

harrisonliew commented 3 years ago

@baichen318,

Thanks for sending over the logs. After digging through my brain again, I realize why we don't recommend users try the dummy SRAMs in Chipyard: the dummy SRAMs don't have the necessary read/write ports required by a Rocket-based system, so your cache arrays will not get mapped properly. This is currently quite low on the TODOs for the ASAP7 plugin, but we might bump it up since this seems to have more external use now. Therefore, you might have to try a different technology.
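To illustrate the kind of mismatch described above, here is a simplified sketch of a port-compatibility check. It is not MacroCompiler's actual algorithm, and the port mixes used in the example are hypothetical; neither the real port list of the dummy SRAMs nor that of the cache arrays is given in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortSpec:
    read: int = 0         # read-only ports
    write: int = 0        # write-only ports
    readwrite: int = 0    # shared read/write ports
    masked: bool = False  # whether writes need a per-lane mask


def can_map(required: PortSpec, available: PortSpec) -> bool:
    """True if a library macro offers the port mix the memory needs.

    Simplified rule: port counts must match exactly, and a masked memory
    needs a macro that supports write masks.
    """
    same_ports = (required.read, required.write, required.readwrite) == (
        available.read, available.write, available.readwrite)
    mask_ok = available.masked or not required.masked
    return same_ports and mask_ok


# Hypothetical example: a memory needing one read and one write port cannot
# be mapped onto a library that only offers a single shared read/write port.
dummy_macro = PortSpec(readwrite=1)
cache_array = PortSpec(read=1, write=1)
print(can_map(cache_array, dummy_macro))
```

When no macro in the library passes a check like this, the memory is left unmapped, which matches the "not supported" messages in the attached log.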

baichen318 commented 3 years ago

Hi @lsteveol, thank you very much for the information.

baichen318 commented 3 years ago

@harrisonliew, thank you very much for your help.

baichen318 commented 3 years ago

Dear all,

I will close the issue now, since I found the root cause thanks to your help.

Thank you very much again.