ucb-bar / hammer

Hammer: Highly Agile Masks Made Effortlessly from RTL
BSD 3-Clause "New" or "Revised" License
253 stars 55 forks source link

asap-7 dual port SRAM #644

Closed Leon924 closed 1 year ago

Leon924 commented 2 years ago

Hi:

I am using asap-7 behavior model provided inside hammer. I was wondering whether both ports of SRAM2RW* instance will be used to write and read, because I want to substitute asap7-sram with other SRAMs of another technology. If one of them be only used to write and another used to read. Then 2prf (2 port register file) is suitable for reg-to-sram substitution. am I right ? or must I use DPSRAM whichever technology I choose ? or let's say,will configuration “rw, rw” / "mrw, rw" show up in top.mem.conf ? for now, I just see "write, read", "mwrite, read" cases

best, Leon

harrisonliew commented 2 years ago

The SRAM2RW models allow a read or write operation simultaneously on both ports (but of course, you should not have the same address on both ports). Yes, 2prf would be suitable.

For the .mem.conf spec for the ports, see this: https://github.com/ucb-bar/barstools/blob/314d80729e7b32e3c10c0da6734bbdc9a867916f/src/main/scala/mdf/macrolib/ConfReader.scala#L63-L66. The other situation you are asking would be a 1r1w port. The m denotes a masked write port.

Leon924 commented 2 years ago

The SRAM2RW models allow a read or write operation simultaneously on both ports (but of course, you should not have the same address on both ports).

I substitube asap7-srams with T28 SRAMs macro. But the rv64-p-simple test occurs "A-write and B-read ,same-addr" error (same as the condition you mentioned to must avoid). And then I double check the signals in asap-7 sram simulation case. see the picture below, I think it also have this kinda behavior, am I right?

  1. binary: rv64ui-p-simple
  2. signal path: TestDriver.testHarness.chiptop.system.tile_prci_domain.tile_reset_domain.boom_tile.frontend.bpd.banked_predictors_0.components_2.meta_0.meta_0_ext.mem_0_0.*
  3. TimeStamp: 789.5ns

1650376444(1)

jerryz123 commented 2 years ago

Avoiding the potential RAW on the same cycle is the responsibility of the microarchitecture. Using the Chisel 1R1W mem model, while the microarchitecture is allowed to issue a read and write to the same address on the same cycle, it must also be able to handle the undefined read data in this case.

In the example shown, the branch predictor is a case where RAW hazards doesn't matter, as the pipeline will do the right thing no matter what is read

Leon924 commented 2 years ago

Hi:

Thanks for the explanation. I have fixed this problem here.

Then I want to leverage hammer-synopsys-plugin to do synthesis with Design Compiler tool. when rewriting t28 tech plugin nested in $HAMMER_HOME/hammer/src/hammer-vlsi/technology, mock with asap7 case and schema.json , I revise the asap7.tech.json to t28.tech.json. But __init__.py line 127 -line 134 failed to get libraries. I thought there are some mismatch. Should I revise my tech json?

syn.log

t28.tech.json.txt

Best, Leon

Edit:and should I add all sram libraries description to tech.json? Edit: by the way, Could you please let me know whether hammer support power-syn now ( post-syn power analysis) ?

harrisonliew commented 2 years ago

A couple points:

  1. The DC plugin is very much orphaned. We haven't had any maintainers for the DC plugin for about 5 years, so its functionality may be very limited; use at your own risk.
  2. You should be using schema.json to build your own .tech.json file. However, keep in mind that ASAP7 is meant for use with Cadence tools, not Synopsys tools. Therefore, if you want to use DC, you must supply the Synopsys-specific files/dirs such as milkyway lib in dir, tluplus files, and tluplus map file in addition to what you have now. Also, it looks like from the syn log that there is an unfilled string - where is that from?
  3. In general, no, we only compile the SRAM library descriptions on a macro-by-macro basis (this speeds up loading time for very large SRAM libraries), but, you could put them in your .tech.json file if you wish, since they are effectively just another library.
  4. We are very close to opening a PR with power-syn using Cadence Joules, but not with Synopsys.
Leon924 commented 2 years ago

The DC plugin is very much orphaned. We haven't had any maintainers for the DC plugin for about 5 years, so its functionality may be very limited; use at your own risk.

Huh, what a pity! honestly, we really rely on T28 tech and synopsys dc+icc toolchain to realize physical design. So I will still try to deubg with it.

you must supply the Synopsys-specific files/dirs such as milkyway lib in dir, tluplus files, and tluplus map file in addition to what you have now. Also, it looks like from the syn log that there is an unfilled string - where is that from?

I have tried to excute it in debug mode to check all the parameter inputs to run_synthesis bash file.

Another question is : where does make syn process take .tech.json as input? I check the hammer.d file, only see it depends on sram_generator-output.json as dependence. Could I think all libs in .tech.json as global library shared by all designs?

We are very close to opening a PR with power-syn using Cadence Joules, but not with Synopsys

So Great job! could you please tell me when? Maybe in a month? Excited about it!

harrisonliew commented 2 years ago

Huh, what a pity! honestly, we really rely on T28 tech and synopsys dc+icc toolchain to realize physical design. So I will still try to deubg with it.

We had a contributor update the ICC (legacy) plugin recently, but not DC. It's possible that he found the DC plugin to be sufficient? https://github.com/ucb-bar/hammer-synopsys-plugins/commits/master?author=harikumar27399

where does make syn process take .tech.json as input?

The .tech.json file is supposed to be part of the technology plugin that you have made for T28. I think you've used the documentation to build it: http://docs.hammer-eda.org/en/latest/Technology/Tech-json.html. As long as you've selected the technology library properly, the .tech.json is implicitly loaded, no command-line necessary. It is shared globally by all designs that use the tech plugin.

Maybe in a month?

Probably less!

Leon924 commented 2 years ago

We had a contributor update the ICC (legacy) plugin recently, but not DC. It's possible that he found the DC plugin to be sufficient?

Thanks, I will checkout there.

the .tech.json is implicitly loaded, no command-line necessary. It is shared globally by all designs that use the tech plugin.

Got it !

And I wanna ask help for revising constraints file used for systhesis. Because I notice that ChipTop module have serial tl clock and axi4_mem_0_clock ouput signal , after track them,see that generated from ClockDivider.sv. In .top.v, I check instance of module ClockDividerN. only one instance have DIV=1. In this case, should I add create_generate_clock clause to indicate this clock as clock constraints? 1650860576(1) 1650860979(1)

harrisonliew commented 2 years ago

If you look at the clock constraint, you can specify generated: true to denote that it's a generated clock as well as the source_path and default/expected divisor.

Leon924 commented 2 years ago

Hi:

After synthesis, I wanna give multiple benchmarks to sim parallely, could I assign mutiple binary inputs to the command by like this? make sim-syn CONFIG==TinyRocketConfig BINARY=bm1,bm2,bm3? Or kindly ask you what's the standard way to do this? does hammer support this behavior?

best, leon

harrisonliew commented 2 years ago

Unfortunately, Chipyard's Makefiles do not support that at the moment.

However, Hammer does. If you look at this line in the vlsi Makefile: https://github.com/ucb-bar/chipyard/blob/main/vlsi/Makefile#L163, if you can modify this to put a list of strings, one for each binary, then Hammer sim does support running them in parallel with the sim.inputs.parallel_runs key. Here is the source code for that: https://github.com/ucb-bar/hammer-synopsys-plugins/blob/696589f3fec890722539653a9b259dcc860c1f7d/sim/vcs/__init__.py#L325-L346

Leon924 commented 2 years ago

if you can modify this to put a list of strings, one for each binary, then Hammer sim does support running them in parallel with the sim.inputs.parallel_runs key

Thanks a lot. I have revise Makefile here then it can accept mutiple benchmarks input.

and after estimating post-synthesis power using two benchmarks simulation outputs saif. I found the average power of this two are so close. I was wondering that the "time" saif mode might affect this(0.0ns - 100,000,000ns), there is only a small section of activity inside complete waveform. after checking vcs tool plugin, I found there are three saif are provided there: time, trigger(unsupported), trigger raw. In trigger-raw mode, we need to provide start_trigger_raw amd end_trigger_raw. I 've never used this mode to do vcs simulation before. Could I dump the most active section of the waveform by this way? if possible, could you please give me an example of start_trigger_raw amd end_trigger_raw scripts?

harrisonliew commented 2 years ago

The trigger_raw mode is intended for you to set your own power dumping start/end triggers, and it is "raw" because you will need to write the exact UCLI commands into the keys. According to the UCLI user guide, I think you would need to use the stop command to insert breakpoints that triggers on something in your design (an event, time, line, or task), . I do not have an example, but I think this is what the relevant portion of the generated UCLI should look like:

stop <options>                      <-- this breakpoint is what you specify for start_trigger_raw
run                                 <-- this runs until the start breakpoint    
power -gate_level on
power <dut>
stop <options>                      <-- this breakpoint is what you specify for end_trigger_raw
run                                 <-- this runs until the end breakpoint
power -report ucli.saif 1e-9 <dut>
run                                 <-- this runs the simulation until the end
Leon924 commented 2 years ago

Thank you so much! I will try to figure that out.

Leon924 commented 2 years ago

Hi:

when I follow the hammer flow stage by stage, run the below command: make buildfile CONFIG=SmallBoomConfig make sim-rtl-debug CONFIG=SmallBoomConfig BINARY=xxx(this stage will trigger rtl rebuild) make syn CONFIG=SmallBoomConfig (no rebuild rtl) make sim-syn-debug CONFIG=SmallBoomConfig BINARY=xxx (this stage will also trigger rtl rebuild,and retrigger synthesis stage )

makefile L248-L249 under vlsi folder tell me to set HAMMER_D_DEPS to empty string will avoid rebuild rtl, but it didn't work after I tested it, could you please give me some tips how can I avoid this rtl rebuild behavior?

best, leon

harrisonliew commented 2 years ago

I have not seen this behavior before, as long as none of the input configuration (yml files) are modified between each step in the flow. Can you try setting BINARY for every step, not just the simulation steps? Additionally, there are redo- targets (see sim.mk) that break all dependencies in the generated Makefile.