Open hcallahan-lowrisc opened 2 months ago
reserved for updates
@lmg260a (Moving my reply to your comment from 23293 over here...)
Hi, That's all great news! What's the best approach for submitting code changes? Create a branch and then propose a merge, email files, something else?
<...>
So how should I raise those? As separate tickets? I've found that lots of small tickets are easier to track.
Also: should I use "[siemens]" as the ticket-group in the subject line, to make things easier?
I'd like to try and keep things centralized to this issue to start with. If there are common design patterns in the codebase causing problems, we can always spin discussions about them out into individual issues/tickets if it becomes too noisy within this thread. If we do open new issues, probably tagging with [siemens]
or [questa]
and linking back to this one will be best.
I think the best way to submit/suggest code changes would be to post a link to a branch on your github, or to paste the diff into this issue as a comment. I want to integrate all the changes into the working PR #24331 so I can regress it against all the tools I do have access to in one go before merging, so raising individual pull requests for changes is probably going to be quite a lot of noise at this point.
I've also found things which might be exactly what's wanted, but if not clearly documented it would be really easy for people to break things. For example, both a parent and child classes have constraints named the same. Per LRM, the child constraint overwrites the constraint in the parent. My problems are (1) I can't tell if that's deliberate or a bug. It should be clearly documented if it IS deliberate. (2) it may be that whatever simulator you're using doesn't interpret the LRM the same way I do, and it could be that (2.a) I'm wrong or (2.b) the LRM is subtlely vague - and so we're both wrong (or at least, neither of us can claim we're right).
Thanks for bringing this up. In the OpenTitan codebase, it's a common pattern to redefine a constraint in a child class intentionally to overwrite the parent class constraint. As you say, that is what the LRM specifies the behaviour should be. Based on your comment (1), is your previous experience that this should always be documented at every instance? Or is a comment about this something that should be incorporated into a project level style-guide? I'm interested in any comments about style generally, and if there is any documentation holes we can fill to clarify things. I don't think I have heard the comment about always commenting overridden class constraints before, but everyone has different industry experiences of course. Any docs we can provide to get everyone on the same page as quickly as possible are valuable!
So one thing I'm debugging is that one of the clock period is being set to zero, then being used to do a divide, which of course results in a divide-by-zero. So there's a race-condition someplace going on.
Can you checkout the working PR #24331, and share the dvsim command you are using to generate the test? Also could you share the variable/error, plus which assignment / randomization you suspect is incorrectly generating 0? I can run it locally and share some logs / values, or suggest where things might be going wrong.
Thanks!
@lmg260a (Moving my reply to your comment here https://github.com/lowRISC/opentitan/pull/24331#issuecomment-2303289278 over to the discussion issue...)
The example dvsim command from your comment (util/dvsim/dvsim.py hw/top_earlgrey/dv/chip_sim_cfg.hjson -i chip_sw_uart_tx_rx -n --tool questa
) won't run anything because -n
is a dry-run switch, which just logs the build commands and exits.
Could you try running without that switch, and let me know what is broken? Running via dvsim is the only supported flow, as it populates all of the defines etc. automatically, and that way we can know everyone is on the same page.
For example, if I checkout #24331 and run dvsim I get the following:
// I don't have questa, so set a dummy variable...
$ export QUESTA_HOME=/home/harry/questa/dummy
$ util/dvsim/dvsim.py hw/top_earlgrey/dv/chip_sim_cfg.hjson -i chip_sw_uart_tx_rx --tool questa
<EVERYTHING FAILS>
// The relevant logfile is at `<opentitan>/scratch/<branch>/chip_earlgrey_asic-sim-questa/default/build.log`, which shows all the commands which were run. Mine looks normal until...
[make]: build
cd /home/harry/projects/opentitan/scratch/dvsim_questa_fixes/chip_earlgrey_asic-sim-questa/default/sim-vcs && /home/harry/dummy/questasim/linux_x86_64/qrun -optimize +define+TOP_LEVEL_DV +define+UVM +define+UVM_NO_DEPRECATED +define+UVM_REGEX_NO_DPI +define+UVM_REG_ADDR_WIDTH=32 +define+UVM_REG_DATA_WIDTH=64 +define+UVM_REG_BYTENABLE_WIDTH=8 +define+SIMULATION +define+DUT_HIER=tb.dut -timescale 1ns/1ps -outdir /home/harry/projects/opentitan/scratch/dvsim_questa_fixes/chip_earlgrey_asic-sim-questa/default/qrun.out -uvm -uvmhome /home/harry/dummy/questasim/verilog_src/uvm-1.2 -mfcu -f /home/harry/projects/opentitan/scratch/dvsim_questa_fixes/chip_earlgrey_asic-sim-questa/default/sim-vcs/lowrisc_dv_chip_sim_0.1.scr -top clkmgr_bind -top pwrmgr_bind -top rstmgr_bind -top sec_cm_prim_onehot_check_bind -top sec_cm_prim_sparse_fsm_flop_bind -top spi_host_bind -top top_earlgrey_error_injection_ifs_bind -top top_earlgrey_bind -top xbar_main_bind -top xbar_peri_bind -top tb -voptargs="+acc=nr"
bash: line 1: /home/harry/dummy/questasim/linux_x86_64/qrun: No such file or directory
Printed with newlines, the invocation was:
cd /home/harry/projects/opentitan/scratch/dvsim_questa_fixes/chip_earlgrey_asic-sim-questa/default/sim-vcs &&
/home/harry/dummy/questasim/linux_x86_64/qrun
-optimize
+define+TOP_LEVEL_DV
+define+UVM
+define+UVM_NO_DEPRECATED
+define+UVM_REGEX_NO_DPI
+define+UVM_REG_ADDR_WIDTH=32
+define+UVM_REG_DATA_WIDTH=64
+define+UVM_REG_BYTENABLE_WIDTH=8
+define+SIMULATION
+define+DUT_HIER=tb.dut
-timescale 1ns/1ps
-outdir /home/harry/projects/opentitan/scratch/dvsim_questa_fixes/chip_earlgrey_asic-sim-questa/default/qrun.out
-uvm
-uvmhome /home/harry/dummy/questasim/verilog_src/uvm-1.2
-mfcu
-f /home/harry/projects/opentitan/scratch/dvsim_questa_fixes/chip_earlgrey_asic-sim-questa/default/sim-vcs/lowrisc_dv_chip_sim_0.1.scr
-top clkmgr_bind
-top pwrmgr_bind
-top rstmgr_bind
-top sec_cm_prim_onehot_check_bind
-top sec_cm_prim_sparse_fsm_flop_bind
-top spi_host_bind
-top top_earlgrey_error_injection_ifs_bind
-top top_earlgrey_bind
-top xbar_main_bind
-top xbar_peri_bind
-top tb
-voptargs="+acc=nr"
This CLI invocation looks okay to me at a pass. Do you see the same failure mode as you described in the other issue when invoking this way, or something else?
Some best practices that would help: 1) 4-state logic was shown to generate both optimistic (false-pass) and pessimistic (false-fail) results in simulation. This was back in the 90s. So it's much better to use 2-state, plus static tools (CDC/RDC) and property checking. Most simulators have the ability to initialize all flops to random 0/1 values at time0 - so you can use that if you are unsure about Xs. Some defensive programming I've learned helps over the years: A) use explicit logic to convert from 4-state to an enum with 4 states, or two 2-state bits; bugs are common when using casting b/c often you want to know that a signal is 'Z' or 'X', and the implicit cast just returns 0. B) use unit-testing (svunit) to verify code before it gets committed. One company reported that when they switched to unit-testing in 12 months they saw a 97% drop in bugs reported in the field. C) Ideally, use test-driven development: the main thing here is that it'll help you spot architectural or interface problems that are just going to be very expensive to verify, and really expensive to debug/maintain. Before you do any coding. D) Best practice in Agile is each developer spends 20% of their time just refactoring code: a couple of hours spent rewriting an interface can save days in verifying a really awkward interface design. E) Put all generated files outside the repo clone - that way, "git status" should only show the files that should be committed. F) I've seen a couple of schools of thought on big-picture repos: one is that you have one huge repo (but then any change means you have to rerun the full regression), or one for RTL, another for sim testbench, another for static/formal scripts. There's arguments for both - I just bring it up in case you find the idea useful. The other is that you have one really small repo per IP (so a 'opentitan_uart' repo, etc.). This way you have all the files involved in an IP in one location, and it greatly simplifies reuse and enhancement. G) For debugging purposes: I've found it really useful to have the verification environment emit a script (including all environment variable settings) that runs the work. This way, if anything goes wrong, you just take that script and attach it to the ticket so it can 100% be reproduced. Note: it really helps if the paths in the script are set using environment variable-names instead of being replaced with their values. I.e. the path to a file should be "$FOO/...", not "/a/b/c/...". With $FOO being set earlier in the script. H) The saying is: 10% documentation, 10% coding, and 80% verification. So really project success is all about rewriting the code to reduce verification cost.
I'd be happy to do any sort of online class or talk that might help your team on this - I think you've got an amazing thing here, and I'm just trying to do what I can to help. The problem with greatness is you can't ever stop improving, or else you just eventually wind up back at "good".
Hi Harry, Some things to change: -uvm -uvmhome /home/harry/dummy/questasim/verilog_src/uvm-1.2
to: -uvm -uvmhome uvm-1.2
This path has "sim-vcs" instead of "sim-questa" (happens multiple places) /home/harry/projects/opentitan/scratch/dvsim_questa_fixes/chip_earlgrey_asic-sim-questa/default/sim-vcs
Questa is stricter to the LRM than VCS, so a series of switches are required to tell Questa to relax rules to be VCS-compatible. This is one of them: -svext=hierbinsof
This shouldn't be required; in any case, +acc is going obsolete in 2025 to be replaced by other switches. -voptargs="+acc=nr"
Please add -permit_unmatched_virtual_intf to all simulation cases.
General questions: 1) why are all those "-top" required? For Questa, there's only one: "-top tb", since one only simulates one module in one sim. 2) why the full path to 'qrun'? /home/harry/dummy/questasim/linux_x86_64/qrun 3) using "-F" makes things more reusable, because all files are searched relative to where the .f file is located. "-f" means all paths are searched relative to where the qrun/vcs is being invoked. so a .f file has to have hardcoded paths instead of relative paths.
Could you please include what you get for running the sim in a) regression mode (don't dump any waveforms) b) batch (dump waveforms for later analysis) c) live (run interactive) d) post-processing (view results generated in batch mode)
Those will need changes as well.
Description
This is a bucket-issue for improving Questa support in OpenTitan.
Motive
I would like to get Questa support into a better place across the project. Questa is a simulator that many people have access to, and a part of maximizing value of the OpenTitan project is to keep the barrier to entry as low as possible for anyone who is interested. As lowRISC currently does not have access to questa licenses, nor is anyone regularly developing on OpenTitan with it or regressing the DV suite using it, the support there has always been limited and liable to break without notice. I hope one day we can test our codebase against more simulators and catch breaking changes before they are merged, but up to now it has been a best-effort approach.
Ctx
Questa support was added initially to our DV test tool 'dvsim' in #10574 - [hw/dv] Feature/questa dv, though I am not sure to what extent it was functional at the time. (i.e. which testbenches could be compiled/simulated, and which were incompatible). Our signoff simulator for blocks/tops is primarily VCS (for example. see the earlgrey top-level testbench config file chip_sim_cfg.hjson#L15), although some blocks now use Xcelium for signoff as well.
'dvsim' is extensible, and the support added above allows for invoking simulations using the questa CLI when you use
--tool=questa
. Fundamentally, dvsim should be able to assemble a valid CLI invocation for any tool it supports, with any number of arguments that are specific to the quirks of any program. Some parts of this process are quite generic between tools. For example, we use fusesoc .core files to package blocks and components, and dvsim will invoke fusesoc to generate output collateral such as a specific filelist for the simulation/testbench of interest. The tool-specific file /hw/dv/tools/dvsim/questa.hjson is then responsible for tying-up the generic parts of the flow with the implementation details of a tool.Existing Questa Tickets
I did a little search of previous issues and pull requests related to questa support. I've pulled out the ones I think might be useful as a reference here...
Issues
https://github.com/lowRISC/opentitan/issues?q=is%3Aissue+questa+
PRs
https://github.com/lowRISC/opentitan/pulls?q=is%3Apr+questa+
Fixing outstanding issues
To get our DV flows working with Questa more broadly, the bulk of the work will be in SV / LRM differences in how tools interpret our codebase. A familiar hurdle for sure!
This process will probably raise a number of issues in dv base classes to begin with, and then follow onto testbench code specific to each block.
Probably the best approach will be to work block-by-block through the individual testbenches, first fixing any build issues, followed by any runtime differences. The following table is a good starting list of some block level testbenches for non-parameterized blocks, and suggested invocations to test them. (I've just worked down the list of IP in
hw/ip
to make a start. This list can become a more comprehensive checklist of tested-blocks in the future.)./util/dvsim/dvsim.py hw/ip/adc_ctrl/dv/adc_ctrl_sim_cfg.hjson -i adc_ctrl_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/aes/dv/aes_masked_sim_cfg.hjson -i aes_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/aon_timer/dv/aon_timer_sim_cfg.hjson -i aon_timer_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/csrng/dv/csrng_sim_cfg.hjson -i csrng_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/edn/dv/edn_sim_cfg.hjson -i edn_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/entropy_src/dv/entropy_src_sim_cfg.hjson -i entropy_src_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/gpio/dv/gpio_sim_cfg.hjson -i gpio_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/hmac/dv/hmac_sim_cfg.hjson -i hmac_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/i2c/dv/i2c_sim_cfg.hjson -i i2c_host_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/keymgr/dv/keymgr_sim_cfg.hjson -i keymgr_smoke --tool questa
./util/dvsim/dvsim.py hw/ip/kmac/dv/kmac_masked_sim_cfg.hjson -i kmac_smoke --tool questa
The top-level testbench would eventually be desirable to fix, though I suspect it will be the most work to get there.
./util/dvsim/dvsim.py hw/top_earlgrey/dv/chip_sim_cfg.hjson -i chip_sw_rv_timer_irq --tool questa
After running smoketests for each block, probably the next thing would be to run 'all_once' regressions in each block. E.g.
./util/dvsim/dvsim.py hw/<ip>/dv/<ip>_sim_cfg.hjson -i all_once --tool questa
This should exercise all of the stimulus vseq's listed in thesim_cfg.hjson
configuration file. Stimulus sequences are located in the following directory for each block-level testbench :hw/<ip>/dv/env/seq_lib/
.Our existing convention has been to avoid
#ifdef
s unless absolutely necessary, so work in this area will need to prioritize finding common language constructs the tools agree on. There are a very small number of#ifdef XCELIUM
or#ifdef VCS
switches in the codebase, and adding more of these should be considered a last-resort. However, I think absorbing changes that use#ifdef
s in the short-term, tracked in issues with pending vendor support feedback, may be workable. For the working branch described below, short-term solutions using#ifdef
s will be acceptable.How you can help
To start with, I've created a draft PR to accumulate a working set of changes to fix bugs as they are found. As bugs and suggested fixes come in, we can discuss them in this issue, and then propagate them over to that PR.
Steps: 1) Setup the opentitan repository & dependencies by following the instructions in steps 1-4 here : https://opentitan.org/book/doc/getting_started/index.html
bazel test --test_output=streamed //sw/device/tests:uart_smoketest_sim_verilator
This will probably take a bit of time to get sorted out, as the feedback loop is inherently manual, and I don't have too many free cycles to put into this. However, I hope that centralizing the discussions here will keep the process moving forwards, and eventually we can get up to a good parity with the other tools.
Thanks!