riscv-non-isa / riscv-arch-test

https://jira.riscv.org/browse/RVG-141?src=confmacro
Apache License 2.0
509 stars 198 forks source link

running riscof on RTL model - signatures mismatches #312

Closed algrobman closed 1 year ago

algrobman commented 1 year ago

Hi,

I've finally managed to run compliance tests with riscof on our RTL model and got 16 passes and 74 fails.

diff for caddi-01 test has two extra zero values at the end of seil reference signature. I use rvtest_sig_begin and rvtest_sig_end labels values, to dump verilog memories at the end of simulation. I extract the labels values with nm utility from my.elf

%diff riscof_work/rv32i_m/C/src/caddi-01.S/dut/DUT.signature  riscof_work/rv32i_m/C/src/caddi-01.S/ref/Reference-sail_c_simulator.signature 
378a379,380
> 00000000
> 00000000

what do I miss?

HTML report for this test looks like:

commit_id:b91f98f3a0e908bad4680c2e3901fbc24b63a563
MACROS:
TEST_CASE_1=True
XLEN=32
File1 Path: riscof_work/rv32i_m/C/src/caddi-01.S/dut/DUT-s200.signature
File2 Path: riscof_work/rv32i_m/C/src/caddi-01.S/ref/Reference-sail_c_simulator.signature
Match  Line#    File1    File2

*also got real mismatches for misalign tests because our CPU supports misaligned data accesses
what can I do about these tests?**

allenjbaum commented 1 year ago

Misaligned tests are really, really nasty. Architecturally, misaligned loads&stores are allowed to be nondeterministic. Implementations are allowed to sometimes support, and sometimes not support them, even for the same addresses at the same instruction (e.g. in a loop). There is a way to tell Sail that misaligneds are supported; currently you have to rebuild the model with the right command line switch to do it, as I recall (that will change sometime in the not too distant future.) I don't know why the signature lengths are different, though, especially for the "add" tests. Are you using the latest version of arch-test.h? (from last week)

algrobman commented 1 year ago

I ran riscof setup this week. Does commit_id:b91f98f3a0e908bad4680c2e3901fbc24b63a563 say you something? Is there way not to run these misaligned tests?

Talking about CPU features - looks like riscof creates pretty detailed yaml description of the CPU from config.ini as I guess. But it does not match what our CPU has. For example, we are only M-mode CPU, but the yaml has mstatus bits found for MSU-mode CPUs. How do we tweak all of these features?

algrobman commented 1 year ago

Was my assumption correct that I need to dump addresses between rvtest_sig_begin and rvtest_sig_end into the signature file?

allenjbaum commented 1 year ago

Yes, that is correct. But if you get a mismatch in size between what Sail thinks the length is, and your DUT, you might see that. With the new arch-test, we've added Canary words to see if anything is getting overwritten. Try using that (and the new test-macros.h that was split off from it) and see if you're seeing the same results. Check that the RV_MODEL_DATA_BEGIN/END macros supplied don't try to align to a 16B boundary or something.

On Fri, Feb 10, 2023 at 4:48 PM algrobman @.***> wrote:

Was my assumption correct that I need to dump addresses between rvtest_sig_begin and rvtest_sig_end into the signature file?

— Reply to this email directly, view it on GitHub https://github.com/riscv-non-isa/riscv-arch-test/issues/312#issuecomment-1426538504, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJREV6PAIMM3WOHKL3DWW3OWPANCNFSM6AAAAAAUYKI6D4 . You are receiving this because you commented.Message ID: @.***>

algrobman commented 1 year ago

ref.disass end looks like:

800046e8:   deadbeef            jal t4,7ffdfcd2 <absimm+0x7ffd47d0>
800046ec:   deadbeef            jal t4,7ffdfcd6 <absimm+0x7ffd47d4>

800046f0 <sig_end_canary>:
800046f0:   a309                    j   80004bf2 <_end+0x4f2>
800046f2:   6f5c                    .2byte  0x6f5c

800046f4 <rvtest_sig_end>:

our simulation prints out last address 800046f4 ...

last lines in dut.signature:

ffff4b08
ffff4b04
6f5ca309
00000000

last lines in ref.signature:

ffff4b08
ffff4b04
6f5ca309
00000000
00000000
00000000

What should we do? Please, advice . PS. We compiled reference model on RedHat7 with tricks. Does your riscof/sail stuff work on RedHat Linux?

jrtc27 commented 1 year ago

Your problem is you're looking at the wrong end symbol. Consider:

https://github.com/riscv-software-src/riscof/blob/c2a134614f8b6201ac24488a0b9c2ea15ba91168/riscof/Templates/setup/sail_cSim/env/model_test.h#L30-L31

This forces end_signature, the thing that the Sail model looks at, to be 16-byte (2^4) aligned, since .align is .p2align not .balign, whereas the thing you're looking at, rvtest_sig_end, is a label that consistently comes before the RVMODEL_DATA_END macro use:

https://github.com/riscv-non-isa/riscv-arch-test/blob/933ddddff896453dec83b6ca1ca2453b6f584d0b/riscv-test-suite/rv32i_m/C/src/caddi-01.S#L1963-L1964

Thus unless rvtest_sig_end already happens to be 16-byte aligned it will differ from end_signature. You should presumably be using the latter. There's a reason the rvtestsig* symbols aren't marked as .glob(a)l whilst the begin/end_signature ones are.

TL;DR: Use begin/end_signature not rvtest_sig_begin/end, though this does seem like a bit of a mess on the riscv-arch-test/riscof side and seems to be undocumented. The Sail model is just doing exactly what it's been told to do by riscof, though.

algrobman commented 1 year ago

Thanks all , the wrong labels names was the reason for mismatch. Correct labels names to use with signature dumper are: begin_signature end_signature

the address range to print is begin_signature <= address < end_signature

allenjbaum commented 1 year ago

Hmm, I thought we removed the 16B alignment restriction and made it 4B aligned - and we haven't been having issues with that, so not quite sure where the mismatch is. The the riscof Sail plugin not configured correctly? Test are now using rvtest_sig_begin_end labels, and will generate signatures that fit into those, regardless of their 16B alignment. If that's broken somewhere, file a specific issue that says where that's not working so we can get it changed.

On Sat, Feb 11, 2023 at 6:09 PM algrobman @.***> wrote:

Thanks all , the wrong labels names was the reason for mismatch. Correct labels names to use with signature dumper are: begin_signature end_signature

the address range to print is begin_signature <= address < end_signature

— Reply to this email directly, view it on GitHub https://github.com/riscv-non-isa/riscv-arch-test/issues/312#issuecomment-1426922439, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJX472NKYD6UL3KJFT3WXBA4DANCNFSM6AAAAAAUYKI6D4 . You are receiving this because you commented.Message ID: @.***>

allenjbaum commented 1 year ago

IF the fix is merely to make the rvtest_sig_begin/end label to be global, that is easily fixed.

jrtc27 commented 1 year ago

No, the problem is that rvtest_sig_begin/end and begin/end_signature are not the same thing, they cover different ranges, and both signature producers (in this case, Sail and the DUT) need to agree on which to use otherwise different ranges get dumped. Sail is using the latter, the DUT was using the former.

But yes, if rvtest_sig_begin/end are the ones to use then they should be global, otherwise things like ld -x will delete them. And then I'd ask why both need to exist; having multiple things that mean different things but have names that sound like they're the same thing is actively unhelpful.

pawks commented 1 year ago

No, the problem is that rvtest_sig_begin/end and begin/end_signature are not the same thing, they cover different ranges, and both signature producers (in this case, Sail and the DUT) need to agree on which to use otherwise different ranges get dumped. Sail is using the latter, the DUT was using the former.

The .align after the rvtest_sig_end has been removed in the latest sail-plugin here. In both the dut and reference the signature region is the region between the addresses equivalent to rvtest_sig_begin/end. I say equivalent because it is legal for the models to define their own labels. In this case the problem was that there was a .align after the rvtest_sig_end in the Model macros(possibly due to an obsolete header file).

But yes, if rvtest_sig_begin/end are the ones to use then they should be global, otherwise things like ld -x will delete them.

The label names can be anything. If the DUT indeed uses them, declaring them as global in the RVMODEL_DATA_BEGIN macro will achieve the intended result.

And then I'd ask why both need to exist; having multiple things that mean different things but have names that sound like they're the same thing is actively unhelpful.

The rvtest_data_begin/end were added because the trap handler needed to identify the starting and ending of the signature region. The begin/end_signature is entirely model specific.

jrtc27 commented 1 year ago

No, the problem is that rvtest_sig_begin/end and begin/end_signature are not the same thing, they cover different ranges, and both signature producers (in this case, Sail and the DUT) need to agree on which to use otherwise different ranges get dumped. Sail is using the latter, the DUT was using the former.

The .align after the rvtest_sig_end has been removed in the latest sail-plugin here.

Then what's https://gitlab.com/incoresemi/riscof-plugins/-/blob/master/sail_cSim/env/model_test.h#L31? That comes between rvtest_sig_end and end_sigature.

pawks commented 1 year ago

I thought a previous PR had fixed it, but looks like it addressed only the spike plugin. It shouldn't be there. PR raised to fix it.

algrobman commented 1 year ago

Guys, please, make sure that users knows how to print the test signature for a DUT. It was not clear for me, what ELF symbols to use.

allenjbaum commented 1 year ago

That will be in the WIP TestUsersSpec document at the latest. The issue is that the existing spec only suggests a label name, but doesn't mandate it, and the trap handler needed a lot of augmentation to be able to deal with privilege mode transitions, so needed specific labels, and we chose something with an rvtest prefix. All tests have those labels as global, but documentation hasn't caught up.

On Tue, Feb 14, 2023 at 9:30 AM algrobman @.***> wrote:

Guys, please, make sure that users knows how to print the test signature for a DUT. It was not clear for me, what ELF symbols to use.

— Reply to this email directly, view it on GitHub https://github.com/riscv-non-isa/riscv-arch-test/issues/312#issuecomment-1430118579, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJWALENTZQPKUX62AITWXO6JTANCNFSM6AAAAAAUYKI6D4 . You are receiving this because you commented.Message ID: @.***>