Closed mndstrmr closed 4 months ago
Acknowledged - this is a known weakness in RTL and I semi-intentionally deferred it till we have a good way to verify. Since we now have the ability to check I will try to fix that in the next week and we can revisit.
I have finally got the chance to clean things up on the exception priority side. Please see the latest commit (f3570e5). If we can start verifying RTL against sail it would be great, thanks.
Started verifying the fixes for memory instructions. Looks largely OK, but I've run into a couple of issues:
vio_cause_enc
the bound_vio
check should be last instead of first.E_Fetch_Addr_Align
exceptions, instead of EXC_CHERI
, with priority lower even than bounds checks.Thanks - will check in the fix soon.
From: Louis-Emile Ploix @.> Sent: Saturday, January 20, 2024 9:20 AM To: microsoft/cheriot-ibex @.> Cc: Comment @.>; Subscribed @.> Subject: Re: [microsoft/cheriot-ibex] Exception priorities (Issue #11)
Started verifying the fixes for memory instructions. Looks largely OK, but I've run into a couple of issues:
- Reply to this email directly, view it on GitHubhttps://github.com/microsoft/cheriot-ibex/issues/11#issuecomment-1902181245 or unsubscribehttps://github.com/notifications/unsubscribe-auth/A3V7IMBBSRTB6JLQNZYNVFTYPP4FBBFKMF2HI4TJMJ2XIZLTSOBKK5TBNR2WLJDUOJ2WLJDOMFWWLO3UNBZGKYLEL5YGC4TUNFRWS4DBNZ2F6YLDORUXM2LUPGBKK5TBNR2WLJDUOJ2WLJDOMFWWLLTXMF2GG2C7MFRXI2LWNF2HTAVFOZQWY5LFUVUXG43VMWSG4YLNMWVXI2DSMVQWIX3UPFYGLLDTOVRGUZLDORPXI6LQMWWES43TOVSUG33NNVSW45FGORXXA2LDOOJIFJDUPFYGLKTSMVYG643JORXXE6NFOZQWY5LFVE2TSMBRGM3DKOBZQKSHI6LQMWSWS43TOVS2K5TBNR2WLKRRHEYTGMRYGU3TSMFHORZGSZ3HMVZKMY3SMVQXIZI. You are receiving this email because you commented on the thread.
Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Checked in the fix (commit [5c37f9a]). Please verify, thanks.
Fix looks good, there's just one more problem, then I can am happy to close this issue:
In #12 I suggested that the is_cap
case needed top_equal
during bounds checking. This does lead to correct code in the sense that illegal memory loads are prevented and legal ones aren't. What I had not realised however is that it does lead to incorrect exception priorities (my apologies about that - I think you alluded to this in your response). In particular, in the Sail a bounds check failing has higher priority than a misalignment error, i.e. bounds checking should happen on the potentially misaligned address, not the aligned address, so that in the event that both errors occur simultaneously we get the bounds check error instead of the alignment error.
From line 838 in cheri_ex.sv
, the below fix would close this issue:
if (cheri_operator_i[CIS_SUBSET])
top_chkaddr = rf_fullcap_b.top33;
else if (cheri_operator_i[CJAL] | cheri_operator_i[CJALR])
top_chkaddr = {base_chkaddr[31:1], 1'b0};
else if (is_cap) // CLC/CSC
top_chkaddr = base_chkaddr + 33'd8; // Don't assume alignment
else
top_chkaddr = base_chkaddr;
if (cheri_operator_i[CSEAL] | cheri_operator_i[CUNSEAL]) begin
top_bound = rf_fullcap_b.top33;
base_bound = rf_fullcap_b.base32;
end else if (cheri_operator_i[CJAL]) begin
top_bound = {pcc_cap_i.top33[32:1], 1'b0};
base_bound = pcc_cap_i.base32;
end else if (cheri_operator_i[CJALR]) begin
top_bound = {rf_fullcap_a.top33[32:1], 1'b0};
base_bound = rf_fullcap_a.base32;
end else begin // Includes CLC/CSC
top_bound = rf_fullcap_a.top33;
base_bound = rf_fullcap_a.base32;
end
top_vio = (top_chkaddr > top_bound);
base_vio = (base_chkaddr < base_bound);
top_equal = (top_chkaddr == top_bound);
if (debug_mode_i)
addr_bound_vio = 1'b0;
else if (is_cap | cheri_operator_i[CIS_SUBSET])
addr_bound_vio = top_vio | base_vio;
else if (cheri_operator_i[CJAL] | cheri_operator_i[CJALR])
addr_bound_vio = top_vio | base_vio | top_equal;
else if (cheri_operator_i[CSEAL] | cheri_operator_i[CUNSEAL])
addr_bound_vio = top_vio | base_vio | top_equal;
@rmn30 , is there a particular reason the address bound violation has to take priority over alignment errors? This check is in the critical timing path for the load/store interface (since it has to gate the outgoing memory request), and I'd like to avoid additional adders..
I would expect alignment errors to be higher priority because you may wish to trap and emulate on alignment failures.
@mndstrmr , the new RTL commit (ad90fc7) added separate logic (addr_bound_vio_ext) to address mcause/mtval encoding for this specific case. Please let me know if it verifies. thanks.
This looks good to me. There are a few more issues with exceptions that I have since found:
base[8] = 1
and exponent 24 (a combination which is reachable by doing a CSetHigh
, for instance) then when an exception is thrown MTCC is decompressed to a pcc_cap_t
, and the top bit of the base is lost. This is fine, but if the PCC is later stored then the compressed bounds will have base[8] = 0
which is incorrect according to the Sail which always stores the compressed form. My fix is to add an additional 1 bit field to pcc_cap_t
which stores what base[8]
would be. This is written when doing a full2pcap
and retrieved when doing a pcc2fullcap
.is_cap_sentry
(used by PVIO_SEAL
for CJALR
) operates on the compressed otype, but should operate on the decompressed otype.StoreCapImm
runs, ls_addr_misaligned_only
is set when perm_vio_vec[PVIO_ALIGN] && (cheri_err_cause == 0)
. If perm_vio[PVIO_SLC]
is set (whether in debug mode or not) then the cheri_err_cause
will be nonzero, and the exception will reported as a store local error instead of an alignment error, even in debug mode.Thanks, and apologies for the issue dump.
Thanks, all good findings.
plat_mtval_has_illegal_inst_bits
) which I was assuming was 1
because in your regular handling of illegal instructions (ibex_controller.sv
line 735
) you do set MTVAL to the instruction that was illegal. So I guess it's up to you which one you want, but they should be consistent with one another. It would be easier to set csr_mtval_o = 0
in ibex_controller.sv
line 737
.CGetHigh
instruction will still see a difference between the Sail and the RTL.perm_vio_vec[PVIO_ALIGN]
is set, but then the exception type is not reported as an alignment exception because perm_vio[PVIO_SLC]
is set.I also have one more issue with CSpecialRW: If an illegal scr address is accessed while the PCC also doesn't have access_system_regs
, currently the ASR exception is thrown instead of the illegal instruction exception. The Sail says it should be the other way around. The following fixes cheri_ex.sv
line 1015
:
if (cheri_operator_i[CCSR_RW] & cheri_wb_err_raw & illegal_scr_addr & cheri_exec_id_i)
// cspecialrw trap, illegal addr, treated as illegal_insn
cheri_wb_err_info_d = {3'h0, 1'b1, 12'h0};
else if (cheri_operator_i[CCSR_RW] & cheri_wb_err_raw & perm_vio & cheri_exec_id_i)
// cspecialrw traps, PERM_SR
cheri_wb_err_info_d = {5'h0, 1'b1, cheri_cs2_dec_i, cheri_err_cause};
Regarding #1, I think in our case we actually have bool rv_mtval_has_illegal_inst_bits = false; In RTL (ibex_controller.sv), it's actually handled at line 810 (basically I chose to decode illegal_scr adddress in cheri_ex rather than decoder, in an attempt to reduce fan-in to illegal_insn which goes to too many places). cheri_wb_err_prio: begin if (cheri_pmode_i) begin if (cheri_wb_err_info_i[12]) begin // illegal SCR addr exc_cause_o = EXC_CAUSE_ILLEGAL_INSN; csr_mtval_o = {21'h0, cheri_wb_err_info_i[10:0]}; end else begin exc_cause_o = EXC_CAUSE_CHERI_FAULT; csr_mtval_o = {21'h0, cheri_wb_err_info_i[10:0]}; end end end So I think RTL is actually matching sail..
I will to take a look at #4 and #5 later, it make take a little while since I'll be out of office for a few days.
mtval_has_illegal_inst_bits = 0
is fine, but in that case these lines also need to be updated to reflect that:
Both that and the code at line 810 refer to the same Sail handle_illegal
function, meaning they both need to do the same thing.
If the code at line 810 is to remain the same, the code at line 735 should look like the below:
illegal_insn_prio: begin
exc_cause_o = EXC_CAUSE_ILLEGAL_INSN;
csr_mtval_o = 0;
end
I see what you mean - yes we should make those 2 cases consistent. Likely will make line 737 to set mtval = 0 if pmode = 1. @rmn30, can you confirm we indeed what to go with mtval_has_illegal_inst_bits = 0?
@mndstrmr I have checked in RTL changes (30cac29) which hopefully fixes issue #1, #3, #4 and #5. As for #2, I have confirmed with @rmn30 that the base[8]=1, exp=24 is a don't care case (not reachable). Can we change the constraints to reflect that? Thanks.
Yes, it should not be possible to create a tagged capability with base[8]=1 and exp=24 (i.e. base=2**32, which is greater than the 32-bit base used in csetbounds). If an untagged capability with such base is installed in MTCC when an exception / interrupt is taken this will result in an unrecoverable exception loop, so we don't really care about this case.
This is all great, thanks. Easiest update I've made in a while.
As pointed out to us before the exception priorities of CJAL, CJALR, memory instructions and instruction fetch errors are incorrect. I don't know if it is worth me going into too much detail here about each small issue here as there are many of them, most of which are best explained by simply comparing the SV and the Sail.
That said I am more happy to check any fixes as our verification setup does now check CSR writes (including for exceptions).