OpenXiangShan / NEMU

Other
236 stars 90 forks source link

fix(mmu): do not truncate high addr bits when ifetch exceptions occur #439

Closed Tang-Haojin closed 1 month ago

Tang-Haojin commented 1 month ago

Handling of memory access exceptions of invalid high address bits will be done later.

poemonsense commented 1 month ago

Handling of memory access exceptions of invalid high address bits will be done later.

Comment here to clarify the behaviors.

The handling of out-of-range memory accesses includes two steps, i.e., detecting and reporting.

Full address is required for detecting. As you said, invalid high address must be detected as an access-fault exception.

However, for reporting, it is legal to convert some addresses to other legal values when updating xtval, resulting in different xtval and actual virtual address values. This is defined in the ISA which allows the xtval registers to hold only legal values, where the definition of legal is implementation-defined.

Thus, using a 39-bit xtval register is legal and writing only 39 bits (and dropping the upper bits) is also legal.

It is also legal to remove this 39-bit feature. However, I doubt the necessity for XiangShan to use 64-bit xtval.

Tang-Haojin commented 1 month ago

Handling of memory access exceptions of invalid high address bits will be done later.

Comment here to clarify the behaviors.

The handling of out-of-range memory accesses includes two steps, i.e., detecting and reporting.

Full address is required for detecting. As you said, invalid high address must be detected as an access-fault exception.

However, for reporting, it is legal to convert some addresses to other legal values when updating xtval, resulting in different xtval and actual virtual address values. This is defined in the ISA which allows the xtval registers to hold only legal values, where the definition of legal is implementation-defined.

Thus, using a 39-bit xtval register is legal and writing only 39 bits (and dropping the upper bits) is also legal.

It is also legal to remove this 39-bit feature. However, I doubt the necessity for XiangShan to use 64-bit xtval.

Simply truncating is not valid. An invalid address can be converted to another invalid address only.

It is a bit confusing when we switch between Sv39 and Sv48 if we only store the valid bits. For simplicity, we just store the full address. This will not cost much more resources by means of upcoming XS implementation.

poemonsense commented 1 month ago

Simply truncating is not valid. An invalid address can be converted to another invalid address only.

This is true. However, "legal" includes both valid and invalid address. The "legal" is defined by the CSR, while the "valid" depends on the memory attributes.

There is no need to maintain the 64-bit address because this requires a 64-bit datapath in both frontend and lsu to maintain any in-flight address before all related exceptions are resolved and cleared.

You may refer to the rocket-chip to learn how we can avoid the requirement of 64-bit and use less bits for xtval.

Tang-Haojin commented 1 month ago

Simply truncating is not valid. An invalid address can be converted to another invalid address only.

This is true. However, "legal" includes both valid and invalid address. The "legal" is defined by the CSR, while the "valid" depends on the memory attributes.

There is no need to maintain the 64-bit address because this requires a 64-bit datapath in both frontend and lsu to maintain any in-flight address before all related exceptions are resolved and cleared.

You may refer to the rocket-chip to learn how we can avoid the requirement of 64-bit and use less bits for xtval.

Thank! I still doubt if the implementation of RC is legal.

In riscv-privileged it says

It need not be capable of holding all possible invalid addresses. Prior to writing mtval, implementations may convert an invalid address into some other invalid address that mtval is capable of holding.

But in RC, it convert invalid addresses to 0.U, which is definitely a valid address in Sv39, Sv48, Sv39x4, Sv48x4, Bare, etc.

csr.io.tval := Mux(tval_valid, encodeVirtualAddress(wb_reg_wdata, wb_reg_wdata), 0.U)

Actually in the original design in XS, when a jump with invalid target is detected in XS, we tell frontend to raise an exception and in backend, we write (1 << 63).U to xtval / xepc.

The reason why we decide to store the complete value of xtval is that we found that we only need few costs to accomplish this. We maintained a single 64-bit register to store the oldest invalid jump target and when the exception is raised, we can use the value to obtain the full invalid address.

I think storing complete address is better if we can accomplish it with low or nearly no cost. First, "valid address" is ambiguous because a valid address in Sv48 or Sv48x4 may be invalid in Sv39. Behavior in such cases is not straightforward. Second, the behavior of mtval2 is different with mtval, which may bring verification complexity and confusion. in riscv-privileged it says:

mtval2 is a WARL register that must be able to hold zero and may be capable of holding only an arbitrary subset of other 2-bit-shifted guest physical addresses, if any.

That means, simply truncating the bad address is legal in mtval2, while we should not convert an invalid address to another address like what xtvals do.

poemonsense commented 1 month ago

But in RC, it convert invalid addresses to 0.U, which is definitely a valid address in Sv39, Sv48, Sv39x4, Sv48x4, Bare, etc.

csr.io.tval := Mux(tval_valid, encodeVirtualAddress(wb_reg_wdata, wb_reg_wdata), 0.U)

It does not convert invalid address to 0.U. The tval_valid means there should be one tval (not for valid/invalid addresses). Instead, the encodeVirtualAddress function converts the invalid addres by some rules. I don't remember the exact output but you may read the detailed function body for details. Maybe something similar as {invalid_bit, lower 39 bits}.

The reason why we decide to store the complete value of xtval is that we found that we only need few costs to accomplish this. We maintained a single 64-bit register to store the oldest invalid jump target and when the exception is raised, we can use the value to obtain the full invalid address.

Correct. Jump targets can be classified as valid/invalid easily. I remembered I told YuanGod the similar approach. However, LSU may not be exactly the same because the exception cannot be resolved until the memory request finishes. To maintain 64-bit full addresses, any inflight memory request should be remembered. For example, for L1 cache misses, given the access-fault may occur after AXI responses, do you need to maintain the 64-bit addresses in load queues? I don't know the details of KMH, but I think these cases worth thinking.

I think storing complete address is better if we can accomplish it with low or nearly no cost.

Agree if it can be accomplished with a low overhead.

Tang-Haojin commented 1 month ago

The reason why we decide to store the complete value of xtval is that we found that we only need few costs to accomplish this. We maintained a single 64-bit register to store the oldest invalid jump target and when the exception is raised, we can use the value to obtain the full invalid address.

Correct. Jump targets can be classified as valid/invalid easily. I remembered I told YuanGod the similar approach. However, LSU may not be exactly the same because the exception cannot be resolved until the memory request finishes. To maintain 64-bit full addresses, any inflight memory request should be remembered. For example, for L1 cache misses, given the access-fault may occur after AXI responses, do you need to maintain the 64-bit addresses in load queues? I don't know the details of KMH, but I think these cases worth thinking.

YuanGod told me that now Memblock maintains a single-entry exception queue, which means it can also store the invalid address with a low cost. He will be in charge of memory-access part of this issue.

poemonsense commented 1 month ago

YuanGod told me that now Memblock maintains a single-entry exception queue, which means it can also store the invalid address with a low cost. He will be in charge of memory-access part of this issue.

This was true. However, recently I heard KMH can handle AXI bus errors. This will delay the exception resolve to AXI responses. I doubt a single-entry is enough.