lowRISC / ibex

Ibex is a small 32 bit RISC-V CPU core, previously known as zero-riscy.
https://www.lowrisc.org
Apache License 2.0

Report a bug in Ibex: when an interrupt arrives while a load/store instruction is in the ID stage, there are two requests for the same load instruction #1339

Closed apollojason closed 3 years ago

apollojason commented 3 years ago

Hi guys: I am using the Ibex core. When an interrupt arrives at the same time as a load instruction is in the ID stage, two further erroneous load operations happen. I didn't configure the writeback stage; the WriteBackStage parameter is 0. The interrupt is irq_fast_i[12]. The interrupt service routine address is 0x3000_0070 (the white frame). You can see the two extra load operations in the yellow frame. I found that "stall_mem" goes high after 'IRQ_TAKEN'. The instruction in if_stage.instr_rdata_id_o[31:0] doesn't update.

Could you help me? How can I fix this bug: in software alone, or with the RISC-V GCC compiler?

[Screenshot: ibex_err]

GregAC commented 3 years ago

Hi @apollojason

From your screenshot I can't tell whether there's any incorrect behaviour. In particular, the instr_addr_o, instr_rdata_i, instr_gnt_i and instr_rvalid_i signals relate to the instructions Ibex is prefetching (or bringing into the cache), not to the currently executing instruction.

It would be useful to see the pc_id_i signal in ibex_id_stage so you can see which instruction is actually in the ID stage.

It looks like the interrupt is taken after the first load is completed, then the interrupt vector begins with a couple of load instructions.

Could you provide a gzipped VCD file? Screenshots of wave traces are often missing things and it's a lot easier if I can just look at it myself.

apollojason commented 3 years ago

ibex.vcd.tar.gz

Hi GregAC: Nice to meet you! You can call me Jason. I have sent you a VCD file. I hope it helps.

Best Regards

Jason_Wu

apollojason commented 3 years ago

Hi @GregAC, if you find any problems, please leave me a message here.

Best Regards

Jason_Wu

GregAC commented 3 years ago

Which Ibex commit are you using? This may be a bug but it may be one we've found and fixed already.

I've taken a look at the VCD. Where your red box is, the instruction at PC 0x3000_0088 is executing, and it is a load. It loads from address 0x3000_0000 and stalls until the data is returned and written to the register file a couple of cycles later (data read is 0xff9f_f06f).

An IRQ appears while this instruction is executing, and Ibex will wait for the load to complete before it handles the IRQ. Ibex does wait for the load to complete, but when it does, instr_valid_clear_o from ibex_id_stage should be asserted whilst halt_if in ibex_controller is asserted. This has the effect of clearing the completed instruction out of the ID stage and preventing the next one from being supplied, which allows the controller state machine to jump to the IRQ handler.

instr_valid_clear_o isn't asserted when the load completes, so things go wrong: the load ends up re-executing and, because the load uses its source register as its destination, the second load goes to a different address. That address is unaligned, so it generates two data bus accesses.

Matching this behaviour against the current RTL in master, this should work fine. In particular, we have a retain_id signal that isn't present in your VCD file. This was added in this commit: https://github.com/lowRISC/ibex/commit/5ecaa11c635b259d644532c081a2c6818740f43c which fixed a very similar sounding issue that affected configurations with the writeback stage enabled. Note that this commit alone may not fix your issue. We don't supply point fixes to older RTL versions, so you'll either need to update to our master branch or produce your own fix.
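To make the re-execution effect concrete, here is a minimal Python sketch. It is not Ibex RTL, only a toy model of one load being executed twice. The load address 0x3000_0000 and the returned data 0xff9f_f06f come from the trace described above; the register number (x15), the zero immediate, and the helper names `bus_accesses` and `load_word` are assumptions made purely for illustration.

```python
MASK32 = 0xFFFF_FFFF


def bus_accesses(addr: int, size: int) -> int:
    """Count the 32-bit aligned bus words touched by a `size`-byte access."""
    first_word = addr // 4
    last_word = (addr + size - 1) // 4
    return last_word - first_word + 1


def load_word(regs, mem, rd, rs1, imm):
    """Toy model of `lw rd, imm(rs1)`: returns (address, bus access count)."""
    addr = (regs[rs1] + imm) & MASK32
    count = bus_accesses(addr, 4)
    # Locations not listed in `mem` read as 0 (an assumption of this sketch).
    regs[rd] = mem.get(addr, 0)
    return addr, count


# Hypothetical setup: x15 is both base and destination register, immediate is 0.
regs = [0] * 32
regs[15] = 0x3000_0000
mem = {0x3000_0000: 0xFF9F_F06F}  # data value taken from the trace

# Intended execution: one aligned access.
first = load_word(regs, mem, rd=15, rs1=15, imm=0)

# Erroneous re-execution: the base register now holds the loaded data,
# so the address has changed and is no longer word aligned.
second = load_word(regs, mem, rd=15, rs1=15, imm=0)

print(f"first : addr=0x{first[0]:08x} accesses={first[1]}")
print(f"second: addr=0x{second[0]:08x} accesses={second[1]}")
```

Under these assumptions the first execution touches one bus word, while the re-executed load targets 0xff9f_f06f, whose two low address bits are non-zero, so a word access straddles two bus words. That matches the two extra data bus accesses seen in the trace.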

GregAC commented 3 years ago

I've done a little more investigation, looking at this scenario in our DV, and I believe the issue isn't present in the current Ibex master. However, where it is present, you may need a decent number of iterations of our tests to see a failure (see #1342).

apollojason commented 3 years ago

Hi @GregAC: I agree with your analysis. I found the same cause in my VCD. I think the key signals are instr_valid_clear_o and instr_valid_id_q; instr_valid_id_q affects stall_mem. The version I am using is commit 'cf33bfeae0a54cf9cf4790655437cde0a1917862'. Do you have any workaround for this version of Ibex, such as a software change or a RISC-V GCC compiler option?

Best Regards

Jason_Wu

apollojason commented 3 years ago

Dear @GregAC: If you have any suggestions, please let me know.

Best Regards

Jason_Wu

GregAC commented 3 years ago

Hi @apollojason, apologies for the slow response.

As I said above, we don't do point fixes to older versions of Ibex, but I've taken a quick look. It is possible that adding the id_wb_pending signal introduced in https://github.com/lowRISC/ibex/commit/50be975226e8d8a16f405e20e3dcd55c4b89a66e might fix it (potentially you can just cherry-pick that commit). This will stop the controller from halting IF and proceeding to IRQ_TAKEN until the load has totally cleared the pipeline.

You might also need the changes in https://github.com/lowRISC/ibex/commit/5ecaa11c635b259d644532c081a2c6818740f43c but I can't say for sure (they're targeted at an issue in the 3-stage config, so perhaps you can ignore these changes).

apollojason commented 3 years ago

Dear @GregAC: Thanks for your help! I will give it a try and report back.

Best Regards

Jason_Wu

rswarbrick commented 3 years ago

Hi there! I'm closing this issue now, since this is fixed on the main development branch. Thank you very much for the report, though: it's the sort of thing that will hopefully drive improvements to our verification work.