openhwgroup / core-v-xif

RISC-V eXtension interface that provides a generalized framework suitable to implement custom coprocessors and ISA extensions
https://docs.openhwgroup.org/projects/openhw-group-core-v-xif/en

Updating definition of commit interface #83

Closed christian-herber-nxp closed 5 months ago

christian-herber-nxp commented 8 months ago

This is an attempt to generalize the commit interface, which would also enable a flush signal in simple configurations.

jklockars commented 8 months ago

Is it really necessary to support multiple kill/commit in one cycle?

The instructions will obviously need to be tracked in the normal pipeline to eventually be killed or committed. And since the interface only supports the issue of one instruction at a time, it would not seem unreasonable to also only allow for one instruction to be killed/committed at a time. This will, of course, require the normal pipeline to keep track of which instructions need to signal that they have been killed rather than committed at the appropriate stage.

When only a single kill/commit per cycle is possible, it is enough for each coprocessor pipeline stage to have the equivalent of `if stage_valid and kill then stage_valid := (kill_id != stage_id) end if` and the same kind of thing for "committed".
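
For illustration, a minimal SystemVerilog sketch of that per-stage check, assuming one kill/commit per cycle; the module, signal, and field names (`xif_stage_sketch`, `kill_valid_i`, `kill_id_i`, `commit_valid_i`, `commit_id_i`) are illustrative and not taken from the CORE-V-XIF specification:

```systemverilog
// A hypothetical single-instruction pipeline stage; one kill or commit
// transaction per cycle is assumed, so a plain ID equality check suffices.
module xif_stage_sketch (
  input  logic       clk_i,
  input  logic       rst_ni,
  input  logic       kill_valid_i,    // a kill transaction is presented
  input  logic [3:0] kill_id_i,       // ID being killed
  input  logic       commit_valid_i,  // a commit transaction is presented
  input  logic [3:0] commit_id_i      // ID being committed
);

  typedef struct packed {
    logic       valid;      // an instruction occupies this stage
    logic       committed;  // the main pipeline has committed it
    logic [3:0] id;         // ID received at issue
  } stage_t;

  stage_t stage_q;

  always_ff @(posedge clk_i or negedge rst_ni) begin
    if (!rst_ni) begin
      stage_q <= '0;
    end else begin
      // (normal issue / pipeline-advance logic elided)
      if (kill_valid_i && stage_q.valid && (kill_id_i == stage_q.id))
        stage_q.valid <= 1'b0;        // drop the killed instruction
      if (commit_valid_i && stage_q.valid && (commit_id_i == stage_q.id))
        stage_q.committed <= 1'b1;    // results may now be written back
    end
  end

endmodule
```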

If multiple kill/commit is possible, you would instead need some kind of check where the stage ID is compared against both the latest issued ID and the currently killed/committed one (since the IDs will obviously need to wrap around). While this is certainly doable, it will be significantly more complicated than just comparing two IDs.

Addition: I forgot that the current interface does not define the IDs as being sequential at all, which would then force the coprocessor to create its own internally ordered tags and then map incoming kill/commit IDs to those. Further complicating things in the case of multiple kill/commit per cycle.
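
For comparison, a sketch of the range check that would be needed if one kill transaction could cover several in-flight instructions at once. It assumes sequential, wrapping issue IDs (which, as noted above, the current interface does not guarantee) and fewer instructions in flight than the size of the ID space; the function and argument names are illustrative:

```systemverilog
// True when stage_id lies in the circular range [kill_id .. issue_id],
// i.e. the instruction was issued at or after the killed one. The 4-bit
// unsigned subtractions handle the wrap-around, provided the number of
// in-flight IDs never exceeds the ID space (here 16).
function automatic logic in_kill_range(
  input logic [3:0] stage_id,  // ID held by a coprocessor stage
  input logic [3:0] kill_id,   // oldest ID being killed
  input logic [3:0] issue_id   // most recently issued ID
);
  return (stage_id - kill_id) <= (issue_id - kill_id);
endfunction
```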

christian-herber-nxp commented 8 months ago

> Is it really necessary to support multiple kill/commit in one cycle?
>
> The instructions will obviously need to be tracked in the normal pipeline to eventually be killed or committed. And since the interface only supports the issue of one instruction at a time, it would not seem unreasonable to also only allow for one instruction to be killed/committed at a time. This will, of course, require the normal pipeline to keep track of which instructions need to signal that they have been killed rather than committed at the appropriate stage.
>
> When only a single kill/commit per cycle is possible, it is enough for each coprocessor pipeline stage to have the equivalent of `if stage_valid and kill then stage_valid := (kill_id != stage_id) end if` and the same kind of thing for "committed".
>
> If multiple kill/commit is possible, you would instead need some kind of check where the stage ID is compared against both the latest issued ID and the currently killed/committed one (since the IDs will obviously need to wrap around). While this is certainly doable, it will be significantly more complicated than just comparing two IDs.

Well, I would say this change makes it simpler for the CPU and more complex for the coprocessor (i.e. you could still serialize the commit signals internally). On the other hand, the proposed change would potentially allow a faster flush of the pipeline, whereas before you would need N cycles to kill off N instructions.

jklockars commented 8 months ago

I made a late addition to my previous comment, which I am including here in case you missed it:

"I forgot that the current interface does not define the ID:s as being sequential at all, which would then force the coprocessor to create its own internally ordered tags and then map incoming kill/commit ID:s to those. Further complicating things in the case of multiple kill/commit per cycle."

As you say, the coprocessor could indeed serialise things itself, but that could get really "interesting" if the main pipeline keeps killing/committing things while those serialised kills/commits are happening.

christian-herber-nxp commented 8 months ago

> I made a late addition to my previous comment, which I am including here in case you missed it:
>
> "I forgot that the current interface does not define the IDs as being sequential at all, which would then force the coprocessor to create its own internally ordered tags and then map incoming kill/commit IDs to those. Further complicating things in the case of multiple kill/commit per cycle."
>
> As you say, the coprocessor could indeed serialise things itself, but that could get really "interesting" if the main pipeline keeps killing/committing things while those serialised kills/commits are happening.

You will not get back-to-back pipeline flushes. After a pipeline flush, another instruction will first have to reach the end of the pipeline before it can trigger the next one, so you will have a clear time window to finish the killing.

jklockars commented 8 months ago

Regarding back-to-back pipeline flushes, I may indeed have been overly pessimistic there. ;-)

If only one instruction can be issued to the coprocessor per cycle, you should indeed have time to also kill them all. But then again, there would also be time for the main pipeline to do it explicitly for each ID (and there would no longer be any worry about internal ordering of random IDs in the coprocessor).

christian-herber-nxp commented 8 months ago

I agree that things will get complex once you have a coprocessor with a complex pipeline that goes out of order and executes multiple instructions in parallel. But then, even with a commit transaction per instruction, it is non-trivial to relate it to the right instruction. For simple pipelines, it will be easy either way.

jklockars commented 8 months ago

Anyway, is potentially faster flushing of the coprocessor pipeline a significant issue? You can still only issue one instruction per cycle, and things will usually progress one stage at a time in the coprocessor.

For our pipelined RISC-V FPU I did consider making kills automatically affect everything issued later (I believe we did that in our SPARC FPU), but decided against it. It seemed to complicate things for very little gain (the FPU actually does skip pipeline stages when possible if there is nothing "in the way", so it could have made a bit of difference).

The FPU does divide and square root in a separate (iterative) unit, and FPU->integer instructions have their own pipeline, so there is out-of-order completion (things can actually also get out-of-order in the main FPU pipeline, which handles all instructions with FPU destination, due to stage skipping). With one kill/commit per cycle it is really simple to keep this working correctly. It would be decidedly unpleasant to do multiple kill/commit of random IDs per cycle...

christian-herber-nxp commented 8 months ago

> Anyway, is potentially faster flushing of the coprocessor pipeline a significant issue? You can still only issue one instruction per cycle, and things will usually progress one stage at a time in the coprocessor.

If you are killing one instruction per cycle, it could be that an instruction has not been killed yet, which will cause a pipeline stall, especially for instructions sitting in iterative units.

> For our pipelined RISC-V FPU I did consider making kills automatically affect everything issued later (I believe we did that in our SPARC FPU), but decided against it. It seemed to complicate things for very little gain (the FPU actually does skip pipeline stages when possible if there is nothing "in the way", so it could have made a bit of difference).
>
> The FPU does divide and square root in a separate (iterative) unit, and FPU->integer instructions have their own pipeline, so there is out-of-order completion (things can actually also get out-of-order in the main FPU pipeline, which handles all instructions with FPU destination, due to stage skipping). With one kill/commit per cycle it is really simple to keep this working correctly. It would be decidedly unpleasant to do multiple kill/commit of random IDs per cycle...

I can sympathize with what you are stating. We just need to find a common solution.

jklockars commented 8 months ago

Yes, iterative units may indeed be subject to some unnecessary issue stall cycles, when the specific killed instruction was not actually in that unit.

Out of interest, what would be the circumstances under which it would be difficult to relate a single commit to the correct instruction in a coprocessor pipeline? Propagation time, perhaps? Otherwise (as in my case, where this is not timing critical), it seems simple enough to just have all the pipeline stages (and non-pipelined units) check their current instruction IDs against the new commit ID and mark when it is the same.
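
As a small illustration, the same ID comparison in a non-pipelined (iterative) unit such as a divider could look roughly like this; again a hedged sketch with made-up names, not the specification's interface:

```systemverilog
// A hypothetical iterative unit (e.g. a divider) holding one instruction:
// remember the issue ID, then mark the result committed or abandon it when
// the matching single commit/kill transaction is observed.
module xif_divider_sketch (
  input  logic       clk_i,
  input  logic       kill_valid_i,
  input  logic [3:0] kill_id_i,
  input  logic       commit_valid_i,
  input  logic [3:0] commit_id_i
);

  logic       div_busy_q;       // an instruction is in flight in the divider
  logic       div_committed_q;  // its result is safe to write back
  logic [3:0] div_id_q;         // ID it was issued with (set at issue, not shown)

  always_ff @(posedge clk_i) begin
    if (commit_valid_i && div_busy_q && (commit_id_i == div_id_q))
      div_committed_q <= 1'b1;  // result may be written back when ready
    if (kill_valid_i && div_busy_q && (kill_id_i == div_id_q))
      div_busy_q <= 1'b0;       // abandon the iteration
  end

endmodule
```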

christian-herber-nxp commented 8 months ago

> Yes, iterative units may indeed be subject to some unnecessary issue stall cycles, when the specific killed instruction was not actually in that unit.
>
> Out of interest, what would be the circumstances under which it would be difficult to relate a single commit to the correct instruction in a coprocessor pipeline? Propagation time, perhaps? Otherwise (as in my case, where this is not timing critical), it seems simple enough to just have all the pipeline stages (and non-pipelined units) check their current instruction IDs against the new commit ID and mark when it is the same.

Ah yes, you are totally right. That will work fine. If you have a single commit transaction to flush, you will have to do much the same, but signal the next good ID (the one that will be issued next) to each unit and only stop killing once you see that ID or something greater. That would seem reasonably easy to me.
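
A rough per-unit sketch of that "kill until the next good ID" idea follows. It assumes sequential, wrapping IDs with an ID space at least twice the maximum number of instructions in flight (which, as the following comments point out, the current document does not define), and all module and signal names are illustrative:

```systemverilog
// Per-unit flush tracking: after a flush request, keep dropping any held
// instruction older than next_good_id until an ID equal to or newer than
// it is seen. The signed modular distance decides "older", which only
// works while the in-flight window stays below half of the ID space.
module xif_flush_sketch (
  input  logic       clk_i,
  input  logic       flush_i,         // main pipeline requests a flush
  input  logic [3:0] next_good_id_i   // first ID that must survive
);

  logic       flushing_q;      // a flush is still being applied to this unit
  logic [3:0] next_good_id_q;
  logic       unit_valid_q;    // this unit currently holds an instruction
  logic [3:0] unit_id_q;       // ID of that instruction (set at issue, not shown)

  function automatic logic is_older(input logic [3:0] a, input logic [3:0] b);
    return signed'(a - b) < 0;
  endfunction

  always_ff @(posedge clk_i) begin
    if (flush_i) begin
      flushing_q     <= 1'b1;
      next_good_id_q <= next_good_id_i;
    end else if (flushing_q && unit_valid_q &&
                 !is_older(unit_id_q, next_good_id_q)) begin
      flushing_q <= 1'b0;      // the next good ID (or something newer) arrived
    end
    if (flushing_q && unit_valid_q && is_older(unit_id_q, next_good_id_q))
      unit_valid_q <= 1'b0;    // drop anything older than the boundary
  end

endmodule
```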

Silabs-ArjanB commented 8 months ago

Hi @christian-herber-nxp, I see a comment above that there will be no back-to-back pipeline flushing. I don't agree with that. We should allow it (and it can easily happen, for example because of a branch causing a flush immediately followed by a taken interrupt causing another flush).

jklockars commented 8 months ago

> If you have a single commit transaction to flush, you will have to do much the same, but signal the next good ID (the one that will be issued next) to each unit and only stop killing once you see that ID or something greater. That would seem reasonably easy to me.

But the current document says that the IDs are "random", so there is no way to know a next good ID.

Why not specify incrementing IDs (not necessarily all reaching the coprocessor, but at least ordered)? Of course, to be useful they must also be guaranteed to be large enough that no new ID can ever be the same as one already in use.

christian-herber-nxp commented 8 months ago

> If you have a single commit transaction to flush, you will have to do much the same, but signal the next good ID (the one that will be issued next) to each unit and only stop killing once you see that ID or something greater. That would seem reasonably easy to me.

> But the current document says that the IDs are "random", so there is no way to know a next good ID.

I did not realize that random IDs are a thing. @Silabs-ArjanB: what is the rationale?

> Why not specify incrementing IDs (not necessarily all reaching the coprocessor, but at least ordered)? Of course, to be useful they must also be guaranteed to be large enough that no new ID can ever be the same as one already in use.

Agree. Incrementing, wrapping IDs should be the requirement.

Silabs-ArjanB commented 8 months ago

Incrementing, wrapping IDs are fine as well, as long as there is no guarantee that a coprocessor will get to see 'increments by 1'. I don't see how this eases the problem of finding the next good ID though.

jklockars commented 8 months ago

> Incrementing, wrapping IDs are fine as well, as long as there is no guarantee that a coprocessor will get to see 'increments by 1'. I don't see how this eases the problem of finding the next good ID though.

I agree that it does not seem to help. And for an out-of-order coprocessor pipeline you cannot necessarily stop killing operations even if you see the "known good ID", since it might have passed by something that should be killed.