I think there's reason to merge the L0 and L1 levels for the sake of simplicity. Both can be seen as fixed latency with L0 having zero latency. There are currently some key differences in the spec but I'm not sure all of them are relevant.
L0 is considered stateless
The following simplified (and untested) module is a zero-latency CFU with state.
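A minimal sketch of what such a module could look like (signal names like req_valid and resp_data are assumptions loosely modeled on CFU-style interfaces, not taken from the spec):

```verilog
// Hypothetical zero-latency CFU with state: the response is combinational,
// driven from a register holding a precalculated value, while the state
// itself updates on the clock edge. Untested sketch.
module cfu_zero_latency_stateful (
  input         clk,
  input         reset,
  input         req_valid,
  input  [31:0] req_data,
  output [31:0] resp_data
);
  reg [31:0] acc;

  // Zero latency: the result was precalculated on earlier cycles and is
  // available in the same cycle the request arrives.
  assign resp_data = acc;

  // Stateful part: the accumulator updates on the clock edge, so the next
  // request sees a response that includes this one's contribution.
  always @(posedge clk) begin
    if (reset)
      acc <= 32'd0;
    else if (req_valid)
      acc <= acc + req_data;
  end
endmodule
```

The point is that "zero latency" constrains only the request-to-response path; nothing stops the module from also holding state that evolves over time.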
I think that reflects a potential real use case as well where something is precalculated in the CFU to minimize latency.
L0 has no clock
Personally, I think the clock should be outside the spec, but my reasons are probably more a matter of taste than technical. If clocks remain in the spec, clk, and the closely coupled clk_en, could be made optional.
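One way to picture making clk and clk_en optional (the HAS_CLK macro and signal names here are assumptions for illustration, not anything from the spec):

```verilog
// Hypothetical sketch: clk/clk_en exist only when the CFU has state.
// A purely combinational (L0-style) build simply omits them.
module cfu_optional_clock (
`ifdef HAS_CLK
  input         clk,
  input         clk_en,   // gates all state updates when low
`endif
  input  [31:0] req_data,
  output [31:0] resp_data
);
`ifdef HAS_CLK
  reg [31:0] held;
  assign resp_data = req_data ^ held;  // response mixes in held state
  always @(posedge clk)
    if (clk_en) held <= req_data;      // clk_en gates the only register
`else
  // Stateless, clockless variant: pure combinational logic.
  assign resp_data = req_data ^ 32'h5a5a5a5a;
`endif
endmodule
```

With this framing, the clock is a property of a particular implementation rather than of the interface level, which is roughly what "outside the spec" would mean in practice.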
Conclusion
I think the above irons out all the essential differences between L0 and L1. Now, one could argue that L0 gives the requestor less time to respond, but with a fixed latency it's always the responsibility of the requestor to make sure it can handle a response it has requested at the time it comes back.
The thing I'm less certain about is whether this introduces routing problems with shared responders, but I can't see how that would work in any fixed-latency setting either. So perhaps that's a non-issue too.