Issue Queue Implementation

AaronGChan commented 6 months ago

This PR adds issue queue modeling to Olympia, by allowing users to define the number of issue queues, which execution units map to which issue queues, and which target pipes are supported for each execution unit.

Regressions will fail until we merge scoreboard fix in Sparta. Old flow: dispatch -> executepipe->execute

New flow: dispatch -> issue queue -> executepipe -> execute

danbone commented 5 months ago

https://github.com/riscv-software-src/riscv-perf-model/blob/fe146aedc68daf9ae033515876c133ede07496cb/arches/medium_core.yaml#L18-L25

Could we remove the execution_topology part? seems redundant as the pipetopology*_pipes implicitly defines the number of functional units?

And as a further improvement, we just have a single nested list defining the issue queues and the functional units like:

[ IQ0, [ALU0, [PIPE0, PIPE1], ALU1, [PIPE2]], IQ1 ..

Example:

execution_topology:
[["iq0", 
        ["alu0", ["int", "mul", "i2f", "cmov"]],
        ["alu1", ["int", "div"]],
        ["br0", ["br"]]],
 ["iq1", 
        ["alu2", ["int"]],
        ["br1", ["br"]]]]

Though I'm not sure if the API supports nesting like that.

AaronGChan commented 5 months ago

https://github.com/riscv-software-src/riscv-perf-model/blob/fe146aedc68daf9ae033515876c133ede07496cb/arches/medium_core.yaml#L18-L25

Could we remove the execution_topology part? seems redundant as the pipetopology*_pipes implicitly defines the number of functional units?

And as a further improvement, we just have a single nested list defining the issue queues and the functional units like:

[ IQ0, [ALU0, [PIPE0, PIPE1], ALU1, [PIPE2]], IQ1 ..

Example:
execution_topology:
[["iq0", 
        ["alu0", ["int", "mul", "i2f", "cmov"]],
        ["alu1", ["int", "div"]],
        ["br0", ["br"]]],
 ["iq1", 
        ["alu2", ["int"]],
        ["br1", ["br"]]]]
Though I'm not sure if the API supports nesting like that.

@klingaard @kathlenemagnus thoughts on this/any preference? I'm open to trying something like this because it simplifies the walking of the .yaml to assign executepipes to issue queues, not sure of the C++ aspect of it given it would be a mix of string and vector in a vector. One thought I had would be to do it like this:

execution_topology:
[
[["alu0", "int", "mul", "i2f", "cmov"], //iq0
        ["alu1", "int", "div"],
        ["br0", "br"]],
[["alu2", "int"], // iq1
["br1", "br"]],

[["fpu0", "float", "f2i"], // iq2
["fpu1", "float", "faddsub"]],

]

where we have a triple nested vector, where the first vector is each issue queue, 2nd vector is all the execution units in that issue queue, and the 3rd vector are strings, where the first string is the execution unit type, and the subsequent strings are the pipe targets. The only downside I see of nesting it like this is that it makes it a little unclear how many of each execution units there are due to the nesting, while what we have right now is pretty clear on all the definitions. Open to modifying accordingly to what people have a preference for.

klingaard commented 5 months ago

First, I agree with @danbone, the execution_topology is redundant and really doesn't add anything. Reducing this down to 1 parameter that defines the pipelines nested in a way that can infer the issue queues counts works for me.

I like the suggest you put forth, but unfortunately I don't think the sparta Parameter class supports more than two levels of nesting without modification. Unfortunately this limitation forces a layout that requires at least 2 or more parameters. One that defines the number of pipelines, another that maps the pipelines to an issue queue, and another that maps dispatch queues to issue queues.

Let me see if I can elaborate it with some general use cases using what is supported.

Mapping of a 1 -> 1 -> 1

6 Dispatch queue per issue queue
6 Issue queues

6 Pipes


pipelines:
[
["int", "mul", "i2f", "cmov"], // pipe0
["int", "div"],                // pipe1
["br"],                        // pipe2
["int"],                       // pipe3
["float", "f2i"],              // pipe4
["float", "faddsub"]],         // pipe5
]

issue_queue_to_pipe_map: [ [ "alu0_iq" , "0"], [ "alu1_iq" , "1"], [ "alu2_iq" , "2"], [ "alu3_iq" , "3"], [ "fpu0_iq" , "4"], [ "fpu1_iq" , "5"], ]

dispatch_queue_to_issue_queue_map: [ [ "disp0", "alu0_iq"], [ "disp1", "alu1_iq"], [ "disp2", "alu2_iq"], [ "disp3", "alu3_iq"], [ "disp4", "fpu0_iq"], [ "disp5", "fpu1_iq"], ]


### Mapping of 2 -> N -> M
* 2 Dispatch queues
* 3 Issue queues
* 6 pipes

pipelines: [ ["int", "mul", "i2f", "cmov"], // pipe0 ["int", "div"], // pipe1 ["br"], // pipe2 ["int"], // pipe3 ["float", "f2i"], // pipe4 ["float", "faddsub"]], // pipe5 ]

issue_queue_to_pipe_map: [ [ "alu0_iq" , "0", "1"], [ "alu1_iq" , "2", "3"], [ "fpu_iq" , "4". "5"] ]

dispatch_queue_to_issue_queue_map: [ [ "disp0", "alu0_iq", "alu1_iq"], [ "disp1", "fpu_iq"] ]



Of course you can get fancy with the yaml here if you want or create your own "language" to get around the 3 deep nested parameter.  Up to you.

AaronGChan commented 5 months ago

@klingaard I'm working through the changes we discussed yesterday around making each execution unit a generic name and not bounding an execution unit type to it. However, when it comes to branch unit, I run into the issue when testing branch misprediction. https://github.com/riscv-software-src/riscv-perf-model/blob/ce29ae25e3e026db1c4e36f48068e68efef2d091/test/sim/CMakeLists.txt#L74 The parameter for branch misprediction is tied to "br*", which doesn't exist anymore. For the automated testing, a solution I have currently is just to check based on the .yaml passed, I specify which "exe" is a branch, so in bigcore it could be "exe8" and smallcore it could be "exe3".

In general though are we ok with this change, it gives more burden on the user to make sure they define everything and map everything correctly. For example this assert now fails given we don't have a specific excution unit group anymore, which I can delete if we want to move forward with this change, but wanted to double check we're ok with this approach. Or do we want to make an exclusion for branches and have them specifically defined. https://github.com/riscv-software-src/riscv-perf-model/blob/ce29ae25e3e026db1c4e36f48068e68efef2d091/core/ExecutePipe.cpp#L51

klingaard commented 5 months ago

Regarding the random branch prediction parameter, there's no harm in doing this:

 -p top.cpu.core0.execute.exe*.params.enable_random_misprediction 1

This, of course, does not address the assert that you see. For that, I suggest that the assertion be removed and instead of checking for the node name, look at the pipes assigned. If the queue has a BR pipe, THEN it should pay attention to this parameter. Otherwise, it's ignored by the pipe.

klingaard commented 5 months ago

One other piece of cleanup -- I think you can remove the dispatch section in the arches/isa_json/*.json files.

klingaard commented 5 months ago

@AaronGChan you ready for me to merge this?

AaronGChan commented 5 months ago

@AaronGChan you ready for me to merge this?

Yep I am, was waiting to see if any other comments. I have the email drafted to send to sig as well once we merge.

klingaard commented 5 months ago

Alright! Let's do it. If someone has questions/issues with this, then another PR can be posted.

riscv-software-src / riscv-perf-model

Issue Queue Implementation #140

Mapping of a 1 -> 1 -> 1