pulp-platform / snitch_cluster

An energy-efficient RISC-V floating-point compute cluster.
https://pulp-platform.github.io/snitch_cluster/
Apache License 2.0
48 stars 46 forks source link

Cluster wrapper generation fails for cluster with `xssr` fully disabled #101

Closed otoomey closed 6 months ago

otoomey commented 6 months ago

Due to a bug in util/clustergen/cluster.py it is currently impossible to instantiate a cluster with no xssr cores. This is because num_ssrs_max must be equal to at least 1 for the following template to produce valid system verilog:

localparam snitch_ssr_pkg::ssr_cfg_t [${cfg['num_ssrs_max']}-1:0] SsrCfgs [${cfg['nr_cores']}] = '{
...
}

A simple one line fix to cluster.py appears to solve the issue:

self.cfg['num_ssrs_max'] = max(len(core['ssrs']) for core in cores)
# to
self.cfg['num_ssrs_max'] = max(max(len(core['ssrs']) for core in cores), 1)

I don't know enough about how the xssr configuration works to determine if my fix is sufficient -- I haven't managed to get the CI test suite to fully pass. At least the following fails.

/workspaces/stitch_cluster/target/snitch_cluster/sw/tests/build/alias.elf test failed

Given that xssr is disabled it seems reasonable that some might fail. Unfortunately, my laptop doesn't have enough memory to run the full suite in a reasonable amount of time.

otoomey commented 6 months ago

@colluca is there a questions/discussion page or board for this project?

colluca commented 6 months ago

@otoomey thanks for the report :)

We don't have a discussion page for the project, but this was the right place to report the issue.

Could you open a PR with the suggested fix? I will take it from there.

otoomey commented 6 months ago

@colluca I wanted to ask if the direct accelerator write register datapath (via result_select) in snitch_fp_ss is legacy code or a placeholder for something currently being worked on, and wasn't sure where to post my question.

paulsc96 commented 6 months ago

Hi,

I seemed to have missed this discussion. Note that this is not a bug, this is intentional as a zero parameter here would break parameterization in all kinds of ways.

The correct way to disable SSRs is to... well... disable SSRs using the xssr parameter, in which case the SSR count parameter is irrelevant. I am pretty sure your fix will break things, as this was a very deliberate choice on my side.

otoomey commented 6 months ago

Hello Paul, If you try to build using the following cluster config you will see that disabling xssr using just the xssr parameter currently produces invalid verilog.

Perhaps there is another way to disable xssr that I haven't seen/missed?

paulsc96 commented 6 months ago

Hi,

I think I misunderstood what you meant here. Indeed, a num_ssrs_max of 0 would lead to invalid verilog and your fix is correct, but this has nothing to do with xssr fundamentally. It just happens to be the case that setting xssr to false for all cores triggers the bug; setting it to false for individual cores works fine.

paulsc96 commented 6 months ago

Fixed in #106.