Closed lsteveol closed 3 years ago
Hey @lsteveol,
I need to write some docs for specifying the clock frequencies, but what you're seeing is the the current expected behavior. Clocks whose frequencies aren't specified diplomatically (you should see some stdout for the clock sinks that are) are dumbly assigned a default value (specified with this key.
In the more heavily multiclocked configurations you're going to have various things dangling off the {s,p,c,f}buses. Each of these sinks is going to add another clock to the clock group, but the name it takes on depends on the sort of crossings you've introduced in the system. Currently, in order to commute the sbus frequency to other devices attached to the sbus you'll need to use this config.
Here's an example: https://github.com/ucb-bar/chipyard/blob/dev/generators/chipyard/src/main/scala/ConfigFragments.scala#L180
This easiest fix is to use this and match on "subsystem_sbus_6" and "subsystem_sbus_7". A better solution is to write a custom matcher that uses a regex to match on "subsystemsbus[0-9]*". I can probably check in the latter thing since that will produce the expected result most of the time.
Hey @davidbiancolin ,
Thanks for the tips. I will experiment on my side. I do have an additional question. I'm seeing the following error when running some sims.
Assertion failed: 'A' channel re-used a source ID (connected at SystemBus.scala:42:55)
at Monitor.scala:44 assert(cond, message)
I found that there was an issue on the RocketChip repo that was similar https://github.com/chipsalliance/rocket-chip/issues/505
Particularly the user mentioned MultiClockCoreplex (I know it's an old issue).
This happens right out of reset when the CPUs are trying to read the bootrom. It appears to me that each of the CPUs (5 in this case) are trying to read the bootrom out reset and something in the system_bus_xbar
is converting the sources in correctly. The Assertion is triggered by on of the TLMonitors. Here is a snapshot. You can see that the a_bits_source
is sending 0x82
twice in the same transaction.
If I disable the STOP_COND, it appears that the CPUs read data from the bootrom. I'm going to try a few things out, but figured I would ask if this had been seen internally.
Thanks again.
@lsteveol after applying @davidbiancolin 's fixes, and changing new chipyard.config.WithFbusToSbusCrossingType(SynchronousCrossing())
to use an async crossing, I found that your config could run simple tests without hitting any TLMonitor assertions.
The assertion you posted above, however, does not look like the same assertions I saw, which were more obviously caused by clocking issues. Does this assertion go away as well when you apply the clocking fixes?
Hey @jerryz123 ,
Just to be sure, are you saying you saw the assertion issues when the FbusToSbusCrossingType
was Synchronous and not when it was async?
I'm also not 100% sure what the fixes @davidbiancolin was actually discussing. It seemed like his fix was mainly for the naming/freq-summary, but I honestly wasn't sure. Are you saying that the internal clocking crossings are changed with respect to what he was saying?
I see that WithTileFrequency
extends the ClockNameContainsAssignment
but I honestly don't see how that really works. In the rocket-chip repo I see that the TilePRCIDomain
has a ClockSinkNode
in which the name variable is given "core_$id"
, but I wasn't sure if that is how those are linked or not.
Do you have an example of how to apply those fixes?
Thanks
Yes, to be clear, the two changes I had to make were
1) change FbusToSbusCrossingType
from Synchronous to Async
2) Add new chipyard.ClockNameMatchesAssignment("subsystem_sbus_6", 750.0) ++
to the config, as @davidbiancolin suggested
A problem was that the subsystem_sbus_6
named clock actually drives the L2 clock group. The current behavior for clocks which don't request a frequency (like the L2 clock) is to take the PBus frequency, which in this case, is not what is intended, since the L2 should actually take the same clock as the SBus. The ClockNameMatchesAssignment
fix explicitly sets the frequency for the L2 clock group to the same as the SBus frequency.
The clock group name for each Tile contains the string "core_$id"
. Thus, WithTileFrequency
is the same as ClockNameContainsAssignment("core",
OK let me try that out. Thanks for the example.
Does ClockNameMatchesAssignment
deal with the construction of the AsyncCrossing
logic and/or how the connections are handled? I was seeing the asource/asink
blocks placed in the couplers between busses, so I was making an assumption that everything was proper from a CDC standpoint. And sims seemed to even validate this for the most part (although they were just simple simulations sending memory transactions between the various busses).
ClockNameMatchesAssignment
only sets the requested frequencies for the various clock groups. The other fragments (ex: WithRationalMemoryBusCrossing
) add the crossing logic. Unfortunately, the code currently does not automate insertion of crossing logic based on the requested frequencies, although it is conceivable that this feature may be added in the future.
Ok I have some more feedback based on your comments. So I had spent quite a bit of time validating the clock domains and I was pretty sure I had most of them correct. There were some things such as, the bootrom is connected to the CBUS, but even if there is a subsystem_cbus clock, the bootrom is still connected to the implicit clock defined. These had be a little tripped up at first, but I worked through them.
I was not really sure why I was getting a TLMonitor assertion. It didn't appear to be clocking related (meaning I didn't think I had the clocks connected incorrectly in my testbench). @jerryz123 saying that he saw issues with the FbusToSbusCrossing
as synchronous had me wonder if something was going on with that. All of my bus crossing were changed to async, but I still had issues. As an experiment I change the WithRationalRocketTiles
to WithAsynchronousRocketTiles(8,3)
and now I do not see the assertions. This testing is fairly preliminary, however I can confirm that I was seeing the assertion with the Rational and only changed to Async and they went away.
I will try to see if I can do some more debug over the next few days/weeks. I don't want to say there is a bug, as I really don't understand the hierarchy enough at the moment to make that claim, but maybe this is helpful information for you guys.
Impact: software
Tell us about your environment: Chipyard Version: branch:
dev
hash: 37415157d61ec9ccdaeb11da75028afdf990461e OS: Other: First, mad gratitude for the separate clock domains!What is the current behavior? I have the current custom Config (I have removed some non-related pieces)
The
.freq-summary
has the following. subsystem_sbus_6 and subsystem_sbus_7 are labeled as 100MHz, but it appears that these go to the L2 (in thisConfig
uration).I am also generating with
TOP=DigitalTop
, so I don't have theCHipTop
instantiating the dividers. I'm guessing I would have not seen an issue if I was using theChipTop
level.What is the expected behavior? The clocks be the correct freq.
Other information
This info isn't really required, just to help with debugging later if someone else sees something similar.
I wasn't really sure of the relationship between the clock signals on the design and the
freq-summary
. I followed them as described in the output but was seeing strange behavior.I was sending an READ AXI transaction into the
FBUS
and looking for it on an APB port on thePBUS
.At first one of the
TLMonitors
was complaining and$fatal
ing out my sim. It wasn't apparent what was going on, so I disabled the `STOP_COND.The AXI transaction would be accepted by the
FBUS
and I would actually see the APB transaction on thePBUS
but it kept reading indefinitely. It was quite strange. It could have been my clocking relationship directly, so who knows.