Closed timsherwood closed 9 years ago
I was originally thinking that we could implement a "syncmem" as only being addressed by a register. Thus something like this would be legal:
```python
mem = SyncMemBlock(...)
r = Register(...)
x = mem[r]
```
That should be a sufficient requirement to be synthesized as a synchronous memory. However it might be overly restrictive. If you do this:
```python
mem = SyncMemBlock(...)
r = Register(...)
w = WireVector()
w <<= r
x = mem[w]
```
It would say that it is not a legal SyncMemBlock (since it is addressed with a WireVector). However, no logic stands between the register and the memory, and thus (I believe) synthesis from the generated Verilog would be fine turning this into a SyncMem.
Options:

1) Be conservative and require the address to be a Register. Note that the register value can be passed around through functions just fine; it is only when you do assignments through WireVectors that it is an issue.
2) Add a deep new assertion (a dataflow analysis) into core to check that the value flows directly from a register to the memory. This would be more intellectually interesting (good for papers), but we would need to verify that the generated Verilog actually would produce syncmems (which takes people-time).
3) Add dependent types to Python? :)
I am leaning towards (1) above to start since it is easy and will let us get stuff done ASAP, with the idea that if it proves to be too onerous we could revisit in the future.
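Option (1)'s check could be sketched as a simple type test. This is an illustrative sketch only: the `WireVector`/`Register` classes here are bare stubs standing in for PyRTL's real ones, and `check_sync_addr` is a hypothetical helper name, not an actual PyRTL function.

```python
class WireVector:
    """Stub standing in for PyRTL's WireVector."""
    pass

class Register(WireVector):
    """Stub standing in for PyRTL's Register."""
    pass

def check_sync_addr(addr):
    # Option (1): conservatively require the address to *be* a Register.
    # Passing the register through functions is fine (the object keeps
    # its type), but assigning it onto an intermediate WireVector via
    # <<= produces a plain WireVector, which this check rejects.
    if not isinstance(addr, Register):
        raise TypeError('SyncMemBlock must be addressed by a Register, '
                        'got %s' % type(addr).__name__)

r = Register()
check_sync_addr(r)      # ok: the address is a register

w = WireVector()        # in real PyRTL: w <<= r (w is still a WireVector)
try:
    check_sync_addr(w)  # rejected, even though no logic sits between r and w
    rejected = False
except TypeError:
    rejected = True
```

This shows why the conservative check is easy but overly strict: it cannot tell a wire that merely aliases a register from one driven by real logic.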
I agree with option 1. I think the above example (and variations of it) are the only place where we would have this issue. Going with option 1 will not result in meaningful loss of functionality. Also, with a good error message, I think it should be easy for the user to resolve the issue. Also, as you said, we could revisit it in the future if this becomes a problem. (it is easy to loosen requirements, not easy to tighten)
Is there a reason you're not considering making the register implicit? If `SyncMemBlock` exclusively appears in the pattern you indicated (register a value, use that as address), then you could have it addressed in the same way as an async `MemBlock`, but registers are automatically generated to hold the address and data. There isn't much of a sacrifice in clarity, because the user explicitly chooses a `SyncMemBlock` instead of a `MemBlock`. It actually makes the separation cleaner: there's a semantic difference in the two block types, not just an added restriction on how you use it.
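The semantic difference between the two read styles can be seen in a toy cycle-by-cycle model. This is a plain-Python sketch, not PyRTL code; `async_read` and `SyncRead` are made-up names for illustration.

```python
# a tiny 4-entry memory
mem = {0: 10, 1: 11, 2: 12, 3: 13}

def async_read(addr):
    # asynchronous (combinational) read: value available the same cycle
    return mem[addr]

class SyncRead:
    """Synchronous read: the address is captured in an implicit register,
    so the data appears one cycle after the address is presented."""
    def __init__(self):
        self.addr_reg = 0          # implicit address register, reset to 0
    def step(self, addr):
        out = mem[self.addr_reg]   # read uses *last* cycle's address
        self.addr_reg = addr       # register the new address at the edge
        return out

sync = SyncRead()
addrs = [1, 2, 3]
async_vals = [async_read(a) for a in addrs]  # same-cycle: [11, 12, 13]
sync_vals = [sync.step(a) for a in addrs]    # one cycle late: [10, 11, 12]
```

The one-cycle skew in `sync_vals` is exactly the behavior an implicit-register `SyncMemBlock` would give `mem[foo]`, which is the crux of the disagreement below.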
That is a good question. One reason I moved away from it is because I really think of memories as async. When you write `mem[foo]` it naturally means to me: take the value of `foo` and use it to index `mem` immediately (not after one cycle). Often I want to go back and look at the design to figure out if I can move logic around to make it a syncmem instead. In this way I can design something that works, and then change the memories to "syncmems" to help me check that I did it right. The smaller semantic difference both helps and hurts.

The other reason I did it this way is simply because I have not seen anyone else try to do it this way -- so I wanted to see what would come from it... the answer thus far is I don't know. :)
Sherwood posted (in the wrong thread):
One option I am now considering (after reading this again) is "why not both":
- `m = MemBlock(addrwidth=2, bitwidth=3, checksync=True)` -- like a normal MemBlock, but checks that addresses are registers
- `m = SyncMemBlock(addrwidth=2, bitwidth=3)` -- adds "register" functionality in automatically?
Eh, I'm not so sure about that:
There should be one-- and preferably only one --obvious way to do it.
https://www.python.org/dev/peps/pep-0020/
I don't see the big benefit from having two different (but almost the same) methods of doing the same thing. It'll mean that there will be more code to maintain, and it doesn't confer any particular benefit.
I am slightly for the `SyncMemBlock` class structure (it seems less arbitrary), though I would still like the register to live in the logic block underneath. I really don't like `checksync`; that is much messier than a new class (even though I would rather have fewer classes).
`MemBlock` always being asynchronous and `SyncMem` having an implicit register seems like the best setup. There is exactly one, clear way to do each thing, and it does not introduce new checks or errors for misuse of SyncMems. I empathize with the desire to have `mem[foo]` always be asynchronous, but that desire might be coming from the software world, where one sees it as an array access. On your point about changing a design to be synchronous, I'm not sure why adding a register in front of a `MemBlock` is preferable to changing `MemBlock` to `SyncMem`.
Now that I've said that, I think the "automated register detection and optional synchronous memory synthesis" is a cool HDL design exploration. Whether or not "interesting" is enough motivation to be folded into main pyrtl is a decision above my pay grade.
Hmmm... If we want to be scientific I think we need to design some things both ways and see what happens. I could make the default behavior for `MemBlock` be to generate a memory that checks that it is indexed by a register (and throws `PyrtlError` if it is not). A flag would be available to turn off the check (and thus indicate that it will likely synthesize as an async memory). While there are some very good arguments against this laid out above, here are my refined arguments for the "unified" approach:
That said, a big point against is:
If we include the "checker", and people still really want SyncMem, I could be convinced to have both (MemBlock and SyncMemBlock).
If we do abandon the "checker" approach then I think we need to move to a `MemBlock` which is sync, and an `AsyncMemBlock`, which makes it clear that sync is the preferred embodiment. Varun (now a professional hardware slinger) says this: "it is best practice to always make sure your block memory/fifos read/write operations start on a clock edge."
I think that would be a good change regardless of which approach we go with. Even if we do use the checker, the only way to disable the check should be to use an `AsyncMemBlock`.
Here is an interesting case from our unit tests. The address is always ready on the clock edge, but there is no clean way to refactor the register into the memory. Likewise, the simple "check if the input is a register" also fails because the input is a selection (slice) of the register.
```python
def test_basic_true_condition_memread(self):
    m = pyrtl.MemBlock(addrwidth=2, bitwidth=3, name='m')
    i = pyrtl.Register(bitwidth=3, name='i')
    o = pyrtl.WireVector(bitwidth=2, name='o')
    i.next <<= i + 1
    addr = i[0:2]
    with pyrtl.ConditionalUpdate() as condition:
        with condition(i < 4):
            m[addr] |= i
        with condition.fallthrough:
            with condition(addr[0]):
                # this should happen every time because no
                # state is being updated!
                o <<= m[addr]
    self.check_trace('i 01234567\no 00000123\n')
```
Oh damn. I didn't think of slices. Looks like we need to make a function that can see through those ops (and the concat one as well, which will be critical to making post-synthesis syncmem work).
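One way to "see through" those ops is a small backward walk from the address wire toward its producers. This is a sketch over a simplified netlist representation (a dict from wire name to `(opcode, args)`); the opcode letters and the `ready_at_clock_edge` helper are illustrative stand-ins, not PyRTL internals.

```python
# Ops that are transparent for timing purposes: a select/slice ('s'),
# a concat ('c'), or a plain wire assignment ('w') does not delay or
# compute anything, so readiness passes straight through them.
TRANSPARENT = {'s', 'c', 'w'}

def ready_at_clock_edge(wire, producer):
    """Return True if 'wire' is a register output, possibly seen
    through any chain of transparent ops."""
    op = producer.get(wire)
    if op is None:
        return False                  # e.g. a raw input: not registered
    opcode, args = op
    if opcode == 'r':
        return True                   # register output: ready at the edge
    if opcode in TRANSPARENT:
        return all(ready_at_clock_edge(a, producer) for a in args)
    return False                      # any real logic op breaks the property

# Model of the unit test above: addr = i[0:2] where i is a register;
# 'x' shows a wire fed by actual logic for contrast.
producer = {
    'i':    ('r', []),                # register
    'addr': ('s', ['i']),             # slice of the register
    'x':    ('+', ['i', 'i']),        # arithmetic: not edge-ready
}

addr_ok = ready_at_clock_edge('addr', producer)   # True
x_ok = ready_at_clock_edge('x', producer)         # False
```

With this, the `addr = i[0:2]` case from the test passes the check even though `addr` is not itself a `Register`.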
Okay, the above problem is fixed -- I added a true dataflow analysis to the sanity check that allows things like concat, split, and wire, but not any actual logic operations, in the memory index. It's actually pretty neat how easy it was (although now that I think about it, this could be done even faster and easier with toposort!). Oh well, it only has to happen once. The code is as below:
```python
# propagate "async" behavior through the network
while async_set_is_growing:
    for net in self.logic:
        if net.op in async_prop:
            if any(w in async_set for w in net.args):
                async_set.update(net.dests)
    async_set_is_growing = len(async_set) > last_async_set_len
    last_async_set_len = len(async_set)

# now do the actual check on each memory operation
for net in self.logic:
    if net.op == 'm':
        is_async = net.args[0] in async_set
        if is_async and not net.op_param[1].asynchronous:
            raise PyrtlError(
                'memory "%s" is not specified as asynchronous but has an index '
                '"%s" that is not ready at the start of the cycle'
                % (net.op_param[1].name, net.args[0].name))
```
So how do you turn such a check off? Well, you declare your memory as:

```python
MemBlock(bitwidth, addrwidth, asynchronous=True)
```

The `asynchronous` flag is set to False by default. Interestingly, pretty much 99% of the unit tests were synchronous already and required no refactoring! I tried to make the flag very clear and long enough to be obnoxious, to encourage sync mems. The only test that did not pass had an "and" operation in the index, and I went ahead and marked it as async. This approach is not set in stone, but I want to try it out, and since I did all the coding I don't feel too bad :)
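The fixed-point propagation above can be exercised standalone on a toy netlist. The `Net` namedtuple and single-letter op codes here are simplified stand-ins for PyRTL's internal representation, and the seeding of the initial async set is an assumption for the sake of a runnable example.

```python
from collections import namedtuple

Net = namedtuple('Net', ['op', 'args', 'dests'])

# ops through which "not ready at the clock edge" propagates; a register
# op ('r') stops propagation because its output is ready by definition
async_prop = {'w', 's', 'c', '&', '|', '+'}

logic = [
    Net('r', ['nxt'], ['i']),          # i is a register output
    Net('s', ['i'], ['addr']),         # addr = i[0:2]: still synchronous
    Net('&', ['inp', 'i'], ['bad']),   # bad depends on a raw input via logic
]

# seed: wires used but never driven by a net (e.g. block inputs) are async
driven = {d for net in logic for d in net.dests}
used = {a for net in logic for a in net.args}
async_set = used - driven              # here: {'nxt', 'inp'}

# propagate "async" behavior through the network to a fixed point
last_len = -1
while len(async_set) != last_len:
    last_len = len(async_set)
    for net in logic:
        if net.op in async_prop and any(a in async_set for a in net.args):
            async_set.update(net.dests)

addr_is_async = 'addr' in async_set    # False: ready at the clock edge
bad_is_async = 'bad' in async_set      # True: raw input through a logic op
```

Note how `i` stays out of the async set even though its `.next` input is async: the register op blocks propagation, which is exactly what makes a register-fed slice a legal sync-memory index.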
Have you checked that the code this creates is interpreted correctly by Verilog synthesis tools?
Need to modify the memories to support both sync and async memories. Sync memories will only be addressable directly by a register.