Should Migen generate explicit Verilog for mismatched bit vector lengths (rather then relying on implicit Verilog behaviour?)

mithro commented 6 years ago

@bunnie has been working with LiteX / Migen recently and mentioned he thought following issue would be a big source of bugs in LiteX / Migen designs. I would like to get @sbourdeauducq 's (and others) thoughts on this and if we want to fix it in some way?

Mismatches in bit vector lengths is kicked down the road to Verilog. So,
a = Signal(31)
b = Signal(16)
self.comb += a.eq(b)
Just generates the verilog "assign a = b".

It's best to confirm if the compiler does what you think it does in this case. There is a spec but there are extensions that can, for example, allow for signed numbers.

Confirmed that Vivado's convention is:

if lvalue is wider than rvalue, then LSB-align rvalue and zero-pad

if lvalue is narrower than rvalue, then lvalue gets the LSB's of rvalue

So
wire [15:0] a;
wire[7:0] b; 
assign a = b;
is implicitly
assign a[15:0] = {8'b0, b};
and
wire [7:0] a;
wire[15:0] b;
assign a = b;
is implicitly
assign a[7:0] = b[7:0];

My personal feeling is that Migen should be generating the explicit output it is after rather then relying on Verilog implementations to be sane....

Thoughts?

jordens commented 6 years ago

This is definitely not the only area where we are relying on Verilog implementations being sane. How do you judge between reasonable and dangerous expectation?

bunnie commented 6 years ago

My feeling on this is it helps immensely with code readability. One of the problems of migen is that signal widths are largely implicit and inter-block interfaces aren't clearly defined as you can just yank into any module by referencing an internal signal, and I find myself chasing down ratholes to try and figure out what the original width of a signal was.

As I'm trying to cobble together more complex things in migen I routinely find myself referring to the generated verilog to try and sanity check these problems.

Here is a prime example, I have to specify an address for a DMA target for a DMA engine via a CSR. This address is word-aligned by one intermediate block and then ultimately burst-aligned by the DDR engine. Some blocks are taking the address and shifting them left to do the alignment; others are taking the address and masking out the LSB's (no shifting) to do the alignment. Other blocks are just relying on the verilog practice to LSB-align everything and allowing the unaligned MSB's to silently disappear.

As a result I'm constantly having problems trying to figure out what the actual alignment is of the data being written into the CSR registers by the firmware (is this a byte aligned? word aligned? burst aligned? or hybrid of the set??).

If a migen block always included the intended bit-width suffixes during signal generation, even if redundant, then at least I have one less variable to guess through to figure out problems like this. Based on what I've seen of the Python this knowledge is computed and available at codegen time. I think it would be fine for example if even if like-width signals were explicit, e.g.

wire [7:0] a; wire [7:0] b;

assign a[7:0] = b[7:0];

it's not illegal, and just as legible, especially when the signal declarations are thousands of lines away from the signal uses inside the verilog file, and in theory it shouldn't change anything in the compiled designs, just a readability improvement.

thanks,

-b.

On 02/15/2018 12:44 AM, Robert Jördens wrote:

This is definitely not the only area where we are relying on Verilog implementations being sane. How do you judge between reasonable and dangerous expectation?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/m-labs/migen/issues/102#issuecomment-365668275, or mute the thread https://github.com/notifications/unsubscribe-auth/ABHT9-D8Xa_KyWkCpDf31YaZLHPtzU-Eks5tUw1RgaJpZM4SFlBU.

-- ^'~*-,._.^'~-,._.^`'~-,..^`'~*-,..^'~*-,._.^'~-,._.^`'~-,._.^`'

sbourdeauducq commented 6 years ago

Migen should be generating the explicit output it is after rather then relying on Verilog implementations to be sane....

FPGA tools have many problems, but this isn't one of them. This is a very common case which is even present with a <= a + 1'b1 and I haven't found any FPGA tool bug around there.

My feeling on this is it helps immensely with code readability.

I disagree, I think this clutters the Verilog output. Manually-written Verilog doesn't do that.

One of the problems of migen is that signal widths are largely implicit

How are they implicit? You mean they are not specified again at every statement?

and inter-block interfaces aren't clearly defined as you can just yank into any module by referencing an internal signal,

You can, but this doesn't mean you should. It is up to you to define clean interfaces between your modules.

and I find myself chasing down ratholes to try and figure out what the original width of a signal was.

print(len(signal)) or do more elaborate things with len(signal)? Define a custom class/type for addressing signals based on the number of bytes that one LSB indexes into, and add some code-generation-time sanity checks and fancy methods?

sbourdeauducq commented 6 years ago

you can just yank into any module by referencing an internal signal

This feature is actually very useful e.g. in http://github.com/m-labs/microscope

bunnie commented 6 years ago

On 02/15/2018 10:46 AM, Sébastien Bourdeauducq wrote:

FPGA tools have many problems, but this isn't one of them. This is a very common case which is even present with |a <= a + 1'b1| and I haven't found any FPGA tool bug around there.

I find that to be bad practice, but that's just me...

My feeling on this is it helps immensely with code readability.
I disagree, I think this clutters the Verilog output. Manually-written Verilog doesn't do that.

This statement would make more sense if the output of migen wasn't a single 20,000 line verilog file. It's nice that for the blink demo the migen output looks almost hand-written but when debugging a 20k line verilog file I'll happily sacrifice looking human-written for being more human-readable.

One of the problems of migen is that signal widths are largely implicit
How are they implicit? You mean they are not specified again at every statement?

implicit as opposed to verilog where in the module interface you're supposed to spell out the inputs, outputs, and their widths.

Of curse, you don't need to re-specify it at every statement (and certainly not on the python side), but the lack of enforced centralized signal declaration leads to a lot of reaching in between layers.

So it's helpful to have the machine-generated verilog re-confirm what the intentions were in case you are in an ambiguous situation, rather than single-stepping through the Python to try and figure things out.

and inter-block interfaces aren't clearly defined as you can just
yank into any module by referencing an internal signal,
You can, but this doesn't mean you should. It is up to you to define clean interfaces between your modules.

In this case, I'm trying to use other people's modules. I think migen seems to make a lot of sense to people who wrote their code but the lack of mandatory interface templating makes it a bit write-once in terms of reusability. What I'm asking for is hints to help me re-use third party code that may have been written to an arbitary user's "clean interface" standard.

and I find myself chasing down ratholes to try and figure out what
the original width of a signal was.
|print(len(signal))| or do more elaborate things with |len(signal)|? Define a custom class/type for addressing signals based on the number of bytes that one LSB indexes into, and add some code-generation-time sanity checks and fancy methods?

Again, this is mostly code I haven't written. Yes, I could go in and try to "fix" their implementation, but where things are falling apart is when you're re-using someone else's code in your design. Because people can define their own signal interface standards, they do, and this makes code re-use between unrelated parties a bit challenging in these cases especially when you're just relying on the code as documentation.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/m-labs/migen/issues/102#issuecomment-365811272, or mute the thread https://github.com/notifications/unsubscribe-auth/ABHT95vwilLnFAew9hNGzL0-ACLGnvucks5tU5qIgaJpZM4SFlBU.

-- ^'~*-,._.^'~-,._.^`'~-,..^`'~*-,..^'~*-,._.^'~-,._.^`'~-,._.^`'

bunnie commented 6 years ago

yes, it is incredibly useful, and also incredibly used!

-b.

On 02/15/2018 10:48 AM, Sébastien Bourdeauducq wrote:

you can just yank into any module by referencing an internal signal
This feature is actually very useful e.g. in http://github.com/m-labs/microscope https://github.com/m-labs/microscope

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/m-labs/migen/issues/102#issuecomment-365811497, or mute the thread https://github.com/notifications/unsubscribe-auth/ABHT93M03QThFYxlnVqyH4JSfrAca0Ryks5tU5r2gaJpZM4SFlBU.

-- ^'~*-,._.^'~-,._.^`'~-,..^`'~*-,..^'~*-,._.^'~-,._.^`'~-,._.^`'

sbourdeauducq commented 6 years ago

I find that to be bad practice, but that's just me...

In that case, it should be addressed in Migen as well, maybe in a way similar to Rust's as u... construct (but still, we have to be careful that this doesn't introduce too much clutter, e.g. in arithmetic operations). This requires finding a good Python syntax hack :)

Also, {carry, value} <= value + 1'b1; should be supported.

This statement would make more sense if the output of migen wasn't a single 20,000 line verilog file. It's nice that for the blink demo the migen output looks almost hand-written but when debugging a 20k line verilog file I'll happily sacrifice looking human-written for being more human-readable.

I would actually like to change that and support splitting the output into multiple modules and Verilog files. But this is major work and I do not have time for it at the moment.

Is there a specific module from someone that you are trying to reuse and find the interface problematic?

bunnie commented 6 years ago

I'm currently trying to figure out how the DMA engine inside litevideo is programmed in terms of setting the address slots.

Newer firmware seems to suggest it's (offset from top of RAM, e.g. 0x0), older firmware seems to be it's (absolute addressing, e.g. 0x40000000 + offset).

I have a problem right now on litevideo where enabling DMA hard crashes the CPU and thus I can't even get a scope trace out.

So unfortunately I don't have that tool to debug this, which is why I'm wading through the python and verilog to try and figure out the ground truth on the implied width of the DMA slots.

thanks,

-b.

On 02/15/2018 11:15 AM, Sébastien Bourdeauducq wrote:

I find that to be bad practice, but that's just me...
In that case, it should be addressed in Migen as well, maybe in a way similar to Rust's |as u...| construct (but still, we have to be careful that this doesn't introduce too much clutter, e.g. in arithmetic operations). This requires finding a good Python syntax hack :)

Also, |{carry, value} <= value + 1'b1;| should be supported.
This statement would make more sense if the output of migen wasn't a
single 20,000 line verilog file. It's nice that for the blink demo
the migen output looks almost hand-written but when debugging a 20k
line verilog file I'll happily sacrifice looking human-written for
being more human-readable.
I would actually like to change that and support splitting the output into multiple modules and Verilog files. But this is major work and I do not have time for it at the moment.

Is there a specific module from someone that you are trying to reuse and find the interface problematic?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/m-labs/migen/issues/102#issuecomment-365814745, or mute the thread https://github.com/notifications/unsubscribe-auth/ABHT9yLDSH56nXHL6ESrx4eA3HZOJD06ks5tU6FRgaJpZM4SFlBU.

-- ^'~*-,._.^'~-,._.^`'~-,..^`'~*-,..^'~*-,._.^'~-,._.^`'~-,._.^`'

sbourdeauducq commented 6 years ago

I have a problem right now on litevideo where enabling DMA hard crashes the CPU and thus I can't even get a scope trace out.

With microscope (and probably litescope with the proper options) you just need to get a second UART channel, a CPU is not needed. Also you can perhaps try to simulate the litevideo design.

FWIW, if the DMA core is connected to the SDRAM bus directly, bypassing L2 (is that the LASMI system, that Florent renamed in his fork to LiteDRAM?), it doesn't matter whether you use 0x40000000 or 0x00000000 as reference address. 0x40000000 is just where the L2 cache is mapped in the CPU address space.

mithro commented 6 years ago

FYI - The DMA engine in LiteVideo has changed between absolute addressing and start of the DDR addressing at some point in the past.

bunnie commented 6 years ago

Oh, one thought -- I suspect part of the reason Vivado produces much better, more optimized output from migen code vs the native Vivado tools is that migen creates a single flat file.

It seems Vivado has forgotten how to optimize across module boundaries -- or perhaps there is a flag I'm missing, but I spent a bit of time trying to reduce the size of some vivado designs and in one instance I managed to identify large portions of dead code that was still being compile into gates. Usually tying clock and all inputs to ground and/or tying reset to active is enough to get ISE to cull a dead module but I haven't observed vivado to be so smart, I had to go into the submodule and comment out the block. Could be just me using the wrong compilation flags but I suspect the compiler does better/faster on a single large flat verilog file!

It's harder for analysis and post-processing, but if you do implement hierarchical output please include a flag or option that allows flattening in case my observation about the Vivado tools is correct. That way you could use hierarchical output to aid with debugging/readability but for "production" code you can generate a flat netlist file that's supposedly identical but optimizes better.

thanks,

-b.

On 02/15/2018 11:15 AM, Sébastien Bourdeauducq wrote:

I find that to be bad practice, but that's just me...
In that case, it should be addressed in Migen as well, maybe in a way similar to Rust's |as u...| construct (but still, we have to be careful that this doesn't introduce too much clutter, e.g. in arithmetic operations). This requires finding a good Python syntax hack :)

Also, |{carry, value} <= value + 1'b1;| should be supported.
This statement would make more sense if the output of migen wasn't a
single 20,000 line verilog file. It's nice that for the blink demo
the migen output looks almost hand-written but when debugging a 20k
line verilog file I'll happily sacrifice looking human-written for
being more human-readable.
I would actually like to change that and support splitting the output into multiple modules and Verilog files. But this is major work and I do not have time for it at the moment.

Is there a specific module from someone that you are trying to reuse and find the interface problematic?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/m-labs/migen/issues/102#issuecomment-365814745, or mute the thread https://github.com/notifications/unsubscribe-auth/ABHT9yLDSH56nXHL6ESrx4eA3HZOJD06ks5tU6FRgaJpZM4SFlBU.

-- ^'~*-,._.^'~-,._.^`'~-,..^`'~*-,..^'~*-,._.^'~-,._.^`'~-,._.^`'

jordens commented 6 years ago

I am pretty sure that vivado does the relevant stages of the optimization on the flat netlist. It may do something early on when synthesizing but I don't think that's impacting it. I would actually go as far as saying that relying on implicit verilog behavior and not trying to outsmart the synthesizer is a good thing here.

jordens commented 6 years ago

Supporting a hard interfaces between modules and making it easy and fast to use them has a lot of benefits (see #70). I'd like to see that implemented. But I wouldn't enforce it. It collides with lot of really important metaprogramming use cases in migen.

bunnie commented 6 years ago

So just to confirm, this would in fact be correct programming:

hdmi_in1_frame_overflow_read 0 hdmi_in1_dma_frame_size_read 3f4800 hdmi_in1_dma_slot0_status_read 1 hdmi_in1_dma_slot0_address_read 80000 hdmi_in1_dma_slot1_status_read 1 hdmi_in1_dma_slot1_addess_read 474800 hdmi_in1_dma_ev_status_read 0 hdmi_in1_dma_ev_pending_read 0 hdmi_in1_dma_ev_enable_read 3

hdmi_in1_dma_slot0_address_read says 0x80000, but actually data is being translated and written to 0x40080000 automagically?

-b.

On 02/15/2018 12:14 PM, Tim Ansell wrote:

FYI - The DMA engine in LiteVideo has changed between absolute addressing and start of the DDR addressing at some point in the past.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/m-labs/migen/issues/102#issuecomment-365820960, or mute the thread https://github.com/notifications/unsubscribe-auth/ABHT93C8qlO1p54omFcW9G16L6cUuUJAks5tU68vgaJpZM4SFlBU.

-- ^'~*-,._.^'~-,._.^`'~-,..^`'~*-,..^'~*-,._.^'~-,._.^`'~-,._.^`'

enjoy-digital commented 6 years ago

A bit of context: @bunnie is trying to get a a design with 2 hdmi input working. This is something that has already been validated, but @bunnie's case is a bit specific since only second hdmi is doing DMA. I haven't found time yet to help him with this and he's trying to debug things. One interesting thing is that things that seems clear for us are maybe not that natural without documentation.

@bunnie: the DMA is only able to target the DRAM. So the way you define the addresses depends from which point of view you are seeing things. If your DRAM is only used by some DRAM DMA engines, then yes you can use relative DRAM addresses. But if you want to access if from your CPU (writing a color pattern for example in your framebuffer, you have to use absolute DRAM addresses. (and DRAM is mapped to 0x40000000 so translation is natural).

@bunnie: to help debug, you can also generate litescope as a single core (will include the UART + analyzer) and integrate that in your design. This allow you to debug even when CPU system is crashed. This woud be similar to microscope.

mithro commented 6 years ago

So I think this bug report has gotten a bit off topic here. Can we loop back to just the idea around the question;

Should Migen generate explicit Verilog for mismatched bit vector lengths (rather then relying on implicit Verilog behaviour?)

The summary so far seems to be;

@jordens Seems to be against the idea as it seems like a normal thing to do in Verilog and we rely on Verilog to be somewhat sane in many cases.
@bunnie Thinks that doing this would help reading the generate verilog - specially in larger design cases.
@sbourdeauducq Seems open to the idea of sending Migen to support some type of explicit stuff around this "maybe in a way similar to Rust's as u... construct" but doesn't know what the Python syntax might look like?
@enjoy-digital Has not expressed an opinion yet.

FYI From what I can see both Vivado and ISE generate warnings about both the truncation and extension cases? There also seems to be a huge amount of complaints about this warning existing.

WARNING:HDLCompiler:413 - "my_module.v" Line 123: Result of 25-bit expression is truncated to fit in 24-bit target.

According to some smug people on the internet the "rules are rather explicit in 1364-2005 LRM". This was countered with;

All operands are widened to the same width as the widest - including the left-hand side. Then, do the arithmetic in that width.
Then assign the result to the left-hand side, dropping more-significant bits if necessary.

BUT... it isn't quite that simple. The main difficulties are:

(1) signed vs. unsigned. If ALL the right-hand side operands are signed, then widening is performed by sign extension rather than by zero-fill. If ANY right-hand side operand is unsigned, then ALL widening and arithmetic is unsigned. This behaviour is NOT affected by whether the left-hand side is signed or unsigned.

(2) self-determined expressions. There are some places in Verilog where an expression or operand is isolated - it is in a "self-determined context" - and its width is not affected by, and does not affect, the surrounding expression. All detailed fully in IEEE Std.1364-2005.

There have been many discussions of this issue here. A glance at Steven Sharp's posting history over the past two years will produce some gems.

I'm actually very much leaning towards thinking that Migen should generate the explicit assignments -- or atleast have a flag which enables this behaviour?

@sbourdeauducq Would you be willing to merge it if we added this functionality?

sbourdeauducq commented 6 years ago

The Verilog rules are convoluted and a footgun, but synthesizers implement them correctly and Migen makes use of them (it works similarly to MyHDL in that respect, i.e. http://www.jandecaluwe.com/hdldesign/counting.html).

I'm actually very much leaning towards thinking that Migen should generate the explicit assignments -- or atleast have a flag which enables this behaviour? Would you be willing to merge it if we added this functionality?

Maybe, but you have to think carefully about things such as:

how does it works with signed signals?
does it really help since the current Migen syntax also makes uses of implicit extensions/truncations? Is it consistent?
who will test thoroughly the new code on all ARTIQ core devices and fix regressions?

mithro commented 6 years ago

From the counting page you linked too, it says;

The convertor will generate the required sign extensions, resizings and type castings. It therefore automates this tedious and error-prone bookkeeping task for you.

This is what Migen is doing, correct?

Migen understands the sign extension / resizing / type casting -- it's the human who is reading the generated code from Verilog that doesn't understand it correctly.

So, we are asking Migen to be explicit about the required sign extensions, resizings and type castings to help human readers. If the Verilog implementation is correct, there should be no change in behaviour in the output Verilog.

jordens commented 6 years ago

I am ok with merging something like that because it can be made optional. But my guess is also that readability will actually suffer and that being explicit is complicated. But the code is there and we already implement those rules in the simulator.

whitequark commented 6 years ago

who will test thoroughly the new code on all ARTIQ core devices and fix regressions?

The fact that this question needs to be asked points to serious deficiencies in the design of migen. Namely, why isn't there a testbench that compares the Migen simulator with a Verilog simulator? Should be easy given they both can generate a .vcd.

(And of course, you can generate testcases using a quickcheck-like tool, hypothesis in case of Python.)

sbourdeauducq commented 6 years ago

Even with good tests, there can be unexpected problems followed by long yak-shaving sessions, so this question needs to be asked in any case. For example, I can imagine that Vivado might not optimize well some new explicit sign extension code, and timing gets broken. But yes, good idea.

mithro commented 6 years ago

Just FYI - @cliffordwolf has a tool called vloghammer which he uses to find bugs in various Verilog implementations.

whitequark commented 5 years ago

Triage: fixed in nMigen, which relies on Yosys to emit Verilog. The Yosys verilog backend is heavily tested using vloghammer.

m-labs / migen

Should Migen generate explicit Verilog for mismatched bit vector lengths (rather then relying on implicit Verilog behaviour?) #102