chili-chips-ba / openCologne

Spicing up the first and only EU FPGA chip with a flashy new board, loaded with a suite of engaging demos and examples. https://www.chili-chips.xyz/open-cologne
https://nlnet.nl/project/openCologne
BSD 3-Clause "New" or "Revised" License
46 stars 2 forks source link

Undefined behavior in synthesis depending on the bit width of variables #25

Open tarik-ibrahimovic opened 4 months ago

tarik-ibrahimovic commented 4 months ago

In the provided example if you change the bit width of counter from 6 to any other number the design stops working, UART starts falling out of sync and the reads from the memory are incorrect.

This example is supposed to write to a random location in a PCB (Olimex) mounted PSRAM, a random set of bytes, then read from the same location.

How to recreate this issue

Run build by typing

cd 3.build
make hw_all

You can see the results of writing to memory and reading from memory by running the python script, located in 4.testing, after building and uploading the bitstream. You should see something like this: image

Then change the bit width of counter on line 73 where the comment is:

   logic [5:0] counter; //**change this to 5 or 4 bits, stops working although the MSB isn't used for anything

and run the python script again, you should see something like this: image

Since there is no PSRAM verilog model in this particluar case, all the things that could have been verified through simulation have been. Nevertheless, changing the bit width shouldn't in any way have an impact like this one

pu-cc commented 3 months ago

I don't have a board right now to test it, but you should make sure to reset all counter bits, not only counter[0]:

https://github.com/chili-chips-ba/openCologne/blob/4aeeacb473494dd2ef7574faa5dfb1ff508363b1/2.Simple--1--PSRAM/1.hw/top.sv#L78

chili-chips-ba commented 3 months ago

That line will reset all counter bits, irrespective of how many they are.

pu-cc commented 3 months ago

The resulting procedural block after conversion to top.v is counter <= 1'sb0;:

    always @(posedge clk_out or negedge arst_n)
        if (arst_n == 1'b0)
            counter <= 1'sb0;
        else if (counter == 9) begin
            tick_1us_reg <= 1'b1;
            counter <= 0;
        end
        else begin
            tick_1us_reg <= 1'b0;
            counter <= counter + 1;
        end

Synthesis seems to interpret this correctly. I didn't check this at first.

Then please provide a suitable test bench so that we can find what is supposed to be wrong.

tarik-ibrahimovic commented 3 months ago

This issue can be attributed to the precision of the instantiated UART, latest commit solves it with increasing the precision of the UART. In the same repo folder, there is now a PSRAM simulation model which you can use to verify the operations of the design.

If you take a look at RTL simulation and post-PnR simulation there is no difference, but for some reason the timing of the uart_tx has been changed when changing the bit width - this is deduced by observing the read data on the serial port, which is almost always halfway correct when changing bit widths. This may be a consequence of a deeper issue mentioned here.

pu-cc commented 3 months ago

If you take a look at RTL simulation and post-PnR simulation there is no difference, [...]

It seems that some things are still missing so that I can simulate it properly. Can you please check that the files are up to date? The makefile has no targets for any simulation. The testbench complains about conflicting drivers on the inout ports. I'd like to see the same results in the simulation first, as you do, then I can get a better picture of it. Thanks!

tarik-ibrahimovic commented 3 months ago

I just pulled the repo clean and run all the steps listed here to simulate. Here's what I got for the RTL, post-synth, and post-PnR sim, in the listed order, top to bottom. Essentially the uart_tx transmits 0x0708 if it runs successfully. image

If you encounter no simulation targets in the makefile that is because the simulation is done in 2.sim as listed in the steps, and 3.build is just for synthesis and PnR.

Also the simulations are now all done in Verilator, and iverilog is not currently supported, which solves the "conflicting drivers on the inout ports".

pu-cc commented 3 months ago

Also the simulations are now all done in Verilator, and iverilog is not currently supported, which solves the "conflicting drivers on the inout ports".

Why change the simulator to avoid the error messages?

For a more detailed analysis/co-simulation on our side, at least the post-synthesis and post-implementation netlists should be simulatable by iverilog. I need its features such as timing annotation.

chili-chips-ba commented 2 months ago

@pu-cc any experience or example you can offer about simulating inout ports and multi-drop, tri-state buses with iverilog?

Tarik's example above simulates fine with commercial tools on EDA Playground and, surprisingly, even with nominally 2-state Verilator.

chili-chips-ba commented 2 months ago

@DadoCCAG please consider adding Verilator to GateMate simulation stack, and advise if you need anything else from Tarik in order to address this issue.

pu-cc commented 2 months ago

any experience or example you can offer about simulating inout ports and multi-drop, tri-state buses with iverilog?

I don't understand that. It works very well, and if you need an example, please take a look at e.g. this testbench with spiflash, whose ports are inout: https://github.com/pu-cc/picorv32/tree/gatemate-demo/picosoc

tarik-ibrahimovic commented 2 months ago

@pu-cc we found the bug in the sim model, and now iverilog runs fine for the RTL and post-synthesis simulation, but it stays at 0 ticks in post-pnr sim, which is likely due to the way the sim models are written. Still I get no error so I don't know what is the exact reason, but from my side, RTL sim and post-synth sim work both with verilator and iverilog now.

chili-chips-ba commented 2 months ago

Is this stuck-at-time-0 post-PNR sim using 0-delay mode? Have you also tried the back-annotated SDF mode?

Does Icarus have an option for unit delay sim? If so, it's also worth trying...

pu-cc commented 2 months ago

@tarik-ibrahimovic this is great news that can help me with debugging. Regarding the stuck post-implementation simulation, that just crossed my mind, have you checked that the OUT_CLK parameter in CC_PLL is not zero? See cpelib.v.

tarik-ibrahimovic commented 2 months ago

Sorry, I forgot to update the tools to the latest version it works fine now. To continue with solving the problem: as with verilator, RTL sim, post-synth sim, and post-pnr sim are matching, and look like the inserted figure below.

Previous problem was: when I change the counter bit width from 6 bits to anything else, I get a non-functioning design. Now with the new tools the design doesn't work in any way I try, shown in the image below.

The post-PnR sim and real chip operation are mismatching.

image

image

chili-chips-ba commented 2 months ago

@tarik-ibrahimovic have you checked tool-reported Fmax against the actual clock applied to the board, per this?

If there are timing-closure issue, a back-annotated PNR sim should reveal them...

pu-cc commented 2 months ago

Thank you @tarik-ibrahimovic! I can also reproduce the behavior here. I'm working on it.

@chili-chips-ba worst corner timing should be at least 60 MHz, according to my logs. So that shouldn't be any problem.

chili-chips-ba commented 1 month ago

Hi @pu-cc, what's the outlook for this bug fix, given that the related issue#33 was closed?