YosysHQ / nextpnr

nextpnr portable FPGA place and route tool
ISC License
1.31k stars 243 forks source link

Non-semantic changes cause chaotic behavior! (with inout/high-z pins) #282

Closed tux3 closed 2 years ago

tux3 commented 5 years ago

High-z pins when targeting my ice40 seem to behave chaotically, in a particularly hard to pin down manner.
First let me clearly demonstrate the symptoms, and then some more context.

1. "What do you mean, chaos?"

This example sets an io pin to high-impedance under some condition, uses it to read a bit, and we turn on a LED depending on the result. Here the LED will stay off when we read from this pin (we read a 0):

assign io2 = (!cs && COND_1 && COND_2) ? sendbuf[2] : 1'bz;

But the same code refactored with a wire reads a 1! Now the LED will turn on:

wire COND_1_AND_2 = COND_1 && COND_2;
assign io2 = (!cs && COND_1_AND_2) ? sendbuf[2] : 1'bz;

That introducing a variable could cause visible side-effects seems wrong, even for Verilog! I'm assuming this is a bug in nextpnr (or maybe yosys)?

2. Self-contained repro

I tried to reduce the repro code to something small and self-contained, what's left is some cheaply made SPI code that quad-reads a fixed address from the BX's flash chip and lights the LED.

3. Some more context

This code was originally meant to be a simple (if not very good) homemade QSPI module. I have a testbench that works great in AVHDL, but I noticed that on real metal the second bit of every nibble I read is always 0.

So the original symptom is that some inout pin of my QSPI (WP_B in this case) seem to get stuck low when it should be floating (this was especially surprising to me since that pin is pulled-high on both side when left unconfigured!). My other 3 high-z pins seem to work perfectly fine in this case and I read correct data from them.

Some notes:

Sorry for the mountain of text, this bug has been driving me crazy! I hope this makes sense and I didn't just make some obvious mistake, but if something's not clear I'm more than happy to help :grin:

smunaut commented 5 years ago

Could you provide at least a working and a failing build ? (complete with the log output and all outputs of synthesis / pnr).

tux3 commented 5 years ago

Sure, here's the blif/json/asc and the log output for both. The build that reads 0 is definitely failing since the real value it should read is 1. As for the other one, I can't say that it's "working" but it happens to read the right value, even though the code should be identical.

Build where the pin reads 0: build0.zip Build where the pin reads 1: build1.zip

Is this what you were looking for?

smunaut commented 5 years ago

I'm not really sure I understand what you see but there is plenty of things wrong that should be fixed before even trying to dig further :

tux3 commented 5 years ago

Yes, the code is probably terrible, sorry about that. I'm going to fix it. But about the example above, should I expect that because my code is wrong, adding a wire can change behavior? Is that some sort of 'undefined behavior' in my code, or should I consider it a bug in the toolchain?

Either way I'll try with SB_IO, you're right that it's probably the first thing I should try.

smunaut commented 5 years ago

You said that just changing the --freq argument of nextpnr can make it work / not-work right ? Using the exact same .json as input ?

So that would point to just timing / placement affecting the result.

So that probably means that changing the wire just randomly shuffles things in non-determinic way, just like changing the timing budget.

smunaut commented 5 years ago

Also did you try to simulate your code first ? It doesn't even build under iverilog :/

tux3 commented 5 years ago

So that probably means that changing the wire just randomly shuffles things in non-determinic way, just like changing the timing budget.

Right. I wish I could give you a better bug report, but I'm having a hard time finding the root cause because of this.

Also did you try to simulate your code first ? It doesn't even build under iverilog :/

All I have really is testbenches in Active-HDL, my asserts pass, the waveform looks fine (and deterministic...). The code I posted is just a reduced test-case, I think I even left some wire undefined when removing the rest of the code...

tux3 commented 5 years ago

Alright it does work just fine with SB_IO, so it seems the tristate warning is really the issue behind all of this. I'd like to dig down further and try to fix it if I get the time. I still haven't figured out if the problem comes from Yosys or nextpnr, but at least I think I know enough to make some small test cases instead of debugging my bad SPI code now.

Thank you very much for your help. I suppose it's a known issue and I should have paid more attention to the warning, so feel free to close this if you want.