Closed sy2002 closed 4 years ago
@MJoergen As "promised" in the pull request's comment: Here is the ISE vs. Vivado riddle: (Github Link:) https://github.com/sy2002/QNICE-FPGA/issues/4
Interesting. Has this ever worked on ISE previously ? If not, does anything related to this module work? E.g. I see you have commented-out code returning fixed constants, does that work ?
I don't have an immediate answer, so this will require some investigation. My experience with ISE is very limited, but I'm sure there is a reasonable explanation hiding somewhere.
EDIT: It's interesting that apparently all reads return FFFC. That could (maybe) be a clue.
Thank you for your fast feedback and your questions.
Here are the answers:
In general, QNICE in the develop branch synthesized with ISE works fine: VGA, SD Card, Extended Arithmetic Element (EAE), keyboard, UART, ... ...everything seems to be OK
Returning fixed constants also did not work, i.e. it returned FFFC (perfectly reproducable: it is always FFFC)
Inspired by your question "has this ever worked" I did the following experiments
So I checked out the master branch.
The difference between V1.5 (master) and the develop branch is: In past, we only had a Cycle-Counter and now we additionally also wanted an Instruction-Counter. So Between master and develop I refactored the counter: New version (branch develop):
https://github.com/sy2002/QNICE-FPGA/blob/develop/vhdl/cycle_counter.vhd
Old version (branch master):
https://github.com/sy2002/QNICE-FPGA/blob/master/vhdl/cycle_counter.vhd
So for answering your question
Has this ever worked on ISE previously ?
I checked out the old master branch, synthesized with ISE and yes, it works, because I got this result:
Amount of cycles used: 000000000041
Amount of instructions performed: FFFFFFFFFFFF
This is correct, because the old version on branch master did not have yet an Instruction counter, but only a cycle counter.
So the answer to your question is: The old VHDL code worked in the old master branch on ISE. Yes.
So I went back to the develop branch for Experiment 2 and Experiment 3: Both use the new code.
I now wondered: OK, what if I commented out the instruction counter but still leave the cycle counter in, and use the new code on develop.
-- cycle counter
cyc : cycle_counter
port map (
clk => SLOW_CLOCK,
impulse => SLOW_CLOCK,
reset => reset_ctl,
en => cyc_en,
we => cyc_we,
reg => cyc_reg,
data => cpu_data
);
-- -- instruction counter
-- ins : cycle_counter
-- port map (
-- clk => SLOW_CLOCK,
-- impulse => cpu_ins_cnt_strobe,
-- reset => reset_ctl,
-- en => ins_en,
-- we => ins_we,
-- reg => ins_reg,
-- data => cpu_data
-- );
Result: WOOOW! The cycle counter works and the output is identical to experiment 1
Well, just like Experiment 2 but the other way round: Comment out the cycle counter and leave the instruction counter in:
Works like a charm and as expected:
Amount of cycles used: FFFFFFFFFFFF
Amount of instructions performed: 000000000012
(note that in experiment 2 and 3 it is FFFFFFFFFFFF and not FFFCFFFCFFFC)
While on Vivado, everything works as expected, on ISE it only works, if I either have commented-out the cycle counter (then the instruction counter works perfectly) or if I have commented-out the instruction counter (then the cycle counter works perfectly). :dizzy_face:
Thank you for your detailed experiments. I tried a few experiments myself (with ISE), including this one:
In the file vhdl/hw/nexys4ddr/env1.vhd replace
en => cyc_en,
we => cyc_we,
with
en => '0',
we => '0',
This should only affect the cycle counter, and the instruction counter should return 12. However, this test fails too, and both values are incorrect.
It is my preliminary conclusion (based on your and my analysis; including that the design works on Vivado) that this is a bug in ISE. My hypothesis is that ISE (under certain conditions) is incorrectly modelling the tristate registers.
I don't have much experience using tristate registers in a design (I didn't even know it was possible!), but looking at the schematic of the generated netlist it seems that ISE in general replaces all tristate buffers by ordinary digital logic. This is to be expected because the FPGA itself does not contain registers with tristate output. And apparently, in some cases ISE makes a mistake in the generated logic. I have not been able to specifically pinpoint exactly the error, so for now this is just a hypothesis.
ISE 14.7 does have a few known issues (see here: https://www.xilinx.com/support/answers/47717.html), but none that I can relate to this problem.
I have tried to find a workaround, i.e. any reasonably small change that could make the code work on ISE, but so far found none.
One option I have not yet tried is to remove all uses of tristate registers in the design. In particular, this involves replacing the DATA signal on the CPU with two separate signals DATA_IN and DATA_OUT. Most of the peripheral modules need updating as well, so this change will affect a lot of files in the design (albeit with fairly small changes). I would expect this change to work, since ISE would no longer have to deal with any tristate buffers.
If this last option is seen to work, is this a way forward, do you think? Or will it interfere with the other modifications you have planned?
Thank you for this very thorough analysis.
I need to admit, that I feel strongly emotionally attached to this nice and (in my eyes) extremely elegant DATA
bus. Just as you described it in another comment: When watching the schematics, everything looks like a "real computer on a breadboard" and all the peripheral devices "just connect to this bus". This fits to the philosophy of QNICE - being QUITE NICE ;-)
Look at the think blue line, how the data bus connects everything nicely :-)
Having a separate DATA_IN and DATA_OUT is something that I would need to get used to.
But if you say, that this is the more "FPGA-ish" way to do things, I guess I can first get used to it (while still being in grief) and then after a while be able to live with it and even later, forget the grief and be happy with it again. :sweat_smile:
@bernd-ulmann What do you think?
@both of you: Right now, I am improving the CPU in the branch dev-int
while we are developing QNICE's interrupt system. So at this very moment, this change is not yet a good idea for code-merging reasons.
I suggest, that I do some final (desperate) experiments by myself, i.e. ask google, play with the synthesis settings of ISE, etc. and if nothing helps, then (after we merged the branch dev-int
back to develop
), I come back to you Michael: You could then either explain me how to do it, or just do it and change everything to DATA_IN and DATA_OUT.
I like the idea of the central bi-direction data bus but that is mostly due to the fact that I grew up tinkering with TTL logic. :-) I have to admit that separate I/O-busses would be more efficient and, I think, more FPGA-like.
The current design looks very nice in a graphical representation but if there is a better way to do it we should go that route! :-)
There is one last (desperate?) option, and that is to abandon ISE! Seeing that ISE is no longer supported by Xilinx, and that Vivado is (in my opinion) a much better tool, I don't quite see the need for supporting ISE.
There is another added benifit of abandoning ISE and that is to allow using VHDL 2008 (since this is supported by Vivado, but not by ISE). The new version of VHDL allows for some code simplification, e.g. using fewer casts, and in some cases fewer signals.
So by abandoning ISE we get to keep the bidirectional data bus and allow a more efficient coding style using VHDL 2008. Is that an option ?
That sounds reasonable to me. :-)
At this point I do not want to abandom ISE. So in version V1.7 doing the "Tristate Refactor" as described in issue #30 might be the solution for this. If not, then we still can abandom ISE.
Indeed the Tristate refactoring solved this 🎉 🚀 😄
I tested the ISE bitstream using test_programs/ise.asm
at first and then my heavy duty multi-tasking smoke test: PASSED.
Closing the issue now.
Description of the Problem
I added a hardware instruction counter and a hardware cycle counter to QNICE-FPGA, so that we can do performance tests. Here is a simple and minimal test program:
https://github.com/sy2002/QNICE-FPGA/blob/develop/test_programs/ise.asm
What happens:
When synthesizing with Vivado, the correct values are being output from the test program. When synthesizing using ISE, complete rubbish is being outout.
Here is the hardware:
https://github.com/sy2002/QNICE-FPGA/blob/develop/vhdl/cycle_counter.vhd
Reproduction Instruction
For being on the safe side: Clone a fresh QNICE-FPGA repo, switch to the branch develop (git checkout develop). (Not to be mixed up with dev-int. It needs to be develop)
Compile the toolchain: Run
tools/make-toolchain.sh
Compile the monitor: Run
monitor/compile_and_distribute.sh
Compile the PORE ROM: Run
pore/compile_pore.sh
Go to
test_programs
and run../assembler/asm ise.asm
. You will need the resulting.out
file in the following experiment.Vivado:
Use
hw/xilinx/nexys4ddr/Vivado/qnice_nexys.xpr
to synthesize a bitstream: All good, incl. timing closure.Start it on hardware
Run the above-mentioned ise.out on hardware
Everything will work like a charm and you will see a hexadecimal output like this:
ISE:
Use
hw/xilinx/nexys4ddr/ISE/env1.xise
to synthesize a bitstream: All will work fine, too. Incl. timing closure.Start it on hardware
Run the above-mentioned ise.out on hardware
WOW. Nothing works. You will see an output like
What I already tried
Change the speed grade of the Nexys in ISE
Cut the clk speed (50 MHz) in half (25 MHz)
Read multiple times from the same register to check, if there is a timing problem (doesn't seem to be)
Changed (just for testing purposes and reverted back immediatelly) the register bank to synchroneous BRAM. This is in general not a good idea, because then the cpu_test.asm fails and the whole architecture of the CPU would need to be changed, but I just wanted to see, if this might be the reason.