f4pga / f4pga-arch-defs

FOSS architecture definitions of FPGA hardware useful for doing PnR device generation.
https://f4pga.org
ISC License
271 stars 113 forks source link

MMCM primitive generate incorect clock frequencies in hardware #2248

Open WhiteNinjaZ opened 3 years ago

WhiteNinjaZ commented 3 years ago

@acomodi @mkurc-ant @nelsobe I have been running several designs in hardware using the new MMCME2_ADV primitives and have noticed that the timing has been off. I ran a few LED testes with both the PLL and MMCM at what should be 25MHz for both and got the correct results for the PLL but it looks like the MMCM is generating a clock at about 3kHz instead of 25MHz. This is how I instanced both modules:

MMCME2_ADV #(
        .CLKFBOUT_MULT_F(9),
        .CLKIN1_PERIOD(10.0),
        .CLKOUT0_DIVIDE_F(36.0),
        .CLKOUT1_DIVIDE(1),  // included for Veriltator (TMDS clock)
        .DIVCLK_DIVIDE(1)
    ) MMCME2_ADV_inst (
        .CLKIN1(clk),
        .RST(rst),
        .CLKOUT0(clk_pix_unbuf),
        .LOCKED(locked),
        .CLKFBOUT(clk_fb),
        .CLKFBIN(clk_fb),
        /* verilator lint_off PINCONNECTEMPTY */
        .CLKOUT0B(),
        .CLKOUT1(),
        .CLKOUT1B(),
        .CLKOUT2(),
        .CLKOUT2B(),
        .CLKOUT3(),
        .CLKOUT3B(),
        .CLKOUT4(),
        .CLKOUT5(),
        .CLKOUT6(),
        .CLKFBOUTB(),
        .PWRDWN()
        /* verilator lint_on PINCONNECTEMPTY */
    );
PLLE2_ADV #(
        .CLKFBOUT_MULT(9),
        .CLKIN1_PERIOD(10.0),
        .CLKOUT0_DIVIDE(36),
        .CLKOUT1_DIVIDE(1),  // included for Veriltator (TMDS clock)
        .DIVCLK_DIVIDE(1)
    ) PLLE2_ADV_inst (
        .CLKIN1(clk),
        .RST(rst),
        .CLKOUT0(clk_pix_unbuf),
        .LOCKED(locked),
        .CLKFBOUT(clk_fb),
        .CLKFBIN(clk_fb),
        /* verilator lint_off PINCONNECTEMPTY */
        .CLKOUT1(),
        .CLKOUT2(),
        .CLKOUT3(),
        .CLKOUT4(),
        .CLKOUT5()
        /* verilator lint_on PINCONNECTEMPTY */
    );

I found it interesting that the MMCM was off by a factor of about a thousand. It looks like the inputs may be being multiplied by 1000 twice instead of once.

mkurc-ant commented 3 years ago

@WhiteNinjaZ Hmm, weird.

How are you building your designs? Using installed toolchain? Have you tested the MMCM test design(s) built using CMake? Those are under xc/xc7/tests/mmcm.

The MMCM parameters have to be multiplied by some large number and converted to integers because Yosys does not support math operations on real number parameters (it treats them as strings) and some calculations are required to obtain MMCM register values for a given multiplier / divider. But from the design entry point of view you should be able to provide real numbers just like for Vivado

WhiteNinjaZ commented 3 years ago

@mkurc-ant I am using the installed toolchain from symbiflow-examples updated to the latest version. I ran the main mmcm_int_basys3_bottom.v test on hardware and everything seemed to work fine (the clocks generated the correct frequencies). If it will help here are my full design files:

`default_nettype none
`timescale 1ns / 1ps

module top (
    input  wire logic clk_100m,     
    input  wire logic btn_rst,     
    output      logic [2:0] led
    );

    logic clk_fb_mmcm, clk_fb_pll;         
    logic clk_out_mmcm, clk_out_pll;        
    logic locked_mmcm, locked_pll;     
    logic clk_buff_mmcm, clk_buff_pll;    

    MMCME2_ADV #(
        .CLKFBOUT_MULT_F(9),
        .CLKIN1_PERIOD(10.0),
        .CLKOUT0_DIVIDE_F(36),
        .CLKOUT1_DIVIDE(1),  
        .DIVCLK_DIVIDE(1)
    ) MMCME2_ADV_inst (
        .CLKIN1(clk_100m),
        .RST(btn_rst),
        .CLKOUT0(clk_out_mmcm),
        .LOCKED(locked_mmcm),
        .CLKFBOUT(clk_fb_mmcm),
        .CLKFBIN(clk_fb_mmcm),
        .CLKOUT0B(),
        .CLKOUT1(),
        .CLKOUT1B(),
        .CLKOUT2(),
        .CLKOUT2B(),
        .CLKOUT3(),
        .CLKOUT3B(),
        .CLKOUT4(),
        .CLKOUT5(),
        .CLKOUT6(),
        .CLKFBOUTB(),
        .PWRDWN()
    );  

 PLLE2_ADV #(
        .CLKFBOUT_MULT(9),
        .CLKIN1_PERIOD(10.0),
        .CLKOUT0_DIVIDE(36),
        .CLKOUT1_DIVIDE(1),  
        .DIVCLK_DIVIDE(1)
    ) PLLE2_ADV_inst (
        .CLKIN1(clk_100m),
        .RST(btn_rst),
        .CLKOUT0(clk_out_pll),
        .LOCKED(locked_pll),
        .CLKFBOUT(clk_fb_pll),
        .CLKFBIN(clk_fb_pll),
        .CLKOUT1(),
        .CLKOUT2(),
        .CLKOUT3(),
        .CLKOUT4(),
        .CLKOUT5()
    );

    BUFG bufg_clk_pll(.I(clk_out_pll), .O(clk_buff_pll));
    BUFG bufg_clk_mmcm(.I(clk_out_mmcm), .O(clk_buff_mmcm));

    clock_tester #(100000000) T0 (clk_100m, led[0]);
    clock_tester #(25000000) T1 (clk_buff_pll, led[1]);
    clock_tester #(25000000) T2 (clk_buff_mmcm, led[2]);

endmodule
module clock_tester # (COUNT = 25000000)(
    input  wire logic clk,       
    output      logic led   
    );

    logic [31:0] clk_count = 0; 

    always_ff @(posedge clk) begin
        if (clk_count == COUNT)begin
            clk_count <= 0;

            if(led == 0) begin
                led <= 1;
            end
            else begin
                led <= 0;
            end
        end
        else begin
            clk_count <= clk_count + 1;
        end
    end
endmodule

This design works by blinking an LED on for 1 second and then off for 1 second if the count parameter in clock_tester is equal to the number of cycles/sec of the clock input. LED 1 is connect to the base system clock, LED 2 is connected to the PLL output and LED 3 is connected to the MMCM output. Both the MMCM and PLL should operate at 25MHz. I tried setting the count = 25MHz for both the MMCM and PLL at first and the PLL was blinking once per second but the MMCM was not. I messed around with the numbers a bit and 3.3KHz is very close to the correct number for all three LEDs to blink once per second.

My constraints can also be found bellow:

## FPGA Configuration I/O Options
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property CFGBVS VCCO [current_design]

## Board Clock: 100 MHz
set_property -dict {PACKAGE_PIN W5 IOSTANDARD LVCMOS33} [get_ports {clk_100m}];
create_clock -period 10.00 [get_ports {clk_100m}];

## Btns
set_property -dict {PACKAGE_PIN U18 IOSTANDARD LVCMOS33} [get_ports {btn_rst}];

 set_property -dict { PACKAGE_PIN U16   IOSTANDARD LVCMOS33 } [get_ports { led[0] }]; 
 set_property -dict { PACKAGE_PIN E19   IOSTANDARD LVCMOS33 } [get_ports { led[1] }]; 
 set_property -dict { PACKAGE_PIN U19   IOSTANDARD LVCMOS33 } [get_ports { led[2] }];
WhiteNinjaZ commented 3 years ago

@mkurc-ant In Vivado both the MMCM and PLL operate at 25MHz. It is only in symbiflow that the clocks generate strange values.

mkurc-ant commented 3 years ago

@WhiteNinjaZ To debug this you may run the "diff FASM" test. This will use SymbiFlow to generate a bitstream, then disassemble it and convert back to netlist via fasm2bels followed by importing it to Vivado. Next, Vivado will be used to generate a bitstream and finally the two bitstreams will be disassembled (to FASM) and compared. This should show any problems with MMCM multiplier / divider settings.

To do that you need to add a test to CMake tests in symbiflow-arch-defs (see how others are defined there). The target that runs the diff FASM test is named *_vivado_diff_fasm.

mithro commented 3 years ago

@mkurc-ant / @acomodi - I think should probably enable "diff FASM testing" by end users (outside of symbiflow-arch-defs) in the future?

WhiteNinjaZ commented 3 years ago

@mkurc-ant I have conducted two tests: Using vivado to generate a bitstream->use bit2fasm to generate a fasm file->use fasm2bells to generate a reverse verilog file and also doing the same thing on symbiflows side symbiflow->bit->bit2fasm->fasm2bells->reverse.v. The reverse verilog file generated using vivados bitstream works properly so the issue is likely not in the database. I have included both fasm files in this comment as well as the reverse verilog files. symbiflow-fasm.txt vivado-fasm.txt symbiflow_reversed_verilog.txt vivado-reversed-verilog.txt

mkurc-ant commented 3 years ago

@WhiteNinjaZ Plase take look at the MMCM tests which are in symbiflow-arch-defs: https://github.com/SymbiFlow/symbiflow-arch-defs/tree/master/xc/xc7/tests/mmcm. You can add you test alongside them. This will allow you to run the diff FASM test.

GitHub
symbiflow-arch-defs/xc/xc7/tests/mmcm at master · SymbiFlow/symbiflow-arch-defs
FOSS architecture definitions of FPGA hardware useful for doing PnR device generation. - symbiflow-arch-defs/xc/xc7/tests/mmcm at master · SymbiFlow/symbiflow-arch-defs
WhiteNinjaZ commented 3 years ago

@mkurc-ant I have added my test into CI and made a draft PR with the test (see #2293). I figured this would be the best way to run the diff fasm test but let me know if you would prefer me to run the test on my own machine.

nelsobe commented 2 years ago

We have been experimenting to try to understand why some MMCM designs work with Symbiflow and some don't (when all the designs we have tried do work with Vivado).

We have learned that if you include .COMPENSATION ("INTERNAL") or .COMPENSATION ("EXTERNAL") on the MMCME2_ADV primitive when you instantiate it, it works. When you don't, it does not work.

However, the Xilinx documentation https://www.xilinx.com/support/documentation/user_guides/ug472_7Series_Clocking.pdf seems very clear that users are not supposed to set the COMPENSATION parameter - that the tools will do it for you. This suggests that Symbiflow is not setting it correctly.

Looking at https://github.com/SymbiFlow/symbiflow-arch-defs/pull/1729#issue-730282907 it would seem that this may be related to the comment there of "Now that the basic MMCM features and pips are in, maybe it is time to go back and get the compension features working?" from Oct 27, 2020?

We also see from that issue that, at one time, there were missing bitstream bits related to the MMCM. Are there still missing bits?

@mkurc-ant - can you confirm whether there is support for compensation and whether you believe it is working at this point in time? And, are there others who should be brought into this issue?

Here is code for an MMEM that does work, but only with the compensation parameter (and we note that setting it to either EXTERNAL or INTERNAL makes it work).

MMCME2_ADV #( .COMPENSATION ("EXTERNAL"), .CLKFBOUT_MULT_F(9), .CLKIN1_PERIOD(10.0), .CLKOUT0_DIVIDE_F(36), .CLKOUT1_DIVIDE(1), .DIVCLK_DIVIDE(1) ) MMCME2_ADV_inst ( .CLKIN1(clk), .RST(0), .CLKOUT0(clk_out_mmcm), .LOCKED(locked_mmcm), .CLKFBOUT(clk_fb_mmcm), .CLKFBIN(clk_fb_mmcm), .CLKOUT0B(), .CLKOUT1(), .CLKOUT1B(), .CLKOUT2(), .CLKOUT2B(), .CLKOUT3(), .CLKOUT3B(), .CLKOUT4(), .CLKOUT5(), .CLKOUT6(), .CLKFBOUTB(), .PWRDWN() );

nelsobe commented 2 years ago

@mkurc-ant : In looking through the database/artix7 bitstream files I see a number of things:

  1. For MMCME2_ADV I see that COMP.ZHOLD and COMP.Z_ZHOLD seem to be the only bits defined regarding MMCM compensation. Is the COMP property controlled by bits for each of its values or is EXTERNAL or INTERNAL accomplished some other way?

  2. For PLL's I see COMP as well as COMPENSATION properties but once again they all only relate to ZHOLD and Z_ZHOLD.

mkurc-ant commented 2 years ago

@nelsobe From what I remember when working on MMCM and PLL there are no bits that relate one-to-one to the COMPENSATION parameter. The only influence of the parameter on bitstream is when you select ZHOLD and something else than that. That's why there are only the two FASM features COMP.ZHOLD and COMP.Z_ZHOLD.

Currently Symbiflow does not follow the rule mentioned in the Xilinx doc and it does not automatically set the compensation parameter basing on the design / other MMCM configuration settings. Maybe the solution would be just to make eg. INTERNAL a default?

As for the missing bits: that is still an open question. If you could share the design which causes them to be exposed we could work towards identifying and documenting them.

nelsobe commented 2 years ago

@mkurc-ant I am not aware of any bits that are missing. I was just reacting to comments in [https://github.com/SymbiFlow/symbiflow-arch-defs/pull/1729#issue-730282907] which referred to unknown bits. But I can see that those bits are now associated with COMP.Z_ZHOLD. But, are there other bits in that are unknown? How would one find out?

To summarize:

  1. It would seem that the default value of COMP.ZHOLD which is set in the tech mapper works for some designs but not for others. Setting COMPENSATION to either INTERNAL or EXTERNAL seems to work in the designs we have tried (we set it in the Verilog source for our design).
  2. For designs with no COMPENSATION parameter in the Verilog code, examining the FASM files for the Xilinx bitstream vs. the Symbiflow bitstream shows that the Symbiflow bitstream has COMP.Z_ZHOLD=0 and the Vivado bitstream has COMP.Z_ZHOLD=1. So, it would seem Symbiflow's default behavior is different from Xilinx's.
nelsobe commented 2 years ago

@mkurc-ant What are the ramifications you see from modifying the techmapper to have INTERNAL the default COMPENSATION value as you seem to have been recommending it in the comment above?

mkurc-ant commented 2 years ago

@nelsobe I'd say that making a default value for a parameter is not a big deal on its own. Testing the MMCM behavior in multiple scenarios may require a little more effort. And this should involve testing on real hardware.

As for any unknown bits you can't know for sure that there are any unless you take a design, disassemble its bitstream to FASM and see them reported by the disassembly tool.