openXC7 / nextpnr-xilinx

Experimental flows using nextpnr for Xilinx devices
ISC License
36 stars 10 forks source link

RAM32M/RAM64M not initialized with INIT_A/INIT_B/INIT_C/INIT_D parameters #28

Closed hansemro closed 2 months ago

hansemro commented 2 months ago

Issue Description

As discovered in https://github.com/openXC7/nextpnr-xilinx/issues/20#issuecomment-1938392685, nextpnr packer was checking the wrong INIT parameter without the underscore and produce fasm result with INIT parameters unset.

Additionally, in the following post in https://github.com/openXC7/nextpnr-xilinx/issues/20#issuecomment-1941347241, I became more concerned after finding INIT pattern discrepancies between Vivado and NextPNR fasm results.

Indeed, with the recently merged LUTRAM over JTAG Primitive Test, it becomes more clear that my initial fix (https://github.com/hansemro/nextpnr-xilinx/tree/fix-ram32m-ram64m-init) would have functional errors due to invalid INIT patterns.

Affects OpenXC7 toolchain 0.8.2 and older.

Steps to Reproduce

[Affects RAM32M+RAM64M] Missing INIT underscore:

  1. Clone primitve-tests repo:
git clone https://github.com/openXC7/primitive-tests.git
  1. Update target parameters in Makefile (if not using SQRL Acorn CLE215(+)).

  2. Build LUTRAM over JTAG test with RAM32M using openXC7 toolchain:

cd primitive-tests/lutram-tests/jtag-test
nix develop github:openXC7/toolchain-nix
make LUTRAM=RAM32M top.pack.json
  1. Observe missing INIT parameters for transformed be.ram32m/* cells in top.pack.json.

[Affects RAM32M only] Invalid INIT patterns after correcting underscore typo

  1. Checkout fix-ram32m-ram64m-init-0.8.2 branch in my nextpnr-xilinx fork:
git clone --recurse-submodules -b fix-ram32m-ram64m-init-0.8.2 https://github.com/hansmero/nextpnr-xilinx.git
  1. Build nextpnr-xilinx:
cd nextpnr-xilinx
mkdir build
cd build
cmake ..
make -j$(nproc)
  1. Update environment variables:
export PATH=$PWD:$PATH
export NEXTPNR_XILINX_DIR=$PWD
export NEXTPNR_XILINX_PYTHON_DIR=$PWD/../xilinx/python
  1. Rebuild test with openXC7 toolchain + fix-ram32m-ram64m-init-0.8.2 branch of nextpnr-xilinx
cd /path/to/primitive-tests/lutram-tests/jtag-test
make LUTRAM=RAM32M clean top.bit
  1. Program the board
make [JTAG_LINK=...] program
  1. Run the following openocd test commands:
openocd -f interface/ADAPTER.cfg \
    -f ./setup.cfg \
    -c "read_lutram_range 0x0 32"
    -c "shutdown"
  1. Compare openocd output against Vivado's RAM32M results in the following:
0xf
0x2d
0x5a
0x2d
0xa5
0x2d
0xf0
0x2d
0xf
0x78
0x5a
0x78
0xa5
0x78
0xf0
0x78
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0

[REFERENCE] Building JTAG over LUTRAM Using Vivado Toolchain

  1. Create build template build-vivado.tcl inside primitive-tests/lutram-tests/jtag-test/ with the following:
# VIVADO TEMPLATE SCRIPT
# Do not run this script directly.
# Run build-vivado.sh with required arguments instead.

create_project -in_memory -part __PART__

set_property default_lib xil_defaultlib [current_project]
set_property target_language Verilog [current_project]

read_verilog -library xil_defaultlib ../top.v
read_xdc top.xdc

synth_design -top top \
    -part __PART__ \
    -verilog_define VIVADO \
    -verilog_define TEST___LUTRAM__
#write_checkpoint -force top.synth.dcp
place_design
#write_checkpoint -force top.place.dcp
phys_opt_design
route_design
#write_checkpoint -force top.route.dcp
write_bitstream -force top.vivado.bit
  1. Create bash script build-vivado.sh inside primitive-tests/lutram-tests/jtag-test/ with the following:
#!/bin/bash

if [ -z $1 ]; then
    echo "Missing Arg1:FAMILY (e.g. kintex7)"
    exit 1
fi

if [ -z $2 ]; then
    echo "Missing Arg2:PART (e.g. xc7k325tffg900-2)"
    exit 1
fi

if [ -z $3 ]; then
    echo "Missing Arg3:BOARD (e.g. kc705)"
    exit 1
fi

if [ -z $4 ]; then
    echo "Missing Arg4:LUTRAM (e.g. RAM64X1D)"
    exit 1
fi

set -euo pipefail

FAMILY="$1"
PART="$2"
BOARD="$3"
LUTRAM="$4"

mkdir -p "${BOARD}_${PART}_${LUTRAM}"
cd "${BOARD}_${PART}_${LUTRAM}"
sed "s/__PART__/${PART}/g" ../build-vivado.tcl > build-vivado.tcl
sed -i "s/__LUTRAM__/${LUTRAM}/g" build-vivado.tcl
test -f "../${BOARD}.xdc" && cp "../${BOARD}.xdc" top.xdc || touch top.xdc
vivado -mode batch -source build-vivado.tcl
bit2fasm --db-root $PRJXRAY_DB_DIR/${FAMILY}/ --part ${PART} top.vivado.bit > top.vivado.fasm
cd ..
  1. Source Vivado (2017.2) environment and generate bitstream with build-vivado.sh:
source /path/to/Xilinx/Vivado/2017.2/settings64.sh
chmod +x ./build-vivado.sh
./build-vivado.sh <FAMILY> <PART> <BOARD> <LUTRAM>

# Example
./build-vivado.sh artix7 xc7a200tfbg484-3 empty RAM32M
  1. Program <BOARD>_<PART>_<LUTRAM>/top.vivado.bit to the board.
hansemro commented 2 months ago

Reverted https://github.com/hansemro/nextpnr-xilinx/commit/dc23177d505fbb2f2e0cf70d560f49f252dd17d2 as I mistakenly used .to_string() which reverses the bit ordering of Property variables.

https://github.com/hansemro/nextpnr-xilinx/commit/262ae4505babcea1062526b11df6fd70abe2b0db reverts back to using .as_bits() function and pushes Property::State:S1/S0 characters instead. This gets us closer to the original INIT pattern as shown below:

0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x78
0xf0
0x78
0xa5
0x78
0x5a
0x78
0xf
0x2d
0xf0
0x2d
0xa5
0x2d
0x5a
0x2d
0xf

Prior (with https://github.com/hansemro/nextpnr-xilinx/commit/dc23177d505fbb2f2e0cf70d560f49f252dd17d2), it looked like the following:

0xf
0x1e
0xa5
0x1e
0x5a
0x1e
0xf0
0x1e
0xf
0xb4
0xa5
0xb4
0x5a
0xb4
0xf0
0xb4
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
hansemro commented 2 months ago

Pushed a fix for above (https://github.com/hansemro/nextpnr-xilinx/commit/7f459c9a18ee109229377c97fd95891b5a696f6c), where I chose to interleave bits in reverse.

hansfbaier commented 2 months ago

Hmm, I get this:

openocd -f digilent-hs2.cfg  -f ./setup.cfg -c "read_lutram_range 0x0 32"  -c "shutdown"
Open On-Chip Debugger 0.11.0+dev-02453-g9c3a4b458 (2023-01-28-14:14)
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "jtag". To override use 'transport select <transport>'.
Info : clock speed 1 kHz
Info : JTAG tap: xc7.tap tap/device found: 0x0362d093 (mfg: 0x049 (Xilinx), part: 0x362d, ver: 0x0)
Warn : gdb services need one or more targets defined
openocd.cfg loaded
0x3
0x0
0x0
0x3
0x3
0x0
0x0
0x0
0x0
0x3
0x0
0x0
0x3
0x3
0x0
0x0
0x0
0x0
0x3
0x0
0x3
0x0
0x3
0x0
0x0
0x3
0x3
0x0
0x3
0x3
0x3
0x0

I pushed my config files to the prmitives-tests repo. I use a generic FT232H adapter, but it should have the same config as the digilent HS2.

hansemro commented 2 months ago

@hansfbaier Thanks for testing! Make sure to build the test with LUTRAM=RAM32M option (it looks like it is building with RAM32X1D default). If it is already being built, please share the nextpnr json build artifacts.

hansfbaier commented 2 months ago

Now I get:

 openocd -f digilent-hs2.cfg  -f ./setup.cfg -c "read_lutram_range 0x0 32"  -c "shutdown"
Open On-Chip Debugger 0.11.0+dev-02453-g9c3a4b458 (2023-01-28-14:14)
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "jtag". To override use 'transport select <transport>'.
Info : clock speed 1 kHz
Info : JTAG tap: xc7.tap tap/device found: 0x0362d093 (mfg: 0x049 (Xilinx), part: 0x362d, ver: 0x0)
Warn : gdb services need one or more targets defined
openocd.cfg loaded
0xf
0x2d
0x5a
0x2d
0xa5
0x2d
0xf0
0x2d
0xf
0x78
0x5a
0x78
0xa5
0x78
0xf0
0x78
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0

I use the hansemro/fix-ram32m-ram64m-init-v2 branch for nextpnr-xilinx json.zip

hansemro commented 2 months ago

Great! It matches Vivado results (feel free to confirm this yourself).

hansfbaier commented 2 months ago

I did not build the vivado version yet.

hansfbaier commented 2 months ago

Shouldn't it show this?

256'hDEADBEEF_0150BAD0_CAFEF00D_0F0FFFFF_01234567_89ABCDEF_FEDCBA98_76543210

I still don't understand most of the code. Had very little time with it yet.

hansfbaier commented 2 months ago

Really cool! Looks like it works!

> read_lutram 0x0
0xf
> read_lutram 0x0
0xf
> write_lutram 0x0 0xff
1
> read_lutram 0x0      
0xff
> write_lutram 0x0 0xee
1
> read_lutram 0x0      
0xee
> write_lutram 0x0 0xaa
1
> read_lutram 0x0      
0xaa
> write_lutram 0x0 0x55
1
> read_lutram 0x0      
0x55
> write_lutram 0x0 0x0 
1
> read_lutram 0x0     
0x0
hansfbaier commented 2 months ago

@hansemro In RAM64M only one nibble seems to work, or am I doing something wrong?

> read_lutram 0x0
0x6
> write_lutram 0x0 0x0
1
> read_lutram 0x0     
0x0
> write_lutram 0x0 0xff
1
> read_lutram 0x0      
0xf
> write_lutram 0x0 0xee
1
> read_lutram 0x0      
0xe
> write_lutram 0x0 0xaa
1
> read_lutram 0x0      
0xa
> write_lutram 0x0 0x55
1
> read_lutram 0x0      
0x5
> 
hansemro commented 2 months ago

In RAM64M only one nibble seems to work

This is expected. RAM64M only has 4 data ports whereas RAM32M has 8.

hansfbaier commented 2 months ago

Ah I understand, great, thanks!

hansemro commented 2 months ago

Shouldn't it show this?

256'hDEADBEEF_0150BAD0_CAFEF00D_0F0FFFFF_01234567_89ABCDEF_FEDCBA98_76543210

I still don't understand most of the code. Had very little time with it yet.

Thanks for asking since I did find some errors while trying to look into unshuffling the values from openocd. In short, INIT_{A:D} parameters should be 64-bits, and I flipped parameter assignments (INIT_D<->INIT_A; INIT_C<->INIT_B).

Because I did mess up how INIT patterns were assigned, let me share first how we can readback the original lower 128-bits of the INIT using RAM32M.

Focusing on ports and parameters with the _D suffix in https://github.com/openXC7/primitive-tests/blob/07430a895ae9b5b804acbf4e0cd060e33662be57/lutram-tests/jtag-test/lutram_inst.vh#L270-L291, we can see that DOD[1:0] maps to lutram_do[1:0] and that INIT_D maps to INIT[127:96]=32'h01234567=32'b00_00_00_01_00_10_00_11_01_00_01_01_01_10_01_11 [*]. So let's try reading this back first.

Reading address 5'h0, lutram_do[1:0] fetches INIT[97:96]=2'b11. Reading address 5'h1, lutram_do[1:0] fetches INIT[99:98]=2'b01. and so on until address 5'hF, where lutram_do[1:0] fetches INIT[127:126]=2'b00.

lutram_do[0] reads back every even bits of INIT_D (starting at bit 0) and lutram_do[1] reads back every odd bits (starting at bit 1)

The same principle applies to memory groups A, B, and C.

[*] Two mistakes made here: (1) INIT_{A:D} are 64-bits and not 32-bits; (2) Flipped parameter assignments: INIT_D swapped with INIT_A and INIT_C swapped with INIT_B.

Reading INIT_D in OpenOCD

source ./setup.cfg

# Read and RETURN data from LUTRAM at given address
# Arg:
#  - address: 8-bit address
proc read_lutram_v2 {address} {
    global user3_instr
    global user3_width
    global user4_instr
    global user4_width
    set address [expr {$address & 0xff}]
    set instr [expr {$address << 2}]
    # Send instruction while USER3 is loaded
    irscan xc7.tap $user3_instr
    drscan xc7.tap $user3_width $instr
    # Read reply while USER4 is loaded
    irscan xc7.tap $user4_instr
    set ret [string cat 0x [drscan xc7.tap $user4_width 0]]
    set ret [expr {$ret >> 8}]
    return $ret
}

# Read INIT_D
for {set x 0} {$x < 32} {set x [expr {$x + 1}]} {
    set ret [read_lutram_v2 $x]
    set d1 [expr {($ret & 0x1)}]
    set d2 [expr {($ret & 0x2) >> 1}]
    # print even bits of INIT_D pattern
    echo [format 0x%x $d1]
    # print odd bits of INIT_D pattern
    echo [format 0x%x $d2]
}

Openocd output:

0x1
0x1
0x1
0x0
0x0
0x1
0x1
0x0
0x1
0x0
0x1
0x0
0x0
0x0
0x1
0x0
0x1
0x1
0x0
0x0
0x0
0x1
0x0
0x0
0x1
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
0x0
hansemro commented 2 months ago

https://github.com/openXC7/primitive-tests/pull/2 changes INIT_{A:D} parameters, so here is the new expected RAM32M vivado result:

0x3c
0x3c
0x79
0xfc
0xb6
0xbc
0xf3
0xbc
0x3c
0x39
0x49
0x49
0x76
0x39
0x3
0x9
0xdc
0xf6
0x89
0xc6
0x86
0xc6
0xf3
0xb6
0x6c
0xf3
0xb9
0xb3
0xa6
0xe3
0x43
0xf3
0

Fix by #29 is still valid (thankfully).