JoyBed opened this issue 7 months ago
Well... When I limit it in the dts to 512 MB, Linux can boot, but the biggest problem I have always had is NaxRiscv. main_ram works on ANY softcore but not on NaxRiscv, here is the screenshot: that's when it's connected to main_ram through the peripheral bus. When I connect main_ram directly to the AXI4 ports, it gets stuck at the "Memtest at 0x40000000" and doesn't even count.
Hi,
I just tested via: litex_sim --cpu-type=naxriscv --with-sdram --sdram-module=MT41K128M16 --sdram-data-width=64
I got:
--=============== SoC ==================--
CPU: NaxRiscv 32-bit @ 1MHz
BUS: wishbone 32-bit @ 4GiB
CSR: 32-bit data
ROM: 128.0KiB
SRAM: 8.0KiB
L2: 8.0KiB
SDRAM: 1.0GiB 64-bit @ 8MT/s (CL-6 CWL-5)
MAIN-RAM: 1.0GiB
--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Switching SDRAM to hardware control.
Memtest at 0x40000000 (8.0KiB)...
Write: 0x40000000-0x40002000 8.0KiB
Read: 0x40000000-0x40002000 8.0KiB
Memtest OK
Memspeed at 0x40000000 (Sequential, 8.0KiB)...
Write speed: 2.9MiB/s
Read speed: 3.4MiB/s
--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Cancelled
--============= Console ================--
litex> mem_test 0x40000000 0x1000
Memtest at 0x40000000 (4.0KiB)...
Write: 0x40000000-0x40001000 4.0KiB
Read: 0x40000000-0x40001000 4.0KiB
Memtest OK
litex> mem_test 0x50000000 0x1000
Memtest at 0x50000000 (4.0KiB)...
Write: 0x50000000-0x50001000 4.0KiB
Read: 0x50000000-0x50001000 4.0KiB
Memtest OK
litex> mem_test 0x60000000 0x1000
Memtest at 0x60000000 (4.0KiB)...
Write: 0x60000000-0x60001000 4.0KiB
Read: 0x60000000-0x60001000 4.0KiB
Memtest OK
litex> mem_test 0x70000000 0x1000
Memtest at 0x70000000 (4.0KiB)...
Write: 0x70000000-0x70001000 4.0KiB
Read: 0x70000000-0x70001000 4.0KiB
Memtest OK
What command line did you use? One specific thing about NaxRiscv is that the CPU is generated with exact knowledge of "where is some RAM I can access", so if there is a bug there, it could create your case.
I used this: ./xilinx_zybo_z7_20.py --variant=original --cpu-type=naxriscv --xlen=64 --scala-args='rvc=true,rvf=true,rvd=true,alu-count=2,decode-count=2,mmu=true' --with-fpu --with-rvc --with-ps7 --bus-standard=axi-lite --with-spi-sdcard --sys-clk-freq=125e6 --with-xadc --csr-json zybo.json --uart-baudrate=2000000 --build --update-repo=wipe+recommended --vivado-synth-directive=PerformanceOptimized --vivado-route-directive=AggressiveExplore --with-hdmi-video-framebuffer --l2-bytes=262144 --l2-ways=16 --with-jtag-tap
When I generate NaxRiscv with the mbus connected to DRAM, it freezes at the Memtest at 0x40000000; when I generate it with DRAM connected to the pbus, I get the data errors shown in the screenshot.
when I generate it with DRAM connected to the pbus
Hmm, you should really not do that; the Nax SoC is really intended to use cacheable memory through the mbus. Accesses through the pbus may not support atomics and similar operations.
When I generate NaxRiscv with the mbus connected to DRAM, it freezes at the Memtest at 0x40000000,
Ahh, that is one thing. It is probably related to the Zynq nature of the FPGA. Not sure if there is a way to set up a simulation of the SoC / Zynq with Vivado?
@JoyBed the DRAM present on the Zybo board is connected to the PS -> you can't use it from the PL.
I don't know. I can launch a simulation within Vivado, but that doesn't help much since I can't connect to the UART; at least I don't know of a way to do that.
@JoyBed the DRAM present on the Zybo board is connected to the PS -> you can't use it from the PL.
Actually you can, through the slave ports of the PS7 system: the HP slave ports are connected directly to the DRAM. I am using that, and it works with EVERY softcore in LiteX except NaxRiscv. Here you can see my target file:
```python
#!/usr/bin/env python3
#
# This file is part of LiteX-Boards.
#
# Copyright (c) 2019-2020 Florent Kermarrec <florent@enjoy-digital.fr>,
# Copyright (c) 2022-2023 Oliver Szabo <16oliver16@gmail.com>
# SPDX-License-Identifier: BSD-2-Clause

import math
import os
import shutil

from migen import *

from litex.gen import LiteXModule
from litex.build.tools import write_to_file

from litex_boards.platforms import digilent_zybo_z7_20

from litex.soc.interconnect import axi
from litex.soc.interconnect import wishbone
from litex.soc.cores.clock import *
from litex.soc.integration.soc_core import *
from litex.soc.integration.builder import *
from litex.soc.cores.video import VideoVGAPHY
from litex.soc.cores.video import VideoS7HDMIPHY
from litex.soc.cores.usb_ohci import USBOHCI
from litex.soc.cores.led import LedChaser
from litex.soc.cores.xadc import XADC
from litex.soc.cores.dna import DNA
from litex.soc.integration.soc import SoCRegion
from litex.soc.interconnect import csr_eventmanager
from litex.soc.interconnect.csr_eventmanager import EventManager, EventSourceLevel, EventSourcePulse
from litex.soc.interconnect.csr import AutoCSR
from litex.soc.cores import cpu
# CRG ----------------------------------------------------------------------------------------------

class _CRG(LiteXModule):
    def __init__(self, platform, sys_clk_freq, toolchain="vivado", use_ps7_clk=False, with_video_pll=False, with_usb_pll=False):
        self.rst       = Signal()
        self.cd_sys    = ClockDomain()
        self.cd_vga    = ClockDomain()
        self.cd_hdmi   = ClockDomain()
        self.cd_hdmi5x = ClockDomain()
        self.cd_usb    = ClockDomain()

        # # #

        # Clk.
        clk125 = platform.request("clk125")

        if use_ps7_clk:
            self.comb += ClockSignal("sys").eq(ClockSignal("ps7"))
            self.comb += ResetSignal("sys").eq(ResetSignal("ps7") | self.rst)
        else:
            # MMCM.
            #if toolchain == "vivado":
            #    self.mmcm = mmcm = S7MMCM(speedgrade=-2)
            #else:
            #    self.mmcm = mmcm = S7PLL(speedgrade=-2)
            self.mmcm = mmcm = S7MMCM(speedgrade=-2)
            #self.mmcm = mmcm = S7PLL(speedgrade=-1)
            #self.comb += mmcm.reset.eq(self.rst)
            mmcm.register_clkin(clk125, 125e6)
            mmcm.create_clkout(self.cd_sys, sys_clk_freq)
            platform.add_false_path_constraints(self.cd_sys.clk, mmcm.clkin) # Ignore sys_clk to mmcm.clkin path created by SoC's rst.
            mmcm.expose_drp()
            self.comb += mmcm.reset.eq(mmcm.drp_reset.re | self.rst)

            # Video PLL.
            if with_video_pll:
                self.video_pll = video_pll = S7PLL(speedgrade=-2)
                self.comb += video_pll.reset.eq(self.rst)
                video_pll.register_clkin(clk125, 125e6)
                #video_pll.create_clkout(self.cd_vga, 40e6)
                video_pll.create_clkout(self.cd_hdmi,   148.5e6)
                video_pll.create_clkout(self.cd_hdmi5x, 5*148.5e6)
                platform.add_false_path_constraints(self.cd_sys.clk, video_pll.clkin) # Ignore sys_clk to video_pll.clkin path created by SoC's rst.

            # USB PLL.
            if with_usb_pll:
                mmcm.create_clkout(self.cd_usb, 48e6)
# BaseSoC ------------------------------------------------------------------------------------------

class BaseSoC(SoCCore):
    mem_map = {**SoCCore.mem_map, **{
        #"usb_ohci": 0xc0000000,
        "usb_ohci": 0x18000000,
    }}

    def __init__(self, sys_clk_freq=100e6,
        variant                     = "original",
        toolchain                   = "vivado",
        with_ps7                    = False,
        with_dna                    = False,
        with_xadc                   = False,
        with_usb_host               = False,
        with_led_chaser             = False,
        with_video_terminal         = False,
        with_video_framebuffer      = False,
        with_hdmi_video_terminal    = False,
        with_hdmi_video_framebuffer = False,
        **kwargs):
        self.interrupt_map = {
            "ps" : 2,
        }
        platform = digilent_zybo_z7_20.Platform(variant=variant)
        self.builder  = None
        self.with_ps7 = with_ps7

        # CRG --------------------------------------------------------------------------------------
        use_ps7_clk    = (kwargs.get("cpu_type", None) == "zynq7000")
        with_video_pll = (with_hdmi_video_terminal or with_hdmi_video_framebuffer)
        with_usb_pll   = with_usb_host
        self.crg = _CRG(platform, sys_clk_freq, toolchain,
            use_ps7_clk    = use_ps7_clk,
            with_video_pll = with_hdmi_video_terminal or with_video_terminal or with_hdmi_video_framebuffer or with_video_framebuffer,
            with_usb_pll   = with_usb_host)
        # SoCCore ----------------------------------------------------------------------------------
        if kwargs["uart_name"] == "serial":
            kwargs["uart_name"] = "usb_uart" # Use USB-UART Pmod on JB.
        if kwargs.get("cpu_type", None) == "zynq7000":
            kwargs["integrated_sram_size"] = 0x0
            kwargs["with_uart"] = False
            self.mem_map = {
                "csr": 0x4000_0000, # Zynq GP0 default.
            }
        SoCCore.__init__(self, platform, sys_clk_freq, ident="LiteX SoC on Zybo Z7/original Zybo", **kwargs)

        # USB Host ---------------------------------------------------------------------------------
        if with_usb_host:
            self.submodules.usb_ohci = USBOHCI(platform, platform.request("usb_host"), usb_clk_freq=int(48e6))
            self.bus.add_slave("usb_ohci_ctrl", self.usb_ohci.wb_ctrl, region=SoCRegion(origin=self.mem_map["usb_ohci"], size=0x100000, cached=False))
            #self.bus.add_slave("usb_ohci_ctrl", self.usb_ohci.wb_ctrl)
            self.dma_bus.add_master("usb_ohci_dma", master=self.usb_ohci.wb_dma)
            self.comb += self.cpu.interrupt[16].eq(self.usb_ohci.interrupt)

        # Zynq7000 Integration ---------------------------------------------------------------------
        if kwargs.get("cpu_type", None) == "zynq7000":
            self.cpu.use_rom = True
            if variant in ["z7-10", "z7-20", "original"]:
                # Get and set the pre-generated .xci. FIXME: change location? add it to the repository? Make config.
                os.makedirs("xci", exist_ok=True)
                os.system("wget https://github.com/litex-hub/litex-boards/files/8339591/zybo_z7_ps7.txt")
                os.system("mv zybo_z7_ps7.txt xci/zybo_z7_ps7.xci")
                self.cpu.set_ps7_xci("xci/zybo_z7_ps7.xci")
            else:
                self.cpu.set_ps7(name="ps", config=platform.ps7_config)

            # Connect AXI GP0 to the SoC with base address of 0x40000000 (default one).
            wb_gp0 = wishbone.Interface()
            self.submodules += axi.AXI2Wishbone(
                axi          = self.cpu.add_axi_gp_master(),
                wishbone     = wb_gp0,
                base_address = 0x40000000)
            self.bus.add_master(master=wb_gp0)
            # TODO: memory size dependent on board variant.
            self.bus.add_region("sram", SoCRegion(
                origin = self.cpu.mem_map["sram"],
                size   = 512 * 1024 * 1024 - self.cpu.mem_map["sram"])
            )
            self.bus.add_region("rom", SoCRegion(
                origin = self.cpu.mem_map["rom"],
                size   = 256 * 1024 * 1024 // 8,
                linker = True)
            )
            self.constants["CONFIG_CLOCK_FREQUENCY"] = 666666687
            self.bus.add_region("flash", SoCRegion(
                origin = 0xFC00_0000,
                size   = 0x4_0000,
                mode   = "rwx")
            )
        # PS7 as Slave Integration -------------------------------------------------------------
        elif with_ps7:
            cpu_cls = cpu.CPUS["zynq7000"]
            zynq = cpu_cls(self.platform, "standard") # zynq7000 has no variants.
            zynq.set_ps7(name="ps", config=platform.ps7_config)
            #axi_M_GP0 = zynq.add_axi_gp_master()
            #self.bus.add_master(master=axi_M_GP0)
            axi_S_HP0 = zynq.add_axi_hp_slave(clock_domain=self.crg.cd_sys.name)
            axi_S_HP1 = zynq.add_axi_hp_slave(clock_domain=self.crg.cd_sys.name)
            axi_S_HP2 = zynq.add_axi_hp_slave(clock_domain=self.crg.cd_sys.name)
            axi_S_HP3 = zynq.add_axi_hp_slave(clock_domain=self.crg.cd_sys.name)
            axi_S_GP0 = zynq.add_axi_gp_slave(clock_domain=self.crg.cd_sys.name)
            hp_ports  = [axi_S_HP0, axi_S_HP1, axi_S_HP2, axi_S_HP3]

            # PS7 DDR3 Interface -----------------------------
            ddr_addr = self.cpu.mem_map["main_ram"]
            # Remap SoC main_ram addresses onto the PS7 DDR window (DDR is visible to
            # PL masters from 0x0010_0000; the first 1MiB is filtered, see xparameters below).
            #map_fct_ddr = lambda sig : sig - ddr_addr + 0x0008_0000
            map_fct_ddr = lambda sig : sig - ddr_addr + 0x0010_0000
            sdram_size  = 0x4000_0000
            if hasattr(self.cpu, "add_memory_buses"):
                self.cpu.add_memory_buses(address_width=32, data_width=64)
            if len(self.cpu.memory_buses): # If the CPU has dedicated memory buses...
                print("--------Connecting DDR to direct RAM port of the softcore using HP bus.--------")
                i = 0 # Use one HP port per memory bus.
                for mem_bus in self.cpu.memory_buses:
                    axi_ddr = axi.AXIInterface(hp_ports[i].data_width, hp_ports[i].address_width, "byte", hp_ports[i].id_width)
                    self.comb += axi_ddr.connect_mapped(hp_ports[i], map_fct_ddr)
                    data_width_ratio = int(axi_ddr.data_width/mem_bus.data_width)
                    print("Connecting: ", str(mem_bus), " to ", str(axi_ddr))
                    print("CPU memory bus data width: ", mem_bus.data_width, " bits")
                    print("DDR bus data width: ", axi_ddr.data_width, " bits")
                    print("CPU memory bus address width: ", mem_bus.address_width, " bits")
                    print("DDR bus address width: ", axi_ddr.address_width, " bits")
                    print("CPU memory bus id width: ", mem_bus.id_width, " bits")
                    print("DDR bus id width: ", axi_ddr.id_width, " bits")
                    # Connect directly.
                    if data_width_ratio == 1:
                        print("Direct connection")
                        self.comb += mem_bus.connect(axi_ddr)
                    # UpConvert.
                    elif data_width_ratio > 1:
                        print("UpConversion")
                        axi_port = axi.AXIInterface(data_width=axi_ddr.data_width, addressing="byte", id_width=len(mem_bus.aw.id))
                        self.submodules += axi.AXIUpConverter(axi_from=mem_bus, axi_to=axi_port)
                        self.comb += axi_port.connect(axi_ddr)
                    # DownConvert.
                    else:
                        print("DownConversion")
                        axi_port = axi.AXIInterface(data_width=axi_ddr.data_width, addressing="byte", id_width=len(mem_bus.aw.id))
                        self.submodules += axi.AXIDownConverter(axi_from=mem_bus, axi_to=axi_port)
                        self.comb += axi_port.connect(axi_ddr)
                    i = i + 1
                # Add SDRAM region.
                origin = None
                main_ram_region = SoCRegion(
                    origin = self.mem_map.get("main_ram", origin),
                    size   = sdram_size,
                    mode   = "rwx")
                self.bus.add_region("main_ram", main_ram_region)
            else:
                print("--------Connecting DDR to general bus of the softcore using GP bus.--------")
                axi_ddr = axi.AXIInterface(axi_S_GP0.data_width, axi_S_GP0.address_width, "byte", axi_S_GP0.id_width)
                #axi_ddr = axi.AXIInterface(axi_S_HP0.data_width, axi_S_HP0.address_width, "byte", axi_S_HP0.id_width)
                self.comb += axi_ddr.connect_mapped(axi_S_GP0, map_fct_ddr)
                #self.comb += axi_ddr.connect_mapped(axi_S_HP0, map_fct_ddr)
                self.bus.add_slave(
                    name   = "main_ram",
                    slave  = axi_ddr,
                    region = SoCRegion(
                        origin = ddr_addr,
                        size   = sdram_size,
                        mode   = "rwx"
                    )
                )
            print("---------------------------- End ----------------------------------------------")
        # Video VGA --------------------------------------------------------------------------------
        if with_video_terminal or with_video_framebuffer:
            if with_video_terminal:
                self.videophy = VideoVGAPHY(platform.request("vga"), clock_domain="vga")
                self.add_video_terminal(phy=self.videophy, timings="800x600@60Hz", clock_domain="vga")
            if with_video_framebuffer:
                # TODO
                print("Not implemented yet!")

        # Video HDMI -------------------------------------------------------------------------------
        if with_hdmi_video_terminal or with_hdmi_video_framebuffer:
            if with_hdmi_video_terminal:
                self.videophy = VideoS7HDMIPHY(platform.request("hdmi_out"), clock_domain="hdmi")
                self.add_video_terminal(phy=self.videophy, timings="800x600@60Hz", clock_domain="hdmi")
            if with_hdmi_video_framebuffer:
                from my_modules import dvi_framebuffer
                platform.add_source("./my_modules/dvi_framebuffer.v")
                self.cfg_bus = cfg_bus = axi.AXILiteInterface(address_width=32, data_width=32, addressing="byte")
                axi_S_GP1 = zynq.add_axi_gp_slave(clock_domain=self.crg.cd_hdmi.name)
                self.out_bus = out_bus = axi.AXIInterface(axi_S_GP1.data_width, axi_S_GP1.address_width, "byte", axi_S_GP1.id_width)
                self.comb += out_bus.connect_mapped(axi_S_GP1, map_fct_ddr)
                self.submodules.hdmi_framebuffer = hdmi_framebuffer = dvi_framebuffer.dvi_framebuffer(
                    self.crg.cd_hdmi.clk, self.crg.cd_hdmi5x.clk, self.crg.rst, Signal(),
                    cfg_bus, out_bus, platform.request("hdmi_out"))
                self.bus.add_slave("framebuffer_ctrl", cfg_bus, region=SoCRegion(origin=0x87000000, size=0x10000, mode="rw", cached=False))

        # Leds -------------------------------------------------------------------------------------
        if with_led_chaser:
            self.leds = LedChaser(
                pads         = platform.request_all("user_led"),
                sys_clk_freq = sys_clk_freq)

        # XADC -------------------------------------------------------------------------------------
        if with_xadc:
            self.xadc = XADC()

        # DNA --------------------------------------------------------------------------------------
        if with_dna:
            self.dna = DNA()
            self.dna.add_timing_constraints(platform, sys_clk_freq, self.crg.cd_sys.clk)
    def finalize(self, *args, **kwargs):
        super(BaseSoC, self).finalize(*args, **kwargs)
        if self.cpu_type == "zynq7000":
            libxil_path = os.path.join(self.builder.software_dir, 'libxil')
            os.makedirs(os.path.realpath(libxil_path), exist_ok=True)
            lib = os.path.join(libxil_path, 'embeddedsw')
            if not os.path.exists(lib):
                os.system("git clone --depth 1 https://github.com/Xilinx/embeddedsw {}".format(lib))
            os.makedirs(os.path.realpath(self.builder.include_dir), exist_ok=True)
            for header in [
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps_hw.h',
                'lib/bsp/standalone/src/common/xil_types.h',
                'lib/bsp/standalone/src/common/xil_assert.h',
                'lib/bsp/standalone/src/common/xil_io.h',
                'lib/bsp/standalone/src/common/xil_printf.h',
                'lib/bsp/standalone/src/common/xstatus.h',
                'lib/bsp/standalone/src/common/xdebug.h',
                'lib/bsp/standalone/src/arm/cortexa9/xpseudo_asm.h',
                'lib/bsp/standalone/src/arm/cortexa9/xreg_cortexa9.h',
                'lib/bsp/standalone/src/arm/cortexa9/xil_cache.h',
                'lib/bsp/standalone/src/arm/cortexa9/xparameters_ps.h',
                'lib/bsp/standalone/src/arm/cortexa9/xil_errata.h',
                'lib/bsp/standalone/src/arm/cortexa9/xtime_l.h',
                'lib/bsp/standalone/src/arm/common/xil_exception.h',
                'lib/bsp/standalone/src/arm/common/gcc/xpseudo_asm_gcc.h',
            ]:
                shutil.copy(os.path.join(lib, header), self.builder.include_dir)
            write_to_file(os.path.join(self.builder.include_dir, 'bspconfig.h'),
                '#define FPU_HARD_FLOAT_ABI_ENABLED 1')
            write_to_file(os.path.join(self.builder.include_dir, 'xparameters.h'), '''
#ifndef __XPARAMETERS_H
#define __XPARAMETERS_H
#include "xparameters_ps.h"
#define STDOUT_BASEADDRESS XPS_UART1_BASEADDR
#define XPAR_PS7_DDR_0_S_AXI_BASEADDR 0x00100000
#define XPAR_PS7_DDR_0_S_AXI_HIGHADDR 0x3FFFFFFF
#endif
''')
        elif self.with_ps7:
            libxil_path = os.path.join(self.builder.software_dir, 'libxil')
            os.makedirs(os.path.realpath(libxil_path), exist_ok=True)
            lib = os.path.join(libxil_path, 'embeddedsw')
            if not os.path.exists(lib):
                os.system("git clone --depth 1 https://github.com/Xilinx/embeddedsw {}".format(lib))
            os.makedirs(os.path.realpath(self.builder.include_dir), exist_ok=True)
            for header in [
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps_hw.h',
                'XilinxProcessorIPLib/drivers/uartps/src/xuartps.h',
                'lib/bsp/standalone/src/common/xil_types.h',
                'lib/bsp/standalone/src/common/xil_assert.h',
                'lib/bsp/standalone/src/common/xil_io.h',
                'lib/bsp/standalone/src/common/xil_printf.h',
                'lib/bsp/standalone/src/common/xplatform_info.h',
                'lib/bsp/standalone/src/common/xstatus.h',
                'lib/bsp/standalone/src/common/xdebug.h',
            ]:
                shutil.copy(os.path.join(lib, header), self.builder.include_dir)
            write_to_file(os.path.join(self.builder.include_dir, 'uart_ps.h'), '''
#ifdef __cplusplus
extern "C" {
#endif
#include "xuartps_hw.h"
#include "system.h"
#define CSR_UART_BASE
#define UART_POLLING
static inline void uart_rxtx_write(char c) {
    XUartPs_WriteReg(STDOUT_BASEADDRESS, XUARTPS_FIFO_OFFSET, (uint32_t) c);
}
static inline uint8_t uart_rxtx_read(void) {
    return XUartPs_ReadReg(STDOUT_BASEADDRESS, XUARTPS_FIFO_OFFSET);
}
static inline uint8_t uart_txfull_read(void) {
    return XUartPs_IsTransmitFull(STDOUT_BASEADDRESS);
}
static inline uint8_t uart_rxempty_read(void) {
    return !XUartPs_IsReceiveData(STDOUT_BASEADDRESS);
}
static inline void uart_ev_pending_write(uint8_t x) { }
static inline uint8_t uart_ev_pending_read(void) {
    return 0;
}
static inline void uart_ev_enable_write(uint8_t x) { }
#ifdef __cplusplus
}
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xil_cache.h'), '''
#ifndef XIL_CACHE_H
#define XIL_CACHE_H
#include "xil_types.h"
#include "xparameters.h"
#include "system.h"
#ifdef __cplusplus
extern "C" {
#endif
void Xil_DCacheFlush(void);
void Xil_ICacheFlush(void);
void Xil_L2CacheFlush(void);
#ifdef __cplusplus
}
#endif
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xil_cache.c'), '''
#include "system.h"
void Xil_DCacheFlush(void) {
    flush_cpu_dcache();
}
void Xil_ICacheFlush(void) {
    flush_cpu_icache();
}
void Xil_L2CacheFlush(void) {
    flush_l2_cache();
}
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xparameters.h'), '''
#ifndef __XPARAMETERS_H
#define __XPARAMETERS_H
#include "generated/mem.h"
#define STDOUT_BASEADDRESS (PS_IO_BASE + 0x1000)
#define STDIN_BASEADDRESS  (PS_IO_BASE + 0x1000)
#define XPAR_PS7_DDR_0_S_AXI_BASEADDR MAIN_RAM_BASE
#define XPAR_PS7_DDR_0_S_AXI_HIGHADDR (MAIN_RAM_BASE + MAIN_RAM_SIZE)
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'xpseudo_asm.h'), '''
#ifndef XPSEUDO_ASM_H
#define XPSEUDO_ASM_H
#endif
''')
            write_to_file(os.path.join(self.builder.include_dir, 'bspconfig.h'), '''
#ifndef BSPCONFIG_H
#define BSPCONFIG_H
#endif
''')
# Build --------------------------------------------------------------------------------------------

def main():
    from litex.build.parser import LiteXArgumentParser
    parser = LiteXArgumentParser(platform=digilent_zybo_z7_20.Platform, description="LiteX SoC on Zybo Z7/original Zybo")
    parser.add_target_argument("--sys-clk-freq",  default=125e6, type=float, help="System clock frequency.")
    parser.add_target_argument("--variant",       default="original",  help="Board variant (z7-10, z7-20 or original).")
    parser.add_target_argument("--with-ps7",      action="store_true", help="Add the PS7 as slave for soft CPUs.")
    parser.add_target_argument("--with-usb-host", action="store_true", help="Enable USB host support (PMOD).")
    parser.add_target_argument("--with-xadc",     action="store_true", help="Enable 7-Series XADC.")
    parser.add_target_argument("--with-dna",      action="store_true", help="Enable 7-Series DNA.")
    sdopts = parser.target_group.add_mutually_exclusive_group()
    sdopts.add_argument("--with-spi-sdcard", action="store_true", help="Enable SPI-mode SDCard support (PMOD).")
    sdopts.add_argument("--with-sdcard",     action="store_true", help="Enable SDCard support (PMOD).")
    viopts = parser.target_group.add_mutually_exclusive_group()
    viopts.add_argument("--with-video-terminal",         action="store_true", help="Enable Video Terminal (VGA).")
    viopts.add_argument("--with-video-framebuffer",      action="store_true", help="Enable Video Framebuffer (VGA).")
    viopts.add_argument("--with-hdmi-video-terminal",    action="store_true", help="Enable Video Terminal (HDMI).")
    viopts.add_argument("--with-hdmi-video-framebuffer", action="store_true", help="Enable Video Framebuffer (HDMI).")
    args = parser.parse_args()

    soc = BaseSoC(
        sys_clk_freq                = args.sys_clk_freq,
        variant                     = args.variant,
        with_ps7                    = args.with_ps7,
        with_xadc                   = args.with_xadc,
        with_dna                    = args.with_dna,
        with_usb_host               = args.with_usb_host,
        with_video_terminal         = args.with_video_terminal,
        with_video_framebuffer      = args.with_video_framebuffer,
        with_hdmi_video_terminal    = args.with_hdmi_video_terminal,
        with_hdmi_video_framebuffer = args.with_hdmi_video_framebuffer,
        **soc_core_argdict(args)
    )
    if args.with_spi_sdcard:
        soc.platform.add_extension(digilent_zybo_z7_20._sd_card_pmod_io)
        soc.add_spi_sdcard(software_debug=True)
    if args.with_sdcard:
        soc.platform.add_extension(digilent_zybo_z7_20._sd_card_pmod_io)
        soc.add_sdcard(software_debug=True)

    builder = Builder(soc, **builder_argdict(args))
    if args.cpu_type == "zynq7000" or args.with_ps7:
        soc.builder = builder
        builder.add_software_package('libxil')
        builder.add_software_library('libxil')
    if args.build:
        builder.build(**parser.toolchain_argdict)

    if args.load:
        prog = soc.platform.create_programmer()
        prog.load_bitstream(builder.get_bitstream_filename(mode="sram"), device=1)

if __name__ == "__main__":
    main()
```
Maybe the reason this isn't working is that if, say, we specify that the memory is on the AXI bus at 0x40000000, then NaxRiscv can access it, but the memory accesses on that AXI bus will be emitted without that 0x40000000 offset.
I don't know what the expected behaviour is from the Zynq / LiteX side.
I don't quite understand. If it has it on the mbus, then accesses on the mbus are done without the 0x40000000 offset? So an access to 0x45000000 through the mbus emits address 0x05000000?
So an access to 0x45000000 through the mbus emits address 0x05000000?
Yes; I need to double-check, but that is quite possible.
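Concretely, the behaviour being described would look like this (a sketch for illustration only; map_fct_ddr is the compensating lambda from the target file above):

```python
# Sketch of the offset-stripping behaviour under discussion (illustrative only).
base  = 0x4000_0000                       # main_ram origin as seen by the CPU.
strip = lambda addr: addr - base          # A bridge that removes the base offset...
assert strip(0x4500_0000) == 0x0500_0000  # ...turns an access to 0x45000000 into 0x05000000.

# The target file above compensates for this when mapping onto the PS7 DDR window:
map_fct_ddr = lambda addr: addr - base + 0x0010_0000
assert map_fct_ddr(0x4500_0000) == 0x0510_0000
```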
I tried it with the mbus connected to DRAM, with the bus address offset (the 0x40000000) stripped, and it still locks up at the boot-up mem_test when DRAM is connected to the mbus. I don't know what's wrong. Before experimenting with NaxRiscv I was using Rocket, which had its memory bus connected straight to the DRAM and worked like a charm. Before that I used VexRiscv, which had the same main_ram address 0x40000000 as NaxRiscv, and it worked too. Even Microwatt and SERV were able to work with it. NaxRiscv is the only one doing this. I'm out of ideas.
I'm looking at it, trying to get the offset preserved.
Also, did you try the vexriscv_smp CPU?
Yes, and it was working and booting Linux just fine, but not with the memory bus connected to DRAM, as the memory bus of vexriscv_smp has a LiteDRAM interface, not AXI4 like the PS7 of the Zynq. I also successfully booted Linux on Rocket and Microwatt, and I wanted to move to NaxRiscv for performance reasons; also, I can fit 2 NaxRiscv cores into my FPGA, while with the Linux variant of Rocket I can only fit one.
Very strange behaviour: through the mbus it locks up at mem_test even when I specified no L2 cache. I also tried it again with DRAM on the pbus without L2 to see if it makes a difference; still the same behaviour. EDIT: Now I stumbled upon an interesting thing: it has problems with some addresses in the 0x40000000-0x41000000 region (it seems to be address 0x40c00000) and then at every 0xX0be0000; everything in between tests OK... strange.
Also, if the DRAM is connected through the pbus and I just bypass the non-functional addresses, then it can even boot: even Linux booted when I specified in the device tree that those addresses are reserved memory regions.
Also, if the DRAM is connected through the pbus and I just bypass the non-functional addresses, then it can even boot:
I would not expect it to boot in that configuration. Also, performance will be very bad.
I pushed a potential fix: https://github.com/enjoy-digital/litex/pull/1940
With this one, mbus accesses will preserve the full 32-bit address instead of removing the 0x40000000 offset.
I don't have any Zynq board, so let me know how it goes :)
Also, keep in mind that VexiiRiscv is very close to feature parity, with performance not too far away. (WIP)
I tried it, but it still locked up at mem_test when the DRAM is connected to the mbus. I don't understand this behaviour; it's very strange.
Did you check that it passes timing?
Yes, it passes; timings are positive, with no negative setup or hold slack.
Weird; I just tested on a Digilent Nexys Video and it can run Debian just fine. Did you delete the pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/NaxRiscvLitex????.v files before retrying with the fixes? Otherwise it will not regenerate the NaxRiscv SoC but reuse the cached one.
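For reference, a small sketch of that cleanup (the glob pattern stands in for the ???? hash in the generated filename; adjust the path to your checkout):

```python
# Delete cached NaxRiscv netlists so LiteX regenerates the core (sketch, untested).
import glob, os

for f in glob.glob("pythondata-cpu-naxriscv/pythondata_cpu_naxriscv/verilog/NaxRiscvLitex*.v"):
    os.remove(f)
```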
Yes, I deleted the generated Verilog files. I don't understand this strange behavior at all. My goal is to boot Debian/Fedora on it as I did with Rocket. I am thinking of adding a LiteScope or JTAGbone to the SoC to see the signals on the mbus<->DRAM bus and find out what is going on. EDIT: Alternatively, I can give you remote access to my workstation so you can check out the code and the SoC's behaviour faster.
to see the signals on the mbus<->DRAM bus and find out what is going on.
Yes, that would be the way to proceed: probing the mbus, as well as the dbus coming out of the CPU itself.
Ideally, instead of relying on hardware debug, we would run a simulation; that would give us full visibility into what is happening.
I don't know if simulation would do anything, as the mbus is connected to the PS7 block, which contains hardened components, not softcores. Also, you can't connect to the UART in a Vivado simulation. So the only way to usefully debug it is the debug options in LiteX.
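Since hardware probing is the remaining option, a minimal LiteScope hookup could look like the sketch below. This is illustrative only: add_mbus_analyzer is a hypothetical helper, axi_ddr is assumed to be the mbus-side AXI port from the target file above, and the SoC needs uartbone/jtagbone enabled so litex_server can reach the analyzer.

```python
# Hypothetical helper (untested): attach a LiteScope analyzer to the mbus<->PS7 AXI port.
from litescope import LiteScopeAnalyzer

def add_mbus_analyzer(soc, axi_ddr):
    # Probe the AXI handshakes between the CPU memory bus and the PS7 HP slave,
    # to see how far accesses (e.g. to 0x60000000) actually get.
    analyzer_signals = [
        axi_ddr.aw.valid, axi_ddr.aw.ready, axi_ddr.aw.addr,
        axi_ddr.w.valid,  axi_ddr.w.ready,
        axi_ddr.b.valid,  axi_ddr.b.ready,  axi_ddr.b.resp,
        axi_ddr.ar.valid, axi_ddr.ar.ready, axi_ddr.ar.addr,
        axi_ddr.r.valid,  axi_ddr.r.ready,  axi_ddr.r.resp,
    ]
    soc.submodules.analyzer = LiteScopeAnalyzer(analyzer_signals,
        depth        = 512,
        clock_domain = "sys",
        csr_csv      = "analyzer.csv")
```

The capture would then be driven from the host with litex_server plus litescope_cli, triggering on one of the address-channel valid signals and dumping a waveform.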
Here it is with the Rocket softcore: Rocket's memory bus connected straight to the DRAM, working as with any other softcore except NaxRiscv. Even the whole RAM is there and OK.
Can you send your custom board files for me to recreate this? Maybe it is the memory region definition that is messed up.
Target and platform files?
yes
Here you go: custom_zybo.zip. It contains the platform and target files, plus the modified Zynq7000 core file so that the HP ports take their ACLK from the softcore bus; otherwise it will not work. Also use this pull request, I'm using that function: https://github.com/enjoy-digital/litex/pull/1522
Any news?
Just tried now.
[info] LitexMemoryRegion(SM(0x80000000, 0x80000000),io,p)
[info] LitexMemoryRegion(SM(0x0, 0x20000),rxc,p)
[info] LitexMemoryRegion(SM(0x10000000, 0x2000),rwxc,p)
[info] LitexMemoryRegion(SM(0x40000000, 0x40000000),rwxc,p)
[info] LitexMemoryRegion(SM(0xf0000000, 0x10000),rw,p)
0x40000000, 0x40000000 => address, size
which would normally mean there should be 1 GB accessible at 0x40000000.
The thing is, "[info] LitexMemoryRegion(SM(0x80000000, 0x80000000),io,p)" is overlapping that memory range. You need to push that memory region (0x80000000, 0x80000000) to (0xC0000000, 0x40000000).
Note it should not be on ",p)" but on ",m)", else you will get bad performance (meaning not pbus, but mbus).
io_regions = {0x4000_0000: 0xbc00_0000} # Origin, Length. Does this mean that the DDR is mapped as if it were an io region?
Uncomment these lines:
#if hasattr(self.cpu, "add_memory_buses"):
#    self.cpu.add_memory_buses(address_width=32, data_width=64)
That would connect the DRAM to the mbus instead of the pbus. Also, you can change the sdram_size parameter to half so it doesn't overlap with anything (if I understood correctly).
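For clarity, the uncommented path would read as below (a sketch based on the target file above; 0x2000_0000 is the halved size suggested):

```python
# With these lines active, the DRAM is attached to the mbus instead of the pbus.
if hasattr(self.cpu, "add_memory_buses"):
    self.cpu.add_memory_buses(address_width=32, data_width=64)
sdram_size = 0x2000_0000  # 512 MiB, half of the previous 0x4000_0000 (1 GiB).
```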
The thing is, "[info] LitexMemoryRegion(SM(0x80000000, 0x80000000),io,p)" is overlapping that memory range.
Ahhh, forget that, my bad XD
So the only way to debug, I would say, is with the logic analyser.
Also, one thing: with NaxRiscv you really need to use the argument --bus-standard axi-lite; the AXI-Lite <> Wishbone bridge is bugged, I think. That may explain the crashes you had before (not talking about the memory range).
Yeah, I know that; I have the pbus as AXI-Lite and the mbus to DRAM as full AXI4. I don't usually use Wishbone when it's not specifically needed; I try to always use the bus standard that the core has as its native output. I also tried to add the logic analyser but couldn't figure it out; I started writing the necessary parts into the code, as you can see in the files. Maybe you can write it in properly; when I tried, I got no output in the console and no output from the analyser.
Hmm, so yes, at this point we really need to add probes / a logic analyser and trace memory accesses at 0x60000000 to see how far they reach. Also, note that today I also got VexiiRiscv to run Debian; there is one hardware bug I know of which triggers an instruction access fault on some specific binaries, but it should be fixed soon.
Wait, what? VexRiscv running Debian? Distros require RV64GC and VexRiscv is RV32GC, or did I miss something?
did I miss something?
Yes, you missed the "ii": VexiiRiscv, not VexRiscv ^^
Ooooh, now I've checked it out and it looks cool. When does it get integrated into LiteX? How did you get it onto the FPGA board? I'm totally new and lost in Spinal/Scala.
There is a PR which is WIP and not up to date: https://github.com/enjoy-digital/litex/pull/1923
I will probably need a few more weeks to get things fixed and cleaned up.
I will try it out. Can you check the analyser in the files I gave you to see if it's okay?
I will try it out.
I will let you know when it is ready ^^ don't bother until then.
Can you check the analyser in the files I gave you to see if it's okay?
Which file? I haven't seen any. Yes, I can.
The target/platform files. I added it to them before you asked for them, so you could see it.
Ahhhh, I did try them, to check the memory region things, but I can't test anything on hardware.
@JoyBed Got the VexiiRiscv bug fixed; everything seems stable now. A bit of timing optimization remains, and then it should be good for use.
@JoyBed https://github.com/enjoy-digital/litex/pull/1923 is now good.
Still a few optimisations to do, but that will take time ^^
I'm using it to run Debian with, for instance:
python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=vexiiriscv --with-jtag-tap --bus-standard axi-lite --vexii-args=" \
--allow-bypass-from=0 --debug-privileged --with-mul --with-div --div-ipc --with-rva --with-supervisor --performance-counters 9 \
--regfile-async --xlen=64 --with-rvc --with-rvf --with-rvd --fma-reduced-accuracy \
--fetch-l1 --fetch-l1-ways=4 --fetch-l1-mem-data-width-min=64 \
--lsu-l1 --lsu-l1-ways=4 --lsu-l1-mem-data-width-min=64 --lsu-l1-store-buffer-ops=32 --lsu-l1-refill-count 2 --lsu-l1-writeback-count 2 --lsu-l1-store-buffer-slots=2 --with-lsu-bypass \
--with-btb --with-ras --with-gshare --relaxed-branch" --cpu-count=4 --with-jtag-tap --with-video-framebuffer --with-sdcard --with-ethernet --with-coherent-dma --l2-byte=262144 --update-repo=no --sys-clk-freq 100000000
How many LUTs does a dual-core VexiiRiscv take?
For which ISA? RV64IMAFDC to run Debian? Or something softcore-friendly like RV32IMA to just run Linux?
For RV64IMAFDC, single-issue, everything enabled with memory coherency, single core, it is around 12K LUTs at nearly 100 MHz on an Artix 7 -1 (slow speed grade). The FPU takes a lot of space, around 5K LUTs per core. RVC is also a pain in the ass ^^
Wow, a Debian-capable core in only 12K LUTs? That's amazing! I can then comfortably fit even 4 of them in my FPGA!
Note: a recent change in LiteX broke things XD It works up to LiteX 86a43c9f.
Will just reverting this fix things?
I mean, you can check out https://github.com/enjoy-digital/litex/commit/86a43c9ff7141d625a92d75c1e9f5d99bb2d69ab and it will work, but anything later will not.
Otherwise, there are two commits to revert to get things to work:
I stumbled upon an interesting bug: Linux is unable to boot if the memory size is above 512 MB. In the LiteX BIOS the whole RAM passes the tests and works, but Linux fails to boot if the RAM size is specified above 512 MB. Where does this limitation come from?