Open HCIMaker opened 2 years ago
@HCIMaker Thanks for your kind words :)
I don't have a section on writing to the SDRAM, but I think it should be doable. You will need to follow a state machine approach as explained here.
I highly recommend going through the content on this page first and watching this video.
Hope that helps and good luck!
Thank you zangman! I will try it out!
Hello,
I am currently trying that, and it seems the FPGA manages to write to the SDRAM. I slightly extended the example SystemVerilog component for QSYS. Using memtool on the HPS, I can watch the memory change every second, but for some reason the address keeps incrementing (by the data width, 0x20 = 32B) even though it should remain at the start address 0x2000_0000.
The SystemVerilog file looks like this:
module sdram_if # (
    parameter ADDR_SIZE = 32,
    parameter DATA_SIZE = 256 )
(   clk, reset,
    avm_m0_read, avm_m0_write, avm_m0_writedata, avm_m0_address, avm_m0_readdata, avm_m0_readdatavalid, avm_m0_byteenable, avm_m0_waitrequest, avm_m0_burstcount,
    address, byteenable, read, data_out, write, data_in, busy );

    // clk and reset are always required.
    input logic clk;
    input logic reset;

    // Avalon Master ports
    output logic avm_m0_read;
    output logic avm_m0_write;
    output logic [DATA_SIZE-1:0] avm_m0_writedata;
    output logic [ADDR_SIZE-1:0] avm_m0_address;
    input logic [DATA_SIZE-1:0] avm_m0_readdata;
    input logic avm_m0_readdatavalid;
    output logic [(DATA_SIZE/8)-1:0] avm_m0_byteenable;
    input logic avm_m0_waitrequest;
    output logic [10:0] avm_m0_burstcount;

    // External conduit
    input logic [ADDR_SIZE-1:0] address;
    input logic [(DATA_SIZE/8)-1:0] byteenable;
    input logic read;
    output logic [DATA_SIZE-1:0] data_out;
    input logic write;
    input logic [DATA_SIZE-1:0] data_in;
    output logic busy;

    localparam INIT = 3'd0;
    localparam READ_START = 3'd1;
    localparam READ_END = 3'd2;
    localparam WRITE_START = 3'd3;
    localparam WRITE_END = 3'd4;

    logic [2:0] cur_state;
    logic [2:0] next_state;
    logic [ADDR_SIZE-1:0] addr;
    logic [DATA_SIZE-1:0] data;
    logic [(DATA_SIZE/8)-1:0] enable;

    // Handling change of the current state to the next requested state,
    // and latching the request (address, byte enables, data) from the conduit
    always_ff @(posedge clk) begin
        if (reset) begin
            cur_state <= INIT;
        end else begin
            cur_state <= next_state;
            if (read) begin
                addr <= address;
                enable <= byteenable;
            end else begin
                if (write) begin
                    addr <= address;
                    enable <= byteenable;
                    data <= data_in;
                end
            end
        end
    end

    // Handling FSM transitions
    // (blocking assignments only: this is combinational logic)
    always_comb begin
        next_state = cur_state;
        busy = '0;
        case(cur_state)
            INIT: begin
                if (read) begin
                    next_state = READ_START;
                end else begin
                    if (write) begin
                        next_state = WRITE_START;
                    end
                end
            end
            READ_START: begin
                busy = '1;
                if (avm_m0_waitrequest) next_state = READ_START; // Wait here.
                else next_state = READ_END;
            end
            READ_END: begin
                busy = '1;
                if (!avm_m0_readdatavalid) next_state = READ_END; // Wait here.
                else next_state = INIT;
            end
            WRITE_START: begin
                busy = '1;
                if (avm_m0_waitrequest) next_state = WRITE_START; // Wait here.
                else next_state = WRITE_END;
            end
            WRITE_END: begin
                busy = '1;
                next_state = INIT;
            end
            default: begin
                next_state = INIT;
            end
        endcase
    end

    // Handling read and write start of each transaction
    // Note: '1 is a fill literal, so avm_m0_burstcount = '1 sets all 11 bits
    // (2047); a single-beat transfer would be written as 11'd1.
    always_comb begin
        avm_m0_address = '0;
        avm_m0_read = '0;
        avm_m0_write = '0;
        avm_m0_byteenable = '0;
        avm_m0_burstcount = '0;
        avm_m0_writedata = '0;
        case(cur_state)
            READ_START: begin
                avm_m0_address = addr;
                avm_m0_read = '1;
                avm_m0_byteenable = enable;
                avm_m0_burstcount = '1;
            end
            WRITE_START: begin
                avm_m0_address = addr;
                avm_m0_write = '1;
                avm_m0_writedata = data;
                avm_m0_byteenable = enable;
                avm_m0_burstcount = '1;
            end
            default: begin
            end
        endcase
    end

    // Handling read and write end of each transaction
    always_ff @(posedge clk) begin
        if (reset) begin
            data_out <= '0;
        end else begin
            case (cur_state)
                READ_END: begin
                    if (avm_m0_readdatavalid) begin
                        data_out <= avm_m0_readdata;
                    end
                end
                default: begin
                end
            endcase
        end
    end
endmodule
And my VHDL entity for testing looks like this:
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY test_sdram IS
    PORT (
        clock : IN STD_LOGIC := '1';
        nrst : IN STD_LOGIC := '1';
        h2f_nrst : IN STD_LOGIC := '1';
        sdram_address : OUT STD_LOGIC_VECTOR(31 DOWNTO 0) := (OTHERS => '0');
        sdram_byteenable : OUT STD_LOGIC_VECTOR(31 DOWNTO 0) := (OTHERS => '0');
        sdram_read : OUT STD_LOGIC := '0';
        sdram_data_read : IN STD_LOGIC_VECTOR(255 DOWNTO 0);
        sdram_write : OUT STD_LOGIC := '0';
        sdram_data_write : OUT STD_LOGIC_VECTOR(255 DOWNTO 0) := (OTHERS => '0');
        sdram_busy : IN STD_LOGIC;
        led : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)
    );
END ENTITY;

ARCHITECTURE arch OF test_sdram IS
    SIGNAL i_led : STD_LOGIC_VECTOR(7 DOWNTO 0) := (OTHERS => '0');
BEGIN
    -- h2f_nrst is part of the asynchronous reset condition, so it is listed
    -- in the sensitivity list together with nrst.
    count : PROCESS (nrst, h2f_nrst, clock) IS
        VARIABLE counter : INTEGER := 0;
        VARIABLE ticks : INTEGER := 0;
        CONSTANT START_ADDR : STD_LOGIC_VECTOR(31 DOWNTO 0) := STD_LOGIC_VECTOR(to_unsigned(16#2000_0000#, 32));
        CONSTANT BYTE_ENABLE : STD_LOGIC_VECTOR(31 DOWNTO 0) := (OTHERS => '1');
    BEGIN
        IF nrst = '0' OR h2f_nrst = '0' THEN
            sdram_data_write <= (OTHERS => '0');
            sdram_address <= (OTHERS => '0');
            sdram_byteenable <= (OTHERS => '0');
            sdram_read <= '0';
            sdram_write <= '0';
            i_led <= (OTHERS => '0');
            counter := 0;
            ticks := 0;
        ELSE
            IF rising_edge(clock) THEN
                IF counter = 50_000_000 THEN
                    -- Roughly once per second: request one write of the tick count
                    counter := 0;
                    ticks := ticks + 1;
                    sdram_address <= START_ADDR;
                    sdram_data_write(31 DOWNTO 0) <= STD_LOGIC_VECTOR(to_unsigned(ticks, 32));
                    sdram_byteenable <= BYTE_ENABLE;
                    sdram_write <= '1';
                    i_led(1) <= NOT i_led(1);
                ELSE
                    counter := counter + 1;
                    sdram_read <= '0';
                    sdram_write <= '0';
                END IF;
            END IF;
        END IF;
    END PROCESS;

    led <= i_led;
END ARCHITECTURE;
In the attached image I highlighted the address at the moment I released the reset button: the counter restarts, but the address keeps incrementing. The address also rolls back to 0x2000_0000 after 1024 write transactions (every 65536B), and I am not sure why either.
Can you please help me figure this out? Thanks!
So I have already figured it out - without a custom QSYS component. I use the External Bus to Avalon Bridge with an Address Span Extender (otherwise the External Bus' maximum address range of 0x0000_0000 - 0x3fff_ffff (1GB) can't match the f2h_sdram0_data range of 0x0000_0000 - 0xffff_ffff). A nice thing is that the Address Span Extender also supports an address offset of 0x2000_0000, so the data can be addressed from the FPGA starting at 0.
If anyone is curious how it is configured in QSYS (I disabled irrelevant components):
As the input clock to the memory-related components I use the HPS output clock running at 400 MHz. The External Bus is limited to a 128-bit data width, but that is not a problem for my application. I also use a software-generated reset signal from an HPS PIO output (the bit is set using memtool in a systemd service after boot), because it seems the FPGA must not write to the SDRAM before Linux boots up.
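For readers following the address offset described above, here is a minimal C sketch of the translation between an FPGA-side address (starting at 0) and the physical SDRAM address. The names WINDOW_OFFSET, WINDOW_SPAN and fpga_to_phys are hypothetical, and the 512 MiB span is an assumption (a 1GB board with the upper half left to the fabric); the actual window size depends on how the Address Span Extender is configured.

```c
#include <stdint.h>

/* Offset taken from the post; the span is an assumption for illustration. */
#define WINDOW_OFFSET 0x20000000u /* physical base seen at FPGA address 0 */
#define WINDOW_SPAN   0x20000000u /* assumed 512 MiB window */

/* Translate an FPGA-side address into a physical SDRAM address.
 * Returns 0 on success, -1 if the address falls outside the window. */
static int fpga_to_phys(uint32_t fpga_addr, uint32_t *phys)
{
    if (fpga_addr >= WINDOW_SPAN)
        return -1;
    *phys = WINDOW_OFFSET + fpga_addr;
    return 0;
}
```

With these assumed values, FPGA address 0x20 corresponds to physical address 0x2000_0020, which is where memtool on the HPS would look for it.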
@vrbadev Cool. I understand very little of that code, but I aspire to being able to do similar things, one day.
What would the code for the HPS side look like, e.g. in C or C++? I take it you have to declare an array, or allocate some memory. Do you get to say where that is, in the address space, or do you allocate it then pass the base address to the FPGA?
The HPS doesn't have to allocate anything. In fact, the HPS must avoid conflicts when accessing the part of the SDRAM used by the fabric, so the parameter mem=512M must be set in extlinux.conf so that the OS doesn't use the upper part of the RAM (0x2000_0000+) for its processes etc. (See the SDRAM tutorial.)
So the code on the HPS side is mainly an mmap of the FPGA-accessed memory space. If you write to it from the HPS, the fabric must ensure there are no conflicts.
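As a rough illustration of the mmap approach on the HPS side, the sketch below maps a window of physical memory into the process. The helper names map_window and unmap_window and the FPGA_SDRAM_SPAN value are hypothetical; the base address is the 0x2000_0000 mentioned in this thread. It assumes the region is reachable through /dev/mem and that the program runs with sufficient privileges, neither of which is confirmed here.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Base from the thread (start of the RAM excluded by mem=512M);
 * the span is an arbitrary example value. */
#define FPGA_SDRAM_BASE 0x20000000u
#define FPGA_SDRAM_SPAN 0x00010000u /* 64 KiB, hypothetical */

/* Map `span` bytes starting at byte offset `base` of `path`.
 * Returns NULL on failure. */
static volatile uint32_t *map_window(const char *path, off_t base, size_t span)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, span, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}

/* Release a window obtained from map_window(). */
static void unmap_window(volatile uint32_t *win, size_t span)
{
    munmap((void *)win, span);
}
```

On the board this would be called as map_window("/dev/mem", FPGA_SDRAM_BASE, FPGA_SDRAM_SPAN), after which win[0] would hold the lowest 32 bits written by the fabric at the start address. The volatile qualifier only stops the compiler from caching reads; it does not by itself resolve the concurrent-access concerns discussed in this thread.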
By conflicts I mean concurrent read/write operations on the memory - unlikely but still possible. As far as I know, you can instantiate up to six f2sdram interfaces in Cyclone V, and when all of them plus the HPS try to read/write the memory at the same moment (on the same rising edge of the clock), I am not sure what happens next. It is probably undefined behaviour, as in the case of the BlockRAM. In that case I would suggest adding an access-control entity, maybe with a FIFO for memory access requests.
@vrbadev Thanks. Yes, I can see that would be a problem. I was picturing a circular buffer that's written by one side (FPGA or HPS) and read by the other, with also a pointer or index being written, to say where it's got up to. If it's not safe for one side to read a location while the other side is writing to it, then it gets more complicated - even to read the pointer that's set by the other side.
The sort of application I had in mind involves an ADC or two and one or more DACs. Since FPGAs can have cycle perfect timing, which ARM cores aren't so good at, it seems best to do that part on the FPGA, but have circular buffers to exchange data with the HPS, which might do some processing of it before passing it back.
I guess it doesn't matter much which side the RAM buffers belong to, provided both sides can have the access they need to them, but I was thinking of a continuous process, not, for example, the FPGA reading some data into a buffer, setting a flag, then waiting for the HPS to act on it.
Maybe there could be some dual ported RAM blocks used on the FPGA, handling data in real time, and a separate mechanism to transfer between there and the HPS RAM, in blocks, using hand shaking.
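The circular buffer idea described above could be laid out roughly as in the following C sketch, written from the HPS point of view. Everything here (ring_t, RING_SLOTS, ring_put, ring_get) is a hypothetical illustration, not something from the wiki. Each index has exactly one writer, so neither side ever writes a location the other side writes. The sketch deliberately ignores cache coherency and memory ordering between the FPGA and the HPS, which would have to be checked on real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u /* hypothetical size; must be a power of two */

/* Shared layout: the producer owns `head`, the consumer owns `tail`. */
typedef struct {
    volatile uint32_t head;             /* next slot to write (producer only) */
    volatile uint32_t tail;             /* next slot to read (consumer only)  */
    volatile uint32_t slot[RING_SLOTS];
} ring_t;

/* Producer side: store one sample, then publish it by advancing head. */
static bool ring_put(ring_t *r, uint32_t v)
{
    uint32_t head = r->head;
    if (head - r->tail == RING_SLOTS)
        return false;                   /* buffer full */
    r->slot[head % RING_SLOTS] = v;
    r->head = head + 1;                 /* publish only after the data is written */
    return true;
}

/* Consumer side: fetch one sample, then release the slot by advancing tail. */
static bool ring_get(ring_t *r, uint32_t *out)
{
    uint32_t tail = r->tail;
    if (r->head == tail)
        return false;                   /* buffer empty */
    *out = r->slot[tail % RING_SLOTS];
    r->tail = tail + 1;
    return true;
}
```

The indices run freely and wrap with 32-bit arithmetic, which is why the slot count has to be a power of two. In a real design one of the two functions would be replaced by logic in the fabric that follows the same rule: write the data first, then update the index it owns.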
@Andy2No Well, depending on your requirements, you may prefer STM32G4 MCUs over FPGAs - these MCUs have rich analog peripherals (multiple ADCs and DACs) which can be served using internal DMA, so the timing can be handled entirely by the built-in hardware. You would also avoid the trouble of compiling the OS and the bootloader, since your solution could be completely bare-metal. The price is also much lower, and Nucleo boards are a good starting point for development, with hundreds of examples available online. These chips are also much easier to design in and to solder on custom PCBs (most FPGA ICs come in BGA packages, so 4-layer PCBs are required).
Cyclone V is pretty complicated for beginners and may be overkill for your application. I need it mainly for real-time processing of a video stream from a CMOS chip, which would be impossible without the FPGA part.
Of course there is a dual-port BRAM inside the fabric available so you are right about the idea of the memory transfer mechanism.
@vrbadev My main aim was to learn more about programming FPGAs, and make something worthwhile but, yes, if I learned more about DMA on STM32s, I could just do it all on one of those. Fair point. Perhaps I should try to think of a project that's less suitable for doing on a microcontroller. Really, the point was just to gain some experience with FPGAs though.
Doing it all on the FPGA plus the 512MB of SDRAM that it can see, with no involvement of the Arm CPU, would also be acceptable, and might be a better place to start.
I am currently attempting a similar task. When I press the KEY button on the DE10-nano board, a write command will be sent to the SDRAM controller. The entire Avalon Master structure is very simple, as shown in the diagram below:
The strange thing is that when I send a write command, the SDRAM controller immediately raises the waitrequest signal to a high level. According to the Avalon protocol, the Master must keep the data unchanged, so the Master will endlessly wait since the waitrequest signal from the SDRAM controller will never return to a low level. When I ignore the waitrequest signal and change the write address and data according to the Master's clock cycle, I found that the SDRAM controller is actually writing data normally. The write cycle is 6 Master clock cycles. This has left me very puzzled. Has anyone encountered the same situation? Any insights would be greatly appreciated!
Hi zangman:
This is an awesome project and wiki! I appreciate your detailed instructions. I wonder whether there is a reverse version of the SDRAM example, where the FPGA writes data directly to the SDRAM on the HPS side? Thank you very much!