lawrie commented 3 years ago

I am trying to run your ice40-playground memtest project to test Hyperram on a Blackice MX board.

I successfully ran it on an iCEBreaker board. My Pmod is the single Hyperram chip version.

I changed the pcf files, and had to change the PLL to PLL_CORE.

I then get an error in the nextpnr placement script:

RuntimeError: Unable to place SerDesBlock(0/2 OSERDES NegEdge Delay)
ERROR: Error occurred while executing Python script /home/lawrie/ice40-playground/cores/no2ice40//sw/serdes-nextpnr-place.py

Presumably that is expected and the placement script needs changing for that board and the pins that it uses.

Can you give me any hints on what changes I need to make to the script?

The creator of the Blackice Mx board, @folknology, thought that you may have a script for an hx8k, as we believe you have Hyperram running on the Glasgow board.

My issue on your dfu bootloader was so successful, that I thought I would try another one :)

smunaut commented 3 years ago

What's the pinout you're trying to use ? Currently the placement script only works with the up5k mostly because it only supports top/bottom IO banks.

lawrie commented 3 years ago

These are the pins I am currently using:

# Quad HyperRAM
set_io --warn-no-port hram_dq[0] 16
set_io --warn-no-port hram_dq[1] 15
set_io --warn-no-port hram_dq[2] 10
set_io --warn-no-port hram_dq[3] 9
set_io --warn-no-port hram_dq[4] 11
set_io --warn-no-port hram_dq[5] 12
set_io --warn-no-port hram_dq[6] 17
set_io --warn-no-port hram_dq[7] 18

set_io --warn-no-port hram_rwds 31

set_io --warn-no-port hram_ck 28

set_io --warn-no-port hram_rst_n 32

set_io --warn-no-port hram_cs_n[0] 33
set_io --warn-no-port hram_cs_n[1] 37
set_io --warn-no-port hram_cs_n[2] 34
set_io --warn-no-port hram_cs_n[3] 38

There are three double Pmod slots on the Blackice Mx. @folknology says that none of them use just top banks pins, but I should try synthesis using just top banks pins.

lawrie commented 3 years ago

When I use mainly bottom pins, a lot of groups get placed but I get an error on group 9:

Group 9 for IO BEL(x=22, y=0, z=1)
    SerDesBlock(0/2 OSERDES NegEdge Delay)  : 1 LCs placed @ BEL(x=22, y=1, z=0)
    SerDesBlock(0/1 OSERDES Shift)          : 4 LCs placed @ BEL(x=22, y=1, z=1)
    SerDesBlock(0/0 OSERDES Capture)        : 4 LCs placed @ BEL(x=22, y=2, z=0)

ERROR: No Bel named 'X8/Y3/lc5' located for this chip (processing BEL attribute on 'hram_phy_I.genblk1[4].iserdes_dq_I.genblk2.genblk2[5].dff_scap_I.genblk1.dff_I_DFFLC')
1 warning, 1 error

smunaut commented 3 years ago

In the Placer class initializer, you'll need a different grid setup for the HX8k :

        for y in (range(28,33) if self.top else range(1,6)):
            for x in range(1,33):
                # Invalid, used by SPRAM
                if x in [8,25]:
                    continue
                self.m_fwd[BEL(x,y,0)] = PlacerSite(BEL(x,y, 0))

(because the fabric size is different and the columns that are 'used' by the BRAMs are different)

smunaut commented 3 years ago

I'll try and rewrite a placer that "just works" for all banks and supports both UP5k and HX8k. I needed to do it for the Glagow anyway (initial tests I did it "by hand"). It wasn't a priority until now since nobody else than me was trying to use it, but if there is any interest, it shouldn't be too hard.

lawrie commented 3 years ago

I have the memtest running on the board but with the placement script disabled and the BEL attributes removed.

[+] Training CS=3
0b111101 ff6dbfbe
0b111111 bffb6b31
0b111101 f56dafbe
[.]  delay= 0 -> Failed
0b110010 600dbabf
0b111010 b16900b6
0b111010 600fbabd
[.]  delay= 5 -> Failed
0b110000 0000ffff
0b111010 600dbabe
0b111010 b16b00b5
[.]  delay=10 -> cap_latency=4, phase=0
0b110000 0000ffff
0b111010 600dbabe
0b111010 b16b00b5
[.]  delay=15 -> cap_latency=4, phase=0
[+] Compiling training results
[.]  delay= 0 -> Invalid
[.]  delay= 5 -> Invalid
[.]  delay=10 -> cap_latency=4, phase=0
[.]  delay=15 -> cap_latency=4, phase=0
[+] Core configured for cmd_latency=2, capture_latency=4, phase=0, delay=15
[+] Testing CS=3
[.]  All good !

I had to make changes to the pll and uart for the Blackice board as its built-in uart goes via an STM32 co-processor and only supports 115200 baud. That would have needed a uart divider of more than 8 bits, so I reduced the 4x clock to 118MHz.

It runs very slowly. I had to reduce the amount of memory it tested to get it to finish in a reasonable time. I don't know if the slow uart is slowing it down.

lawrie commented 3 years ago

@smunaut Another question on the Hyperram, which you may be able to answer is whether the Hyperram could be used to simulating SRAM for retro computers. This is currently done of boards such as the Ulx3s and Blackice Mx using SDRAM.

As an example, the NES implementation uses an 85Mhz clock for SDRAM and needs to read or write an 8-bit value in 8 clock cycles. Would it be possible to achieve that fixed latency with Hyperram?

smunaut commented 3 years ago

No, the hyperram has higher latency than that. The memory itself would need 10 cycles if I counted correctly, and then you'd at the very least have 2 cycles for the IO registers. And here on the ice40, the manual serdes I implemented to run the hyperram faster have even more IO latency.

smunaut commented 3 years ago

I just pushed a hk8k branch on this submodule could you try it out and tell me if what hack you still need to make it go through ?

lawrie commented 3 years ago

I now get this error:

ERROR: No Bel named 'X12/Y0/gb' located for this chip (processing BEL attribute on 'sysmgr_I.crg_I.gbuf_1x_I')

smunaut commented 3 years ago

Huh ... I commented those out ... are you sure you're using the whole branch and not just the placement script ? Also make clean before hand.

lawrie commented 3 years ago

Ah, no I used just the placement script.

I am not sure of the best way to get your new branch with the submodule.

What git commands would you suggest?

It did work for me when I commented out the BELs on the global buffers.

lawrie commented 3 years ago

I deleted no2ice40 from cores and cloned the branch there.

It is now working.

lawrie commented 3 years ago

I am trying different speeds to run the Hyperram on the Blackice board.

Your memtest project was set to run it at 147MHz (clk_4x), which is what I ran it on the iCEBreaker.

I have tried 200MHz, which gave some read errors, but 160Mhz was OK.

I am a bit confused by how the hbus_ck pin is set up. Is the Hyperram being run at the clk_4x speed, or half that?

smunaut commented 3 years ago

It's run at half that IIRC. That's because the clock needs to be at 90 deg out of phase and the data is DDR, and due to IO limitations, everything needs to run of the same clock in the ice40 so that's the best option I found to satify all those weird constraints.

With a carefully chosen pinout and on the HX8k that has 2 PLLs, you could possibly do better with a different architecture, but I wanted this to run on the iCEbreaker.

Note that when you rise the speed you need to change the hyperram mode register too because the programmed read latency doesn't work for higher speeds.

smunaut commented 3 years ago

(and btw, if you want to join the 1bit squared discord, it might be easier for "live debug" to make it run faster)

lawrie commented 3 years ago

I am on the 1bitsquared Discord, but it is a long time since I have posted there, as I don't do many iCEBreaker projects.

What channel do you use for this sort of discussion, including live debug?

smunaut commented 3 years ago

fpga

There is plenty of discussions that have nothing to do with the icebreaker on the server :)

lawrie commented 3 years ago

I just posted on the #fpga stream, but not about this :)

I was thinking of becoming active on the nmigen stream, as I have been doing a few nMigen projects.

I have now asked you a question about the hyperram mode register on the 1bitsquared #fpga Discord channel.

lawrie commented 3 years ago

One thing that I had to do to make the Blackice MX uart work, is to add a DIV_WIDTH parameter to uart2wb in no2misc. Other modules already have DIV_WIDTH.

I forked your repository and made the change - https://github.com/lawrie/no2misc/commit/038a00075233eada279dbf5b6bd72d55a0d82f4d

Do you want me to send you a pull request?

I wasn't sure what to call the auto parameter DIV_WIDTH - 1 to fit in with your naming convention and formatting as DL was already used.

lawrie commented 3 years ago

Another change I have made for BLackice is to put the PLL parameter defines in memtest/rtl/boards.vh.

If you change the PLL speed, you will need to change uart divider, so it seems sensible to change memtest/rtl/top.v to use a UART_DIV define from boards.vh. As I had to change the DIV_WIDTH to 9, that might also be best as a define.

no2fpga / no2ice40

Using Hyperram Pmod on a Blackice MX hx4k board #1

fpga