gatecat / nextpnr-xilinx

Experimental flows using nextpnr for Xilinx devices
ISC License

Sync upstream #56

Closed mmicko closed 1 year ago

mmicko commented 1 year ago

Hi @gatecat, I have adapted nextpnr-xilinx to the mainline nextpnr API. A couple of things are missing, such as parts of https://github.com/gatecat/nextpnr-xilinx/commit/1b587cb521e7c0bd775f371a9f261ea200e06462 that are not present in the mainline nextpnr repository. Testing will be needed of course, but for now I am looking for feedback on whether anything is obviously wrong in the last commit (the first commit removes the old nextpnr code and adds the new mainline code). Also, @hansfbaier, please check whether this makes things easier for you as well.

To be applied upstream: https://github.com/gatecat/nextpnr-xilinx/commit/c21f5e2b7d211e980bc75bd66722d6d141720969

hansfbaier commented 1 year ago

@gatecat Ahhhh right. That is still from the previous version. That fixed it, now it runs. I totally forgot that bbaexport is part of nextpnr, not prjxray. Thanks!

mmicko commented 1 year ago

Had to fix merge conflicts and force-push the changes; DSP support is now in as well.

hansfbaier commented 1 year ago

@mmicko That is awesome! I already tested the previous version, which works well with router1:

       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2023 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Mar 30 2023 08:52:48
 BIOS CRC passed (c4943d64)

 LiteX git sha1: c3e93620

--=============== SoC ==================--
CPU:        VexRiscv @ 100MHz
BUS:        WISHBONE 32-bit @ 4GiB
CSR:        32-bit data
ROM:        128.0KiB
SRAM:       8.0KiB
L2:         8.0KiB
SDRAM:      256.0MiB 16-bit @ 800MT/s (CL-6 CWL-5)
MAIN-RAM:   256.0MiB

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Write leveling:
  tCK equivalent taps: 32
  Cmd/Clk scan (0-16)
  |1111111111100000| best: 0
  Setting Cmd/Clk delay to 0 taps.
  Data scan:
  m0: |000000000011111111111111| delay: 10
  m1: |000000000011111111111111| delay: 10
Write latency calibration:
m0:0 m1:0 
Write DQ-DQS training:
m0: |000111111111111111100000000000000| delays: 10+-07
m1: |001111111111111110000000000000000| delays: 08+-06
Read leveling:
  m0, b00: |00000000000000000000000000000000| delays: -
  m0, b01: |00001111111111111000000000000000| delays: 10+-06
  m0, b02: |00000000000000000000111111111111| delays: 25+-05
  m0, b03: |00000000000000000000000000000000| delays: -
  m0, b04: |00000000000000000000000000000000| delays: -
  m0, b05: |00000000000000000000000000000000| delays: -
  m0, b06: |00000000000000000000000000000000| delays: -
  m0, b07: |00000000000000000000000000000000| delays: -
  best: m0, b01 delays: 10+-06
  m1, b00: |00000000000000000000000000000000| delays: -
  m1, b01: |00011111111111110000000000000000| delays: 09+-06
  m1, b02: |00000000000000000000111111111111| delays: 25+-05
  m1, b03: |00000000000000000000000000000000| delays: -
  m1, b04: |00000000000000000000000000000000| delays: -
  m1, b05: |00000000000000000000000000000000| delays: -
  m1, b06: |00000000000000000000000000000000| delays: -
  m1, b07: |00000000000000000000000000000000| delays: -
  best: m1, b01 delays: 09+-06
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
  Write: 0x40000000-0x40200000 2.0MiB     
   Read: 0x40000000-0x40200000 2.0MiB     
Memtest OK
Memspeed at 0x40000000 (Sequential, 2.0MiB)...
  Write speed: 35.1MiB/s
   Read speed: 46.8MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
             Timeout
No boot medium found

--============= Console ================--

I am now testing the latest version... stay tuned....

hansfbaier commented 1 year ago

@mmicko Hmmm, donut demo is failing. Need to figure out why...

hansfbaier commented 1 year ago

@mmicko OK the implementation based on the old branch also has the issue. I thought I fixed it there. Maybe I somehow broke it again. Nevertheless, that would not hinder the merge of this branch, which now performs identically in terms of DSP.

hansfbaier commented 1 year ago

@mmicko Oh! I forgot to push that fix to the DSP PR! Will port it now.

hansfbaier commented 1 year ago

@mmicko This patch should do the trick:

0001-DSP48E1-fix-the-fix-no-AREG-BREG-bit-is-actually-val.patch.gz

hansfbaier commented 1 year ago

@mmicko Yes it does:

       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
   Build your hardware, easily!

 (c) Copyright 2012-2023 Enjoy-Digital
 (c) Copyright 2007-2015 M-Labs

 BIOS built on Mar 30 2023 08:52:48
 BIOS CRC passed (c4943d64)

 LiteX git sha1: c3e93620

--=============== SoC ==================--
CPU:        VexRiscv @ 100MHz
BUS:        WISHBONE 32-bit @ 4GiB
CSR:        32-bit data
ROM:        128.0KiB
SRAM:       8.0KiB
L2:         8.0KiB
SDRAM:      256.0MiB 16-bit @ 800MT/s (CL-6 CWL-5)
MAIN-RAM:   256.0MiB

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Write leveling:
  tCK equivalent taps: 32
  Cmd/Clk scan (0-16)
  |1111111111100000| best: 0
  Setting Cmd/Clk delay to 0 taps.
  Data scan:
  m0: |000000000011111111111111| delay: 10
  m1: |000000000011111111111111| delay: 10
Write latency calibration:
m0:0 m1:0 
Write DQ-DQS training:
m0: |000111111111111111100000000000000| delays: 10+-07
m1: |001111111111111110000000000000000| delays: 08+-06
Read leveling:
  m0, b00: |00000000000000000000000000000000| delays: -
  m0, b01: |00011111111111111000000000000000| delays: 09+-06
  m0, b02: |00000000000000000000111111111111| delays: 25+-05
  m0, b03: |00000000000000000000000000000000| delays: -
  m0, b04: |00000000000000000000000000000000| delays: -
  m0, b05: |00000000000000000000000000000000| delays: -
  m0, b06: |00000000000000000000000000000000| delays: -
  m0, b07: |00000000000000000000000000000000| delays: -
  best: m0, b01 delays: 09+-06
  m1, b00: |00000000000000000000000000000000| delays: -
  m1, b01: |00011111111111110000000000000000| delays: 09+-06
  m1, b02: |00000000000000000001111111111111| delays: 25+-06
  m1, b03: |00000000000000000000000000000000| delays: -
  m1, b04: |00000000000000000000000000000000| delays: -
  m1, b05: |00000000000000000000000000000000| delays: -
  m1, b06: |00000000000000000000000000000000| delays: -
  m1, b07: |00000000000000000000000000000000| delays: -
  best: m1, b02 delays: 25+-05
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
  Write: 0x40000000-0x40200000 2.0MiB     
   Read: 0x40000000-0x40200000 2.0MiB     
Memtest OK
Memspeed at 0x40000000 (Sequential, 2.0MiB)...
  Write speed: 35.1MiB/s
   Read speed: 46.8MiB/s

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
[LITEX-TERM] Received firmware download request from the device.
[LITEX-TERM] Uploading demo.bin to 0x40000000 (7640 bytes)...
[LITEX-TERM] Upload calibration... (inter-frame: 10.00us, length: 64)
[LITEX-TERM] Upload complete (9.9KB/s).
[LITEX-TERM] Booting the device.
[LITEX-TERM] Done.
Executing booted program at 0x40000000

--============= Liftoff! ===============--

LiteX minimal demo app built Apr  3 2023 08:25:02

Available commands:
help               - Show this command
reboot             - Reboot CPU
led                - Led demo
donut              - Spinning Donut demo
mult               - multiplication demo
helloc             - Hello C
litex-demo-app> donut
Donut demo...

                                    $$$$$@@@@@@                                
                                $$###########$$$$$$$                           
                             ####**!!!!!!!!!***###$$$$                         
                           *#**!!!==;;==;=====!!**###$$#                       
                          ***!!==;;;::~:::::=;=!!***#####*                     
                         **!!!=;;::~--,.,,-~:;;;!!!***####*                    
                        !*!!!=;::~-........,-~:;==!!***##**!                   
                       =!*!!!=;:~-,..........-~;;==!!******!                   
                       =!!!!!=;;:-...      ..,~:;=!!!!*****!                   
                       =!*****!!:!=~        -:;==!!!!***!!!=;                  
                       ;!!**######**=      :;==!!!!!!*!!!!!;                   
                       :=!**##$$$$$$$##!!!!!!!***!*!!!!!!==;                   
                        ;!!*##$$$$@@@@$$$###*******!!!!!==;~                   
                        ,;!!*###$$$@@@$$$####*****!!!!!=;:~                    
                         .;=!***###$$$$######***!!!!!==;:~                     
                           ~;=!!***####*******!!!!===;:~.                      
                            .~:==!!!!!****!!!!====;:~~,                        
                               ,~::;=======;=;;;:~~,                           
                                  .,,-~~~~~~---,.     

hansfbaier commented 1 year ago

@mmicko @gatecat I think the easiest way would be for you to apply that patch directly to your branch. Or should I open a new PR for xilinx-upstream?

hansfbaier commented 1 year ago

Awesome! Thanks to all!

gatecat commented 1 year ago

Thanks for your help testing too!

mole99 commented 1 year ago

Congratulations 🎉️ Is it planned to merge this into upstream nextpnr? That would be awesome :D

gatecat commented 1 year ago

It's still not really at a quality point overall where I'd want that, but with further database and timing model improvements that may happen.

mole99 commented 1 year ago

Exciting!

hansfbaier commented 1 year ago

@gatecat I am currently working on GTP transceiver support, and I noticed that the VCC pin in each INT_* switchbox actually refers to the switchbox at the X0 coordinate of its row. Is that intentional, a merge casualty, or a bug we have been unaware of? [image]

Also, that pin does not seem to be recognized as a VCC pin, because the search keeps going uphill until it encounters this one: [image]

(The code referred to is Arch::routeVcc())
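For readers following along, here is a minimal sketch of what "searching uphill" means in this context: a breadth-first search that walks backwards from a sink wire through its possible drivers until it reaches a wire marked as a VCC source. This is a toy model only, not the actual Arch::routeVcc() code. The wire names, the `uphill` map, and the `find_constant_source` helper are all hypothetical, and modelling the per-row merge as an extra uphill hop to column X0 is an assumption made for illustration.

```python
from collections import deque


def find_constant_source(sink, uphill, is_vcc_source):
    """Breadth-first search backwards (uphill) from `sink`.

    `uphill` maps a wire name to the wires that can drive it.
    Returns the wires from the first VCC source found down to `sink`,
    or None if no source is reachable.
    """
    parent = {sink: None}  # visited wire -> the wire we reached it from
    queue = deque([sink])
    while queue:
        wire = queue.popleft()
        if is_vcc_source(wire):
            # Follow the parent links back towards the sink.
            path = []
            while wire is not None:
                path.append(wire)
                wire = parent[wire]
            return path  # source first, sink last
        for driver in uphill.get(wire, ()):
            if driver not in parent:
                parent[driver] = wire
                queue.append(driver)
    return None


# Hypothetical toy routing graph (names are illustrative, not real wires).
uphill = {
    "X5/GTP_PIN": ["X5/INT_IMUX"],
    "X5/INT_IMUX": ["X5/INT_VCC"],
    "X5/INT_VCC": ["X0/INT_VCC"],  # assumed per-row merge into column X0
}
vcc_sources = {"X0/INT_VCC"}

path = find_constant_source("X5/GTP_PIN", uphill, vcc_sources.__contains__)
print(" -> ".join(path))
```

In this toy graph the local `X5/INT_VCC` wire is not itself flagged as a source, so the search continues one more hop uphill and terminates at the X0 wire of the row, which mirrors the behaviour described above.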

gatecat commented 1 year ago

In general, yeah, constants are merged in a row, iirc. But the backtrace doesn't really look wrong - can you describe the actual problem you're hitting here?

hansfbaier commented 1 year ago

OK thanks, that's fine then. That's all I wanted to know. Many thanks!

hansfbaier commented 10 months ago

@mmicko @gatecat I am sorry to report that in the openXC7 distribution I had to rebase to the old code base from before the upstream code sync. Unfortunately the sync introduced some very strange bugs, and while router1 worked fine in the test above, its runtime was unacceptable for some designs (i.e. much slower than Vivado). I have no time or resources to look into the router2 issue, because my first priority is to get a number of primitives working (MMCM, GTP, GTX). I will still contribute all my changes to both code bases, though. Fortunately the code bases are similar enough that this is not very tedious.