Closed rsnikhil closed 4 years ago
@swm11 suggested this should be tackled next, now that the Step 1 AWSteria XSIM flow is working. I am now turning my attention to this.
As a first step, I'm just trying to build and run the standard HDK Hello World example. 12 hours of battling broken Python scripts, Vivado licensing issues, etc. Finally succeeded in creating a DCP (Design Checkpoint), uploading it to a bucket, and submitting it for an AFI build. Note: the DCP build took 1h33m. If it works (tomorrow), I will then try the AWSteria build and run.
Do you have any learnings for others who might want to attempt that battle themselves that come from your point of view, @rsnikhil?
Yes, a bit more than that: I am creating a Makefile that has the steps I took, so I can repeat the process conveniently. I'm writing extensive comments on what each step does, the issues I faced, and how I got around them.
I'm happy to share this.
One of the frustrations is that I was running in an official 'FPGA Developer AMI-1.8.1' provided by Amazon, that ought to work out-of-the-box, but didn't.
Today, successful run of the Hello World example on FPGA on an F1 instance. Now working on taking the AWSteria first-demo (which loads and runs ISA tests) through the flow.
You will likely need to merge https://github.com/bluespec/Flute/pull/24 if you want to have any hope of passing timing, given you're also using that deburster in a synthesised design (within Flute it's only used in the simulated testbench). The Connectal-based design already has that fix in its local copy.
progress update from @rsnikhil:
I've made some progress on the AWSteria bringup on FPGA (running it at 87 MHz):
Background: this first test setup is supposed to do the following (this whole sequence works fine in XSIM simulation):
1 DMA 16 GB to DDR A
2 DMA 16 GB to DDR B
3 DMA 16 GB to DDR C
4 DMA 16 GB to DDR D
5 DMA 16 GB back from DDR A and check it
6 DMA 16 GB back from DDR B and check
7 DMA 16 GB back from DDR C and check it
8 DMA 16 GB back from DDR D and check it
9 DMA a Mem.Hex file into DDR A, representing the code for ISA test rv64ui-p-add
10 Read back 128 bytes of that just to cross-check
11 Write through the OCL port to set verbosity
12 Write through the OCL port to set watch_tohost and tohost_addr
13 Write through the OCL port to "allow" Flute to access DDR (although
it came up running, it's stuck so far, waiting for the response to
its first fetch from the DDR).
14 Poll the HW for a response status word indicating the final value
written to the 'tohost' location
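The polling in step 14 can be sketched as a loop with a deadline. This is only a minimal illustration, not the actual AWSteria host code: `read_status` is a hypothetical callable standing in for one OCL read of the HW status queue, assumed to return None while no status word is available.

```python
import time


def poll_for_status(read_status, timeout_s=10.0, interval_s=0.01,
                    clock=time.monotonic, sleep=time.sleep):
    """Call read_status() until it yields a value or timeout_s elapses.

    read_status is a hypothetical stand-in for one OCL read of the
    hardware status queue; it returns None when nothing is available.
    Returns the status word, or None on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        status = read_status()
        if status is not None:
            return status
        if clock() >= deadline:
            # Give up: the hardware never produced a status word.
            return None
        sleep(interval_s)
```

Returning None on timeout (rather than raising) lets the caller decide whether to dump debug state before exiting.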
As mentioned before, Step 8 was failing (readback DMA from DDR D).
I changed the program to omit steps 4 and 8 (DMA to and from DDR D).
I get past 9 and 10 (the read-back in 10 indicates it actually DMA'd the data correctly into DDR A).
Up to 10, we're using the DMA_PCIS channel. From 11 onwards we're using the OCL channel.
I get past 11-13; then in step 14, I eventually time out waiting for Flute's status response.
Note: although we get past 11-13, these are pure writes, so there's no feedback that the hardware reacted correctly. So, somewhere in the OCL interaction, we're going 'silent'.
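One way to get feedback from an otherwise write-only step is to read the location back after writing it. The sketch below assumes the target register is readable at the same address, which this thread does not confirm for the AWSteria design; `ocl_write` and `ocl_read` are hypothetical callables, not functions from the AWS SDK.

```python
def write_and_verify(ocl_write, ocl_read, addr, value):
    """Write value to addr, then read it back and compare.

    ocl_write(addr, value) and ocl_read(addr) are hypothetical helpers
    for single OCL accesses. Returns (ok, observed) so the caller can
    log what the hardware actually reported.
    """
    ocl_write(addr, value)
    observed = ocl_read(addr)
    return observed == value, observed
```

If the register is write-only in hardware, this check cannot be used, and a separate status register would be needed instead.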
I'm pondering my next move on this.
I'm also trying to run AWSteria on an AWS instance. I'm having trouble installing the xdma drivers. I get:
[centos@ip-172-31-59-216 xdma]$ sudo make install
Makefile:25: XVC_FLAGS: .
make -C /lib/modules/3.10.0-1127.8.2.el7.x86_64/build M=/home/centos/git-repos/aws-fpga/sdk/linux_kernel_drivers/xdma modules
make[1]: Entering directory `/usr/src/kernels/3.10.0-1127.8.2.el7.x86_64'
/home/centos/git-repos/aws-fpga/sdk/linux_kernel_drivers/xdma/Makefile:25: XVC_FLAGS: .
Building modules, stage 2.
/home/centos/git-repos/aws-fpga/sdk/linux_kernel_drivers/xdma/Makefile:25: XVC_FLAGS: .
MODPOST 1 modules
make[1]: Leaving directory `/usr/src/kernels/3.10.0-1127.8.2.el7.x86_64'
make -C /lib/modules/3.10.0-1127.8.2.el7.x86_64/build M=/home/centos/git-repos/aws-fpga/sdk/linux_kernel_drivers/xdma modules_install
make[1]: Entering directory `/usr/src/kernels/3.10.0-1127.8.2.el7.x86_64'
INSTALL /home/centos/git-repos/aws-fpga/sdk/linux_kernel_drivers/xdma/xdma.ko
Can't read private key
DEPMOD 3.10.0-1127.8.2.el7.x86_64
make[1]: Leaving directory `/usr/src/kernels/3.10.0-1127.8.2.el7.x86_64'
depmod -a
install -m 644 10-xdma.rules /etc/udev/rules.d
rmmod -s xdma || true
modprobe xdma
modprobe: ERROR: could not insert 'xdma': Exec format error
make: [install] Error 1 (ignored)
[centos@ip-172-31-59-216 xdma]$
Does anyone know what private key it's talking about?
Googling suggests that this might have to do with signed kernels/drivers and the fact that I'm not a kernel developer (who would know the keys). I note that @rsnikhil doesn't seem to have encountered this problem. One difference is that he's running on us-west2 whereas I'm on us-east1.
Look at the output from dmesg.
Most likely the driver was not compiled to match the kernel. If the kernel version was updated since you compiled it you could see this error message.
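A quick way to test the kernel-mismatch theory is to compare the module's vermagic string with the running kernel release. The sketch below is a hedged illustration: it relies on `uname -r` and `modinfo -F vermagic`, and assumes the first field of vermagic is the kernel release the module was built for. A match does not rule out other causes of "Exec format error".

```python
import subprocess


def vermagic_matches(vermagic: str, running_release: str) -> bool:
    """True if the first field of vermagic equals the running release."""
    fields = vermagic.split()
    return bool(fields) and fields[0] == running_release


def module_built_for_running_kernel(ko_path: str) -> bool:
    """Compare a .ko file's vermagic against `uname -r`."""
    running = subprocess.run(
        ["uname", "-r"], capture_output=True, text=True, check=True
    ).stdout.strip()
    vermagic = subprocess.run(
        ["modinfo", "-F", "vermagic", ko_path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return vermagic_matches(vermagic, running)
```

For example, `module_built_for_running_kernel("xdma.ko")` run from the driver build directory would report whether a rebuild is needed after a kernel update.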
@jameyhicks I've just done make clean; make; sudo make install and get the same failure.
Panic over. dmesg suggested that xocl had come back (perhaps due to the reboot after updating the kernel). After removing it again with rmmod, the "Can't read private key" message still appears, but now xdma does get installed.
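Since xocl can reappear after a reboot, it may be worth checking for it before loading xdma. The sketch below parses `/proc/modules`, where the first field on each line is the module name; treating xocl as the conflicting module is taken from this thread, not from any official documentation.

```python
def loaded_modules(proc_modules_text: str) -> set:
    """Return the set of module names listed in /proc/modules text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def conflicting_modules(proc_modules_text: str, conflicts=("xocl",)) -> list:
    """List modules from `conflicts` that are currently loaded."""
    loaded = loaded_modules(proc_modules_text)
    return sorted(name for name in conflicts if name in loaded)


if __name__ == "__main__":
    with open("/proc/modules") as f:
        found = conflicting_modules(f.read())
    if found:
        print("Loaded modules that may block xdma:", ", ".join(found))
```

Any module it reports can then be removed with rmmod before running modprobe xdma.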
How did today's experiments go, @joestoy and @rsnikhil?
I don't have any progress/results beyond what I discussed in today's call. I am waiting to hear @joestoy 's results. For today I turned my attention towards getting the rest of the system up in simulation, using Bluesim/Verilator sim.
We are now successfully running, on FPGA, the AWSteria test described in a previous message in this issues thread. Briefly:
Using the DMA PCIS interface (AXI4, 512b-wide data):
Using the OCL interface (AXI4-Lite, 32b-wide data):
Send some control messages to the Flute SoC (does OCL reads and writes, to check HW FIFO status and deliver data)
Send final control message to the SoC releasing access to DDR4 A (so far Flute is stuck trying to fetch an instruction)
Poll a HW status queue for ISA test completion (more OCL reads and writes)
When completion status is received, report success/fail and exit
The technical problem seems to have been the '1-character typo' in the name of a clock signal for DDR4 D in the top-level SystemVerilog shim. @joestoy mentioned this discovery in last Friday's stand-up. This was causing DDR4 D never to assert its 'ready' signal which, in turn, was causing everything to get stuck. With this fix, everything (DMA PCIS and OCL) has started working.
AWSteria GFE Steps 1 and 2 were to get things running in standard AWS XSIM flow and in a verilator/Bluesim flow. This step is to take it through the flow for FPGA build/deploy/run on actual AWS cloud hardware (load and run all ISA tests).