tknopp / RedPitayaDAQServer

Advanced DAQ Tools for the RedPitaya (STEMlab 125-14)
https://tknopp.github.io/RedPitayaDAQServer/dev/

Changing the FPGA bitstream --> Cannot connect to the SCPI server anymore #49

Closed TacHawkes closed 1 year ago

TacHawkes commented 1 year ago

Hi,

I'm currently trying to use and adapt this project for a RedPitaya-based data acquisition. I have followed this guide and this one in order to set up a development toolchain. This worked out fine: I can call your make scripts and use the Vivado project. However, if I follow the second guide, build the bitstream using

make daq_bitfiles

and then copy the .bit file (I think it is a mistake in the guide that it says ".bin") to the RedPitaya using

scp ./build/fpga/xc7z010clg400-1/firmware/RedPitayaDAQServer.runs/impl_1/system_wrapper.bit root@192.168.1.100:/root/apps/RedPitayaDAQServer/bitfiles/daq_xc7z010clg400-1.bit

and reboot, I can no longer connect to the SCPI server (Connection refused) using the Julia package. The FPGA boots just fine and the Alpine Linux is accessible using SSH. However, I do not know if there is an issue with the bitstream. It works fine with the bitstream shipped in the release. Am I missing something in the build flow?

jonschumacher commented 1 year ago

Hi,

could you please SSH to the RP and cd to apps/RedPitayaDAQServer? Then run killall daq_server_scpi and ./build/server/daq_server_scpi. Does the server fail right away? What happens if you connect to it?
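
The manual restart described above can be wrapped in a small helper. This is only a sketch: the function name `restart_cmd` is hypothetical, the absolute path is taken from the scp destination used earlier in this thread, and the host address in the usage comment is an assumption based on the same command.

```shell
# Hypothetical helper: print the command line that stops any running
# server and starts a fresh one in the foreground, so its output is visible.
restart_cmd() {
  # ';' after killall lets the start proceed even if no server was running.
  printf '%s\n' 'cd /root/apps/RedPitayaDAQServer && { killall daq_server_scpi; ./build/server/daq_server_scpi; }'
}

# Usage (not executed here; the address is an assumption):
#   ssh root@192.168.1.100 "$(restart_cmd)"
```

Running the server in the foreground this way shows whether it fails right away or only once a client connects.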

Cheers Jonas

tknopp commented 1 year ago

The issue might be that the client needs to be checked out on master using

 add RedPitayaDAQServer:src/client/julia
 or
 dev RedPitayaDAQServer:src/client/julia

if the server is built using master.

TacHawkes commented 1 year ago

> Hi,
>
> could you please SSH to the RP and cd to apps/RedPitayaDAQServer? Then run killall daq_server_scpi and ./build/server/daq_server_scpi. Does the server fail right away? What happens if you connect to it?
>
> Cheers Jonas

Hi,

this actually works. The server runs and I can connect to it from Julia. However, if I reboot Linux, the server seems to be in an erroneous state. The log.txt in the server folder only says

I 22-03-28 16:43:47.846730 2066 daq_server_scpi.c:229: Starting RedPitayaDAQServer

If I kill it again and restart it, it works. Seems to be a boot issue!?

> The issue might be that the client needs to be checked out on master using
>
>  add RedPitayaDAQServer:src/client/julia
>  or
>  dev RedPitayaDAQServer:src/client/julia
>
> if the server is built using master.

This was my initial suspicion so I checked out the repo at tag 0.5.0 and used the Julia package at 0.5.0 as well. For the master build I have added the package in Julia using the master branch.

nHackel commented 1 year ago

So if I understood it correctly, the 0.5.0 bitfile, server and client all work together, but when you substitute your own bitfile, the first server instance hangs and the server only works after you restart it?

Are the orange LEDs turned on after your first connection attempt, and is your bitfile built from the 0.5.0 commit or from the current master? We have changed the FPGA quite a bit since 0.5.0.

jonschumacher commented 1 year ago

Did you bring the server on the RP to the latest origin/master status? I just tested the whole pipeline with origin/master and I can build the FPGA image and the server. The simple example also runs without issues. If I run the FPGA image of origin/master with the 0.5.0 server it crashes.

TacHawkes commented 1 year ago

Ok, my test is as follows:

 git status
On branch master
Your branch is up to date with 'remotes/origin/master'.

nothing to commit, working tree clean

I have used make daq_bitfiles once again to do a clean build of the bitstream.

It still does not work. Now I even have another problem. If I kill the server, restart it and try to connect using Julia, I get the following on the Linux console:

Scheduler: SCHED_RR; Priority: 20
Load Bitfile
loading bitstream /root/apps/RedPitayaDAQServer/bitfiles/daq_xc7z010clg400-1.bit
Bitsream loaded
Using calibration version 1 with: 0
Decimation = 16 
setRamWriterEnabled: wp 0  wp_old 0  size  0 
Bus error

The server then terminates and Julia reports a timeout (duh...).

TacHawkes commented 1 year ago

Ah, I think I see your point. I forgot to update the SCPI server itself... So far I have only upgraded Julia and the bitstream to the master branch... I will report back if I can fix it now...

nHackel commented 1 year ago

Ah okay. So what's happening there is that on our master branch we changed the address space shared between FPGA and CPU. In particular we moved the buffer that stores our "Sequence" concept into BRAM (and also to a different address range). The CPU then tries to write to the old address space which is not mapped anymore and gets a bus error.

That's one of the potential version mismatches and symptoms you can watch out for when you try to modify the project.

A version mismatch between server and client results in an unknown SCPI command, which I think crashes the server, but I'd have to check again. A mismatch between FPGA and CPU can result in a bus error like the one you had. A last potential error can happen when the FPGA itself is put into a bad state, such as an AXI module not having a clock or having a wrong reset. This can actually hang the whole RedPitaya Linux.
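
One way to catch such a mismatch early is to compare the revision the bitfile was built from with the revision checked out on the RedPitaya. The sketch below is an assumption, not something the project ships: `check_revisions` is a hypothetical helper, the host address comes from the scp command earlier in this thread, and a matching checkout only helps if the server binary was actually rebuilt from it.

```shell
# Hypothetical helper: compare two revision strings and report a mismatch.
# Returns 0 when they are equal and 1 otherwise.
check_revisions() {
  build_rev="$1"
  server_rev="$2"
  if [ "$build_rev" = "$server_rev" ]; then
    echo "match: $build_rev"
  else
    echo "MISMATCH: bitfile built from $build_rev, server checkout at $server_rev"
    return 1
  fi
}

# Usage (not executed here; assumes a git checkout on both machines):
#   local_rev=$(git rev-parse HEAD)
#   remote_rev=$(ssh root@192.168.1.100 'cd /root/apps/RedPitayaDAQServer && git rev-parse HEAD')
#   check_revisions "$local_rev" "$remote_rev"
```

The same comparison could be extended to the Julia client, for example by checking which branch or tag the package was added from.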

TacHawkes commented 1 year ago

Ok, I think my own modifications (for now I'm testing on an unmodified master branch) are already applied on the newest design. I was already wondering why almost all BRAM is already used :smiley: Luckily my modifications will mostly use FF/LUTs and DSP slices.

make all is still running... I will comment again as soon as I have tested the full build.

TacHawkes commented 1 year ago

Thanks a lot! That fixed it... I can now change the bitstream, reboot and it still works.

jonschumacher commented 1 year ago

Awesome! I will close here now. You can speed up at least the creation of the cores by uncommenting the MAKEFLAGS += -j$(NPROCS) line in the Makefile. This makes the core generation run in parallel. Running the bitfile generation in parallel crashes my VM, but you might try that as well. If you need further input, feel free to open another issue.
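
As an alternative to editing the Makefile, the job count can also be passed on the command line with make's standard -j option. The sketch below assumes NPROCS is meant to be the number of CPU cores. `jobs_count` is a hypothetical helper with an optional cap, which may help on a VM with limited memory. Note that -j on the command line parallelizes the whole invocation, including the bitfile generation that reportedly crashed the VM above.

```shell
# Hypothetical helper: print a job count for make.
# Optional first argument caps the result (0 or omitted means no cap).
jobs_count() {
  cap="${1:-0}"
  # nproc is part of GNU coreutils; fall back to 1 job if it is missing.
  if command -v nproc >/dev/null 2>&1; then n="$(nproc)"; else n=1; fi
  if [ "$cap" -gt 0 ] && [ "$n" -gt "$cap" ]; then n="$cap"; fi
  echo "$n"
}

# Usage (not executed here):
#   make -j"$(jobs_count 4)" daq_bitfiles
```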