trabucayre / openFPGALoader

Universal utility for programming FPGA
https://trabucayre.github.io/openFPGALoader/
Apache License 2.0

[Request] Add Xilinx Virtual Cable Support (XVC) #210

Open zhuangzard opened 2 years ago

zhuangzard commented 2 years ago

Could this project add Xilinx Virtual Cable support? JTAG and programming are very solid in this project. I found some other XVC projects, such as https://github.com/kholia/xvcpi and https://github.com/BerkeleyLab/XVC-FTDI-JTAG; both are great projects, but they lack good FPGA board support and good cable support.

Bringing XVC to openFPGALoader would make this project even better for remote debugging, using a Raspberry Pi or a desktop to program and debug.

XVC is well documented here: https://github.com/Xilinx/XilinxVirtualCable

trabucayre commented 2 years ago

This protocol is on my TODO list, yes. In fact, both sides must be implemented:

Thanks

zhuangzard commented 2 years ago

That sounds great!! I can't wait for a release with that! You are correct, the server is also very handy for remote programming and debugging. Something small like a Raspberry Pi or an ESP32-Pico could become a remote Wi-Fi debugger, which is very exciting. BTW, speed still looks like a big issue; I don't know whether that is because sending commands over a socket adds delay, or whether we could improve that in our version.

trabucayre commented 2 years ago

I have installed @kholia's implementation for ESP32 on a device to test, so I'm able to start with the client side.

The XVC protocol is really similar to the JLINK cable protocol, so I expect the implementation to go quickly.

For speed, it's true: in my mind there are 2 bottlenecks

kholia commented 2 years ago

@trabucayre Hi! I recommend looking at https://github.com/kholia/xvc-pico and https://github.com/tom01h/xvc-pico. The tom01h version is pretty fast in comparison to the esp32 version.

trabucayre commented 2 years ago

Thanks for pointing out this implementation. It's easier to develop using that (no need for a Wi-Fi or network connection). In the end I have to test with as many implementations as possible (okay, all are theoretically exactly the same).

zhuangzard commented 2 years ago

@trabucayre, I just implemented the code today by modifying the ftdiJtagMPSSE.cpp file so that it can send TMS and TDI using the data from the XVC server. The new function is: int FtdiJtagMPSSE::writeTMSTDI(uint8_t *tms, uint8_t *tdi, uint8_t *tdo, uint32_t len)

1. Added a socket.
2. The server can be started with the --xvc option.
3. The port is set with --port 3721; 3721 is the default.
4. Added a Python test script, readJTAG.py, to read the IDCODE.
5. TMS, TDI and TDO can be exchanged with the xvc command.

This code can read the FPGA IDCODE using the readJTAG.py file, but it still does not make Vivado happy, although it is very close. It is probably a timing setup issue; I will look into it.

Check the code out here in case it helps you accelerate the implementation: https://github.com/zhuangzard/openFPGALoader/tree/xvc

trabucayre commented 2 years ago

I have to check/analyze your code, thanks! The client side is now publicly available: tested with xvc-pico and a Cyclone V FPGA (I know, using an Intel/Altera device to test Xilinx's protocol is a bit funny). The real question for the server-side implementation is how to avoid a bitbanging solution as much as possible, but this implies analyzing the tms and tdi vectors to see whether a sequence is toggleClk, tms only or tdi only. But it is, maybe, the key to having a server at the same level as jtag and dfu, and to avoid slowing transactions between server and device. This idea may allow direct compatibility at the cable level (that's not to say jtagInterface is perfect; I'm not sure myself).
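
The vector analysis mentioned above could look roughly like the following. This is only a sketch under assumptions: the `ShiftKind` enum and `classifyShift` function are hypothetical names, not part of openFPGALoader, and the vectors are assumed to be packed LSB-first as in the XVC shift command.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Kind of JTAG activity carried by one pair of vectors (hypothetical names).
enum class ShiftKind { ToggleClk, TmsOnly, TdiOnly, Mixed };

// Read bit `i` of an LSB-first packed vector (bit 0 of byte 0 comes first).
static bool bitAt(const std::vector<uint8_t> &v, size_t i) {
    return (v[i / 8] >> (i % 8)) & 1;
}

// True when the first `nbits` bits of `v` all have the same value.
static bool isConstant(const std::vector<uint8_t> &v, size_t nbits) {
    for (size_t i = 1; i < nbits; ++i)
        if (bitAt(v, i) != bitAt(v, 0)) return false;
    return true;
}

ShiftKind classifyShift(const std::vector<uint8_t> &tms,
                        const std::vector<uint8_t> &tdi, size_t nbits) {
    assert(tms.size() * 8 >= nbits && tdi.size() * 8 >= nbits);
    const bool tmsConst = isConstant(tms, nbits);
    const bool tdiConst = isConstant(tdi, nbits);
    if (tmsConst && tdiConst) return ShiftKind::ToggleClk;  // only TCK matters
    if (tdiConst) return ShiftKind::TmsOnly;  // TMS changes while TDI is held
    if (tmsConst) return ShiftKind::TdiOnly;  // TDI changes while TMS is held
    return ShiftKind::Mixed;                  // both change: bit-level handling
}
```

A real data shift usually ends with a TMS change on its last bit, so a practical version would probably classify sub-ranges of the vectors rather than a whole payload at once.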

zhuangzard commented 2 years ago

Update: I just pushed my work, and the code works with Vivado now. https://github.com/zhuangzard/openFPGALoader/tree/xvc It can communicate with Vivado and show the chip(s).

My implementation uses MPSSE. I just had a quick look at the https://github.com/kholia/xvc-pico code, and I don't think it involves a client-to-JTAG MPSSE solution: they write the I/O pins directly from the socket server, without using FTDI's MPSSE protocol.

XVC sends three pieces of data with shift: the length, the TMS data and the TDI data. With the current Write_TMS or Write_TDI functions, TMS and TDI cannot be sent at the same time, which I think is the most challenging part. Thanks to the work in https://github.com/BerkeleyLab/XVC-FTDI-JTAG, I created the int FtdiJtagMPSSE::writeTMSTDI(uint8_t *tms, uint8_t *tdi, uint8_t *tdo, uint32_t len) function, which can send TMS and TDI on the same clock.

For the server, I think it could be the reverse process: turn write_TMS and write_TDI into the shift:0xXX...0xTMS...0xTDI data format, which should be applicable.

I will try to work on that and keep you posted.

trabucayre commented 2 years ago

I have to read your modifications carefully! But I'm not sure it is required to modify jtag.cpp, since xvc (server side) is different and stateless.

MPSSE is specific to FTDI devices, so there is no reason to see it in xvc-pico (and with a microcontroller it makes sense to use direct GPIO access).

The BerkeleyLab code is interesting because it is not limited to bitbanging pins. I think it may be interesting to see how to adapt it with a higher level of abstraction, to allow using any cable and not only FTDI ones.

zhuangzard commented 2 years ago

I modified jtag.cpp at the beginning because I tried to use your write_TMS and write_TDI directly to build the send logic. With a deeper understanding of the code, I realized I needed a new function that sends TMS and TDI and reads TDO. I tried to merge your write_TMS and write_TDI functions, using BerkeleyLab's creative logic, into what could be a lower-level function. But I agree with you: if we don't want to modify the other JTAG protocols, there is no point in changing jtag.cpp.

The BerkeleyLab code is truly interesting: the logic that sends TMS and TDI in parallel within the limitations of USB JTAG is very creative. There is another bitbang implementation that BerkeleyLab referred to, https://github.com/tmbinc/xvcd, which is also very nice code to read. It uses bitbanging, so it does not need the hard logic that BerkeleyLab has.

trabucayre commented 2 years ago

Yep, modifying jtag.cpp is not required since xvc is a different approach: merging both into one class would only increase complexity (but maybe a common superclass is needed to avoid duplicating some parts of the initialisation). The BerkeleyLab and tmbinc approaches are interesting: the tms and tdi vectors are analyzed instead of simply writing bits. Maybe it's not required to keep track of the current JTAG state and transitions, but at least checking whether the vectors contain a state move or a shiftDR/shiftIR makes sense. In my mind there are 3 cases:

I have to dump a full XVC sequence to test my idea with a replay.

zhuangzard commented 2 years ago

At the beginning I thought it worked the way you described, and that I could analyze the logic and send write_TMS, write_TDI and toggleClk, but it does not work that way. Here is the Xilinx XVC protocol information: https://github.com/Xilinx/XilinxVirtualCable The data is sent with the shift command: "shift:<num bits><tms vector><tdi vector>"

<num bits>: an integer in little-endian byte order. It represents the number of TCK clock toggles needed to shift the vectors out.

<tms vector>: a byte-sized vector with all the TMS shift-in bits. Bit 0 in byte 0 of this vector is shifted out first. The vector is num_bits long, rounded up to the nearest byte.

<tdi vector>: a byte-sized vector with all the TDI shift-in bits. Bit 0 in byte 0 of this vector is shifted out first. The vector is num_bits long, rounded up to the nearest byte.

<tdo vector>: returned by the server; a byte-sized vector with all the TDO shift-out bits. Bit 0 in byte 0 of this vector is shifted out first. The vector is num_bits long, rounded up to the nearest byte.
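
To make the message layout above concrete, here is a minimal sketch of building and parsing a shift message. It assumes `<num bits>` is a 32-bit little-endian integer (the description above only says "integer"); `buildShift` and `parseShift` are hypothetical helper names, not functions from openFPGALoader or the Xilinx reference code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Number of bytes needed to hold `nbits` bits, rounded up.
static size_t vectorBytes(uint32_t nbits) {
    return (static_cast<size_t>(nbits) + 7) / 8;
}

// Build "shift:" + <num bits> (assumed 32-bit little-endian) + <tms> + <tdi>.
std::vector<uint8_t> buildShift(uint32_t nbits, const std::vector<uint8_t> &tms,
                                const std::vector<uint8_t> &tdi) {
    const size_t n = vectorBytes(nbits);
    assert(tms.size() == n && tdi.size() == n);
    std::vector<uint8_t> msg = {'s', 'h', 'i', 'f', 't', ':'};
    for (int i = 0; i < 4; ++i)  // least significant byte first
        msg.push_back(static_cast<uint8_t>(nbits >> (8 * i)));
    msg.insert(msg.end(), tms.begin(), tms.end());
    msg.insert(msg.end(), tdi.begin(), tdi.end());
    return msg;
}

// Parse one complete shift message; false if it is malformed or truncated.
bool parseShift(const std::vector<uint8_t> &msg, uint32_t &nbits,
                std::vector<uint8_t> &tms, std::vector<uint8_t> &tdi) {
    const size_t hdr = 6 + 4;  // "shift:" plus the 4-byte bit count
    if (msg.size() < hdr || std::memcmp(msg.data(), "shift:", 6) != 0)
        return false;
    nbits = 0;
    for (int i = 0; i < 4; ++i)
        nbits |= static_cast<uint32_t>(msg[6 + i]) << (8 * i);
    const size_t n = vectorBytes(nbits);
    if (msg.size() != hdr + 2 * n) return false;
    tms.assign(msg.begin() + hdr, msg.begin() + hdr + n);
    tdi.assign(msg.begin() + hdr + n, msg.end());
    return true;
}
```

For a 10-bit shift, each vector occupies 2 bytes, so the whole message is 6 + 4 + 2 + 2 = 14 bytes and the reply carries a 2-byte TDO vector.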

The logic is: after receiving the socket data, send up to 6 TMS bits while TDI stays the same (that is equivalent to sending TMS and TDI at the same time, because TDI does not change and its value is attached at the end), and put the TDI value in the 7th bit. If TDI changes during those 6 TMS bits, stop the TMS write and start writing TDI into the buffer, but only while TMS stays the same as the last value sent (TDI cannot carry TMS along; if TMS changes, stop sending TDI in this loop and go back to the TMS loop at the beginning, with the new TDI value attached in the 7th bit). This way the system does not need to know what is being sent or track the JTAG state; it just sends the data as if TMS and TDI were driven together on the same clock toggle.
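
A simplified sketch of the first half of that idea (TMS writes that carry a fixed TDI level) is shown below. It is an illustration only, not the code from the xvc branch or from XVC-FTDI-JTAG: it always uses the TMS-style write and omits the branch that switches to a TDI shift while TMS is stable. `TmsChunk` and `chunkByConstantTdi` are hypothetical names; the default of 6 bits per chunk comes from the description above, and the TDI level is placed in bit 7 (mask 0x80) of the data byte, which is assumed here to be what the text calls the 7th bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One TMS-style write: `count` TMS bits in data bits 0..count-1,
// and the TDI level to hold during those clocks in bit 7.
struct TmsChunk {
    uint8_t data;
    uint8_t count;
};

// Read bit `i` of an LSB-first packed vector.
static bool bitAt(const std::vector<uint8_t> &v, size_t i) {
    return (v[i / 8] >> (i % 8)) & 1;
}

// Split nbits of TMS/TDI into chunks of at most maxBits clocks in which
// TDI does not change, so each chunk can carry TMS and TDI together.
std::vector<TmsChunk> chunkByConstantTdi(const std::vector<uint8_t> &tms,
                                         const std::vector<uint8_t> &tdi,
                                         size_t nbits, uint8_t maxBits = 6) {
    assert(maxBits >= 1 && maxBits <= 7);
    assert(tms.size() * 8 >= nbits && tdi.size() * 8 >= nbits);
    std::vector<TmsChunk> out;
    size_t i = 0;
    while (i < nbits) {
        const bool level = bitAt(tdi, i);  // TDI value held for this chunk
        TmsChunk c{static_cast<uint8_t>(level ? 0x80 : 0x00), 0};
        while (i < nbits && c.count < maxBits && bitAt(tdi, i) == level) {
            if (bitAt(tms, i)) c.data |= static_cast<uint8_t>(1u << c.count);
            ++c.count;
            ++i;
        }
        out.push_back(c);
    }
    return out;
}
```

Every clock still gets the right TMS and TDI values, but long data shifts degrade to one-bit chunks whenever TDI toggles, which is why the scheme described above also switches to TDI writes when TMS is stable.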

trabucayre commented 2 years ago

Looks good. But instead of using 6 bits for TMS (that is true for FTDI MPSSE but not for others), it is the longest sequence that should be searched for. Anyway, a first implementation is worthwhile as base material for improvement. There is no need to have the most efficient approach right away; it can be reviewed later.

Thanks

trabucayre commented 2 years ago

In fact, I was wrong. Your proposal of a generic method that passes the tms and tdi vectors directly to the low-level driver seems better. Having a generic way to analyze the stream at the xvc level is not efficient: for jlink, which has a really similar protocol, these buffers can be sent directly; with FTDI in bitbang mode (ft232r, ft231x) and the Anlogic cable, you have to send both tms and tdi more or less bit by bit, so using the buffer content directly is, again, the right way. MPSSE is more or less a special case where it is possible to distinguish between tms and tdi, so it makes sense to do this analysis at the probe level and adapt the stream to the situation.
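
One way to picture that design is a cable interface with a generic vector entry point and a bit-by-bit default. The sketch below is hypothetical: `CableSketch`, `clockBit` and `writeTmsTdi` are illustrative names and do not reflect openFPGALoader's actual jtagInterface class.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical cable interface for illustration only.
class CableSketch {
public:
    virtual ~CableSketch() {}

    // Drive one TCK cycle with the given TMS/TDI levels; returns sampled TDO.
    virtual bool clockBit(bool tms, bool tdi) = 0;

    // Generic entry point for an XVC shift: LSB-first packed vectors.
    // Default is a bit-by-bit fallback for bitbang-style cables. A driver
    // with a vector-oriented protocol could override it and forward the
    // buffers unchanged; an MPSSE driver could override it to analyze them.
    virtual void writeTmsTdi(const uint8_t *tms, const uint8_t *tdi,
                             uint8_t *tdo, uint32_t nbits) {
        const uint32_t nbytes = nbits / 8 + (nbits % 8 != 0 ? 1 : 0);
        for (uint32_t i = 0; i < nbytes; ++i) tdo[i] = 0;
        for (uint32_t i = 0; i < nbits; ++i) {
            const bool m = (tms[i / 8] >> (i % 8)) & 1;
            const bool d = (tdi[i / 8] >> (i % 8)) & 1;
            if (clockBit(m, d))
                tdo[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
        }
    }
};

// Test double: TDO simply echoes TDI, and clock cycles are counted.
class LoopbackCable : public CableSketch {
public:
    uint32_t clocks = 0;
    bool clockBit(bool, bool tdi) override {
        ++clocks;
        return tdi;
    }
};
```

With this shape, the XVC server only needs to call `writeTmsTdi` and never has to know which kind of cable is attached.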

zhuangzard commented 2 years ago

Thanks for the update. Implementing this in the lower level of the code is definitely a major change; strategic planning will surely benefit this project a lot.

For the XVC protocol, Vivado first sends a getinfo: request, and the server sends back the XVC version and buffer size information. I found that your FTDI buffer has a 512-byte limitation. I did not change your lower-level FTDI buffer code; just a quick note: when you implement this, make sure the correct buffer size is sent back to Vivado, or make the lower-level FTDI buffer size adjustable for the XVC class.
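
As an illustration of that handshake, the following sketch parses a getinfo reply of the form `xvcServer_<version>:<size>` followed by a newline, which is the form that appears in the server logs later in this thread (`xvcServer_v1.0:2048`). `XvcInfo` and `parseGetinfoReply` are hypothetical names, and the meaning and units of the size field should be checked against the Xilinx documentation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>

// Parsed getinfo reply (field names are hypothetical).
struct XvcInfo {
    std::string version;  // e.g. "v1.0"
    uint32_t size;        // number after the colon (the advertised buffer size)
};

// Parse "xvcServer_<version>:<size>\n"; returns false on any other input.
bool parseGetinfoReply(const std::string &reply, XvcInfo &info) {
    static const std::string kPrefix = "xvcServer_";
    if (reply.compare(0, kPrefix.size(), kPrefix) != 0) return false;
    if (reply[reply.size() - 1] != '\n') return false;
    const std::size_t colon = reply.find(':', kPrefix.size());
    if (colon == std::string::npos) return false;
    // Digits sit between the colon and the trailing newline.
    const std::string digits =
        reply.substr(colon + 1, reply.size() - colon - 2);
    if (digits.empty() ||
        digits.find_first_not_of("0123456789") != std::string::npos)
        return false;
    info.version = reply.substr(kPrefix.size(), colon - kPrefix.size());
    info.size = static_cast<uint32_t>(std::strtoul(digits.c_str(), nullptr, 10));
    return true;
}
```

A server whose low-level buffer is smaller than the advertised value would need to either advertise the smaller size or split incoming vectors before handing them to the driver.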

BTW, another crazy idea about speed that I have been thinking about for days: I tested XVC to flash an SPI flash. The bin file is about 2.9 MB, and it takes about 15-25 minutes using the digilent_hs2 programmer. If you just program directly into the FPGA, it takes only about 1-2 minutes.

I also tested using VirtualHere to virtualize the USB port to my computer via a Raspberry Pi 4B board; flashing the SPI flash takes about 12 minutes.

I think the XVC implementation is suited to applications like remote debugging or remote testing; if you want to flash the SPI flash, direct flashing is best.

For the future, if we could virtualize the remote system's USB port using an open source project like USB/IP, this project could work more generically for remote debugging and programming of other chips such as Altera or GoWin. That is also a very interesting topic to discuss.

trabucayre commented 2 years ago

I usually try to find the best way to avoid having to rewrite/modify/rethink a class in the future (I have already done that for jtag.cpp, to make it less FTDI-specific and more generic). I have updated jlink with a method that receives a buffer and adapts it to send data according to the protocol. XVC and getinfo are known (since 7bfce0fb2be45af587645dca09be87c6806bee6b, openFPGALoader is able to communicate with an xvc server). I prefer to keep a generic size: 2048 seems to be the usual size (it's the page size); building the USB transactions must be done by the low-level driver (this is already done to adapt JTAG packets to the device).

I have to read a bit more about USB/IP, but it's already possible to use netcat to send configuration data through the network.

trabucayre commented 2 years ago

Hi. I have pushed a first draft of the XVC server-side protocol (currently limited to FTDI devices). I'm interested in feedback/remarks/complaints. Thanks

trabucayre commented 2 years ago

Any news? Can I close this issue? Thanks

nick-petrovsky commented 1 year ago

I am trying to set up a remote XVC server and connect it to Vivado, unfortunately not entirely successfully. I am using an FT2232 (Digilent Zybo board rev. 1). The cable, board and devices in the JTAG chain are detected as expected.

I've tried the following configurations:

sudo openFPGALoader -c digilent --verbose-level 2 --xvc --port 3121
sudo openFPGALoader -b zybo_z7_10 --verbose-level 2 --xvc --port 3121

They all end with an error on a connection attempt from Vivado:

Jtag frequency : requested 6.00MHz   -> real 6.00MHz
INFO: To connect to this xvcServer instance, use: TCP:small-rpi:3121

Press to quit
connection accepted - fd 9
setting TCP_NODELAY to 1

invalid cmd 'E'
connection closed - fd 9

Am I missing something?

nick-petrovsky commented 1 year ago

I realized my mistake: I tried to connect using hw_server, but Virtual Cable is required. Then I tried to get the xvc client and server working on localhost, but still no luck.

This works fine:

$ openFPGALoader -c digilent --board zybo_z7_10 --verbose-level 2 --bitstream ~/system_top.bit --freq 15000000
Jtag frequency : requested 15.00MHz  -> real 15.00MHz
Raw IDCODE:
- 0 -> 0x13722093
- 1 -> 0x4ba00477
- 2 -> 0xffffffff
- 3 -> 0xffffffff
- 4 -> 0xffffffff
found 2 devices
index 0:
        idcode   0x4ba00477
        type     ARM cortex A9
        irlength 4
index 1:
        idcode 0x3722093
        manufacturer xilinx
        family zynq
        model  xc7z010
        irlength 6
File type : bit
Open file DONE
Parse file DONE
bitstream header infos
date: 2022/12/09
design_name: system_top
hour: 20:53:18
part_name: 7z010clg400
toolVersion: 0XFFFFFFFF;Version=2020.1
userID: TRUE
load program
Flash SRAM: [==================================================] 100.00%
Done

The same with openFPGALoader as xvc server:

 $ openFPGALoader --cable  digilent  --board zybo_z7_10 --verbose-level 2 --xvc --port 2542 --freq 15000000               
Jtag frequency : requested 15.00MHz  -> real 15.00MHz
INFO: To connect to this xvcServer instance, use: TCP:small-rpi:2542

Press to quit
connection accepted - fd 9
setting TCP_NODELAY to 1

1672046873 : Received command: 'getinfo'
         Replied with xvcServer_v1.0:2048

connection closed - fd 9

Client side:

$ openFPGALoader --verbose-level 2 -c xvc-client --port 2542 ~/system_top.bit  --freq 15000000
received 20 Bytes (160)
        78 76 63 53 65 72 76 65 72 5f 76 31 2e 30 3a 32 30 34 38 0a
detected xvcServer version v1.0 packet size 1024

For some reason it stops in the getinfo state. The situation is the same with Vivado.

nick-petrovsky commented 1 year ago

Okay, my apologies for spamming you so much; the issue is partially demystified. There are some bugs in the socket handling in xvc_server.cpp. Look at this piece of code:

} else {
    int ret = handle_data(fd);
    printInfo("connection closed - fd " + std::to_string(fd));
    close(fd);
    FD_CLR(fd, &conn);
    if (ret == 1)
        throw std::runtime_error("communication failure");
}

In the original XAPP, the same behaviour is achieved with the following code:

else if (handle_data(fd,ptr)) {
    if (verbose)
        printf("connection closed - fd %d\n", fd);
    close(fd);
    FD_CLR(fd, &conn);
}

I.e. the connection is closed only if something goes wrong, not every time. With the following patch I am able to detect devices on the JTAG chain and flash the FPGA.

diff --git a/src/xvc_server.cpp b/src/xvc_server.cpp
index 93d5d96..52e5900 100644
--- a/src/xvc_server.cpp
+++ b/src/xvc_server.cpp
@@ -186,8 +186,10 @@ void XVC_server::thread_listen()
                 } else {
                                        int ret = handle_data(fd);
                                        printInfo("connection closed - fd " + std::to_string(fd));
+          if (ret) {
                     close(fd);
                     FD_CLR(fd, &conn);
+          }
                                        if (ret == 1)
                                                throw std::runtime_error("communication failure");
                                }

Commands and their output for verification:

$ ./openFPGALoader --verbose-level 2 --board zybo_z7_10 --port 2542  --xvc                                                                                                 

Jtag frequency : requested 6.00MHz   -> real 6.00MHz  
INFO: To connect to this xvcServer instance, use: TCP:helios:2542

Press to quit
connection accepted - fd 11
setting TCP_NODELAY to 1

1672068508 : Received command: 'getinfo'
     Replied with xvcServer_v1.0:2048

connection closed - fd 11
Jtag frequency : requested 15.15MHz  -> real 15.00MHz 
1672068508 : Received command: 'settck'
     Replied with 'B'

connection closed - fd 11
1672068508 : Received command: 'shift'
    Number of Bits  : 10
    Number of Bytes : 2

1672068508 : Received command: 'shift'
    Number of Bits  : 32
    Number of Bytes : 4

1672068508 : Received command: 'shift'
    Number of Bits  : 32
    Number of Bytes : 4

1672068508 : Received command: 'shift'
    Number of Bits  : 32
    Number of Bytes : 4

1672068508 : Received command: 'shift'
    Number of Bits  : 32
    Number of Bytes : 4

1672068508 : Received command: 'shift'
    Number of Bits  : 32
    Number of Bytes : 4

1672068508 : Received command: 'shift'
    Number of Bits  : 6
    Number of Bytes : 1

connection closed - fd 11
terminate called after throwing an instance of 'std::runtime_error'
  what():  communication failure
[1]    14218 abort (core dumped)  ./openFPGALoader --verbose-level 2 --board zybo_z7_10 --port 2542 --xvc

$ ./openFPGALoader --cable xvc-client --board zybo_z7_10 --fpga-part xc7z010clg400 --freq 15000000 --port 2542 --detect                               

Board default cable overridden with xvc-client
Board default fpga part overridden with xc7z010clg400
detected xvcServer version v1.0 packet size 1024
freq 15000000 66.666667 66 0
42 0 0 0
index 0:
    idcode   0x4ba00477
    type     ARM cortex A9
    irlength 4
index 1:
    idcode 0x3722093
    manufacturer xilinx
    family zynq
    model  xc7z010
    irlength 6

At this point deeper debugging is needed. You can remove the exception, but its source is not obvious to me. Vivado still does not like this virtual cable. I will try to investigate why Vivado does not work with this implementation; I would like to have full remote debugging.

trabucayre commented 1 year ago

Hi, and sorry for the delay. It's true: the connection must be closed only if an error occurs or when the client closes the connection, not every time (it's weird, I tested the code before pushing it; maybe a typo crept in before the commit).

I have to retest this code entirely.

trabucayre commented 1 year ago

I have updated xvc_server: now

I have only tested using openFPGALoader as both xvc server and client. Could you try with Vivado?

Thanks!

nick-petrovsky commented 1 year ago

Thank you for patching the xvc-server; the code now looks very close to the reference Xilinx implementation! I can confirm that in my setup openFPGALoader works perfectly locally, but still no luck with Vivado.

$ ./openFPGALoader --verbose-level 0 --board zybo_z7_10 --port 2542  --xvc --freq 15000000
Jtag frequency : requested 15.00MHz  -> real 15.00MHz
INFO: To connect to this xvcServer instance, use: TCP:small-rpi:2542

Press to quit
connection accepted - fd 9
setting TCP_NODELAY to 1

Jtag frequency : requested 10.00MHz  -> real 10.00MHz

Vivado requests a different JTAG frequency and reports an error that the hardware isn't powered up.

connect_hw_server: Time (s): cpu = 00:00:02 ; elapsed = 00:00:08 . Memory (MB): peak = 1081.625 ; gain = 0.000
open_hw_target -xvc_url 192.168.104.21:2542
INFO: [Labtools 27-2285] Connecting to hw_server url TCP:localhost:3121
INFO: [Labtools 27-3415] Connecting to cs_server url TCP:localhost:3042
INFO: [Labtools 27-3414] Connected to existing cs_server.
INFO: [Labtoolstcl 44-466] Opening hw_target localhost:3121/xilinx_tcf/Xilinx/192.168.104.21:2542
ERROR: [Labtools 27-2269] No devices detected on target localhost:3121/xilinx_tcf/Xilinx/192.168.104.21:2542.
Check cable connectivity and that the target board is powered up then
use the disconnect_hw_server and connect_hw_server to re-register this hardware target.
ERROR: [Common 17-39] 'open_hw_target' failed due to earlier errors.
ERROR: [Labtoolstcl 44-513] HW Target shutdown. Closing target: localhost:3121/xilinx_tcf/Xilinx/192.168.104.21:2542
disconnect_hw_server localhost:3121

I have no good ideas for how to debug this issue. How can I help to highlight the problem? From my point of view, two options exist:

1) Record the trace of commands from Vivado
2) Compare it with the reference implementation (time consuming, and pin soldering is required)

Previously @zhuangzard mentioned that he succeeded with Vivado. I have also tried his implementation, with a few compiler-dependent fixes, but the same errors occur.

trabucayre commented 1 year ago

Could you provide the command line used? I have tried with Vivado 2019: indeed it's not working. I see that openFPGALoader receives the getinfo message and answers it, but nothing more... This piece of code is similar to other implementations: I have no idea why it's not working.

nick-petrovsky commented 1 year ago

I use Vivado in batch mode (vivado -mode tcl). After connect_hw_server, a command like open_hw_target -xvc_url 192.168.104.21:2542 is required, and the same errors occur as in the message above. Almost the same behaviour happens when you add the target device in the GUI.

In my case, Vivado performs transactions all the time, even when it reports an error that the device is not powered up; you can observe this in a more verbose mode.

I tried several other implementations and found one that works without any issues: xvcd. The Berkeley implementation, XVC-FTDI-JTAG, doesn't work out of the box. Without strong evidence, I have come to the preliminary conclusion that some reset logic is missing. XVC-FTDI-JTAG adds an option for changing pin direction and state with 100 ms resolution. Unfortunately, the board documentation lacks proper JTAG schematics, so the values cannot be changed in any informed way.

trabucayre commented 1 year ago

Hi, and sorry for the delay. After reading the xvcd code, I discovered this part. After a small adaptation of my code, Vivado as client is able to communicate with openFPGALoader as server. I have not yet tried to program the FPGA, but at least both are able to communicate. Once the code is cleaned up, I will push the fix.

Thanks!

trabucayre commented 1 year ago

@nick-petrovsky I have pushed a commit with the fix to use Vivado. Thanks again

nick-petrovsky commented 1 year ago

> @nick-petrovsky I have pushed a commit with the fix to use Vivado. Thanks again

I have checked the latest commit with a remote openFPGALoader as XVC server: I flashed the PL of the Zybo_1, and everything related to the FPGA looks fine (even using an RPi1, SSH-tunnelled to a Linux host with Vivado). I haven't tried ILA or other JTAG-related cores, but I hope they will work (I will report back later).

But one issue still exists: I'm not able to debug the ARM core. I will double-check my setup and compare with a direct connection. I do not see any limitation that would prevent communication with the CPU. Vitis gives the following error:

Error while launching program: no targets found with "name =~"APU*"". available targets: 1 DAP (Cannot open JTAG port: Invalid DAP ACK value: 3) 2* xc7z010

trabucayre commented 1 year ago

It's great if you're able to reproduce for the PL part. The PS part seems weird; openFPGALoader does nothing specific: it just converts client requests to FTDI transactions, so I don't see why it's not possible to debug the PS core. Is your board configured in JTAG mode instead of QSPI or SDRAM? I have to try with my Arty Z7 board.

nick-petrovsky commented 1 year ago

> It's great if you're able to reproduce for the PL part. The PS part seems weird; openFPGALoader does nothing specific: it just converts client requests to FTDI transactions, so I don't see why it's not possible to debug the PS core. Is your board configured in JTAG mode instead of QSPI or SDRAM? I have to try with my Arty Z7 board.

Of course I have configured the board correctly: the pin configuration is selected for JTAG bitstream load, i.e. after power-on it is unconfigured. Interestingly, with xvcd, software debugging works correctly.

trabucayre commented 1 year ago

Could you share your test procedure? Now the code seems quite equivalent to xvcd, so maybe something is wrong in the low-level driver (i.e. FTDI). Thanks

keegandent commented 6 months ago

Is this still being worked on? I have had difficulty running other XVC servers on my Mac and it would be fantastic if openfpgaloader implemented some sort of network protocol so I could automate a flow with vivado-on-silicon-mac. I would happily volunteer to add the network loader as a "pgm" backend for edalize.

nick-petrovsky commented 6 months ago

> Is this still being worked on? I have had difficulty running other XVC servers on my Mac and it would be fantastic if openfpgaloader implemented some sort of network protocol so I could automate a flow with vivado-on-silicon-mac. I would happily volunteer to add the network loader as a "pgm" backend for edalize.

Last time I tried, it worked without any issues on an M1 with the PL part of a zynq7010. The problem exists with ARM debugging.

keegandent commented 6 months ago

> Last time I tried, it worked without any issues on an M1 with the PL part of a zynq7010. The problem exists with ARM debugging.

Would it be possible to release the feature as-is but open a child issue for CPU debugging? I don’t mind helping with that as I have access to a Zynq Ultrascale. Just seems to me that there’s plenty benefit to being able to flash and debug fabric over this interface that warrants it being in the default build.

trabucayre commented 6 months ago

I have recently bought a (not too old) MacBook (but based on an x86 CPU). I have to verify whether the protocol works with this OS. Moving to an ARM processor shouldn't be a problem once that is validated.

keegandent commented 6 months ago

> I have recently bought a (not too old) MacBook (but based on an x86 CPU). I have to verify whether the protocol works with this OS. Moving to an ARM processor shouldn't be a problem once that is validated.

The vivado-on-silicon-mac’s included xvcd seems to run fine on macOS, and my Vivado instance can connect to it. Unfortunately I can’t see any hardware targets in Vivado or change the JTAG frequency to see if that fixes anything. I’m wondering if you’ll see a similar issue or if it’s something specific to my setup. I’ll try building this project with XVC enabled when I get home and report my findings.

Hecatron commented 1 week ago

I've recently been hunting all around the web for a way to remotely program an Arty A7-100T board with an RPi acting as the remote host, and I have to say that so far this seems to work quite nicely. I've not tested much other than programming the board with a demo (no advanced debugging or anything yet), but so far it seems to work using the xvc interface on Vivado 2024.1 from Windows.

With an rpi4 running Ubuntu aarch64 connected to the board

openFPGALoader -b arty_a7_100t --xvc

Thanks for all the hard work with this.