OCT-FPGA / udp-network-demo

UDP encrypt and decrypt example with pre-built network layer and cmac kernels

[Err] Compiling UDP example with Vitis 2023.1 toolchain #3

Closed hecmay closed 1 year ago

hecmay commented 1 year ago

Hi,

This UDP example does not compile with Vitis 2023.1 and the xilinx_u280_gen3x16_xdma_1_202211_1 platform.

  1. Vitis HLS 2023.1 hangs when pipelining a loop in txkrnl. The II violation originates here:
WARNING: [HLS 200-880] The II Violation in module 'KeyExpansion_Pipeline_VITIS_LOOP_272_2' (loop 'VITIS_LOOP_272_2'): Unable to enforce a carried dependence constraint (II = 1, distance = 1, offset = 1) between 'load' operation ('temp[3]', /home/sx233/udp-network-demo/user_krnl/src/./aes/AESfunctions.cpp:245->/home/sx233/udp-network-demo/user_krnl/src/./aes/AESfunctions.cpp:279) on array 's_box' and 'mux' operation ('temp[0]', /home/sx233/udp-network-demo/user_krnl/src/./aes/AESfunctions.cpp:275).
Resolution: For help on HLS 200-880 see www.xilinx.com/cgi-bin/docs/rdoc?v=2023.1;t=hls+guidance;d=200-880.html
  2. To bypass the II violation, I switched to a reduced UDP example without any AES encryption or decryption. Here is my repo for the reduced UDP example: https://github.com/hecmay/fpga-network/tree/465a80572e797a26da02a6f4f43a828f7fe2c441

However, this still won't compile. Here is the error message from vpl:

****** vpl v2023.1 (64-bit)
  **** SW Build 3860322 on 2023-05-04-06:32:48
    ** Copyright 1986-2022 Xilinx, Inc. All Rights Reserved.
    ** Copyright 2022-2023 Advanced Micro Devices, Inc. All Rights Reserved.

INFO: [VPL 60-839] Read in kernel information from file '/home/sx233/fpga-network/build_hw_if3/link/int/kernel_info.dat'.
INFO: [VPL 74-78] Compiler Version string: 2023.1
INFO: [VPL 60-423]   Target device: xilinx_u280_gen3x16_xdma_1_202211_1
INFO: [VPL 60-1032] Extracting hardware platform to /home/sx233/fpga-network/build_hw_if3/link/vivado/vpl/.local/hw_platform
[15:08:13] Run vpl: Step create_project: Started
Creating Vivado project.
[15:08:20] Run vpl: Step create_project: Completed
[15:08:20] Run vpl: Step create_bd: Started
[15:08:53] Run vpl: Step create_bd: Failed
[15:08:54] Run vpl: FINISHED. Run Status: create_bd ERROR
ERROR: [VPL 60-773] In '/home/sx233/fpga-network/build_hw_if3/link/vivado/vpl/runme.log', caught Tcl error:  can't read "bd_gt_gtyquad_0": no such variable
ERROR: [VPL 60-704] Integration error, Failed to update block diagram in project required for hardware synthesis.  The project is 'prj'. The user supplied update script is '/home/sx233/fpga-network/post_sys_link.tcl'.  The update script was set using param 'compiler.userPostSysLinkOverlayTcl' An error stack with function names and arguments may be available in the 'vivado.log'.
ERROR: [VPL 60-1328] Vpl run 'vpl' failed
WARNING: [VPL 60-1142] Unable to read data from '/home/sx233/fpga-network/build_hw_if3/link/vivado/vpl/output/generated_reports.log', generated reports will not be copied.
ERROR: [VPL 60-806] Failed to finish platform linker
INFO: [v++ 60-1442] [15:08:54] Run run_link: Step vpl: Failed
Time (s): cpu = 00:00:08 ; elapsed = 00:00:56 . Memory (MB): peak = 467.535 ; gain = 0.000 ; free physical = 5663 ; free virtual = 67389
ERROR: [v++ 60-661] v++ link run 'run_link' failed
ERROR: [v++ 60-626] Kernel link failed to complete
ERROR: [v++ 60-703] Failed to finish linking
INFO: [v++ 60-1653] Closing dispatch client.
make: *** [Makefile:106: build_hw_if3/demo_if3.xclbin] Error 1

So the post-link script post_sys_link.tcl seems to be the problem.

I'm not fully sure of the reason, but it looks like vpl failed to locate the correct port in the target block design, so it cannot connect the QSFP GT pins to the target block.

Do you have any ideas how to fix this? @surangamh

Thanks!

surangamh commented 1 year ago

@hecmay This example has not yet been tested with 2023.1. Here are my suggestions to fix the issue:

  1. Download the xup vitis network example and build its cmac and network kernels with Vitis 2023.1.
  2. Replace the existing cmac and network layer kernels with the ones generated in step 1.
  3. Also replace the existing synthesis_results_HBM directory with the synthesis_results_HBM directory from the xup example.
  4. Use the post_sys_link.tcl from the Xilinx repository instead of the one in our repository.

You may also have to specify the xilinx_u280_gen3x16_xdma_1_202211_1 platform in your makefile. Try to build after making these changes.

ngdxzy commented 1 year ago

@hecmay I encountered this before. I solved it the way Suranga suggested and it worked well.

hecmay commented 1 year ago

@surangamh @ngdxzy thanks for the suggestions! Sure, I will give it a try.

surangamh commented 1 year ago

@ngdxzy Thanks for confirming. I assume you've also made the necessary changes in the host code to reflect the changes in register offsets, etc. If so feel free to share your code with us. Thanks!

ngdxzy commented 1 year ago

Oh! Yes, I did change some offset addresses. I actually separated the network configuration and user logic into two executables. Here are the udp_setup.cpp file and the header file that contains the corrected offset addresses. I didn't write many comments, so please let me know if you have any questions.

oct_fpga.hpp:

#ifndef __OCT_FPGA_HPP__
#define __OCT_FPGA_HPP__

#include <stdint.h>   // uint16_t, uint32_t, uint64_t
#include <stdio.h>    // printf
#include <string.h>
#include <xrt/xrt_device.h>
#include <xrt/xrt_bo.h>
#include <xrt/xrt_kernel.h>
#include <experimental/xrt_ip.h>

typedef struct {
    uint32_t their_ip;
    uint16_t their_port;
    uint16_t my_port;
    bool valid;
} socket_type;

// Number of bytes per UDP packet
const unsigned int BYTES_PER_PACKET     = 1408;
// ARP control registers in UDP kernel
const unsigned int ARP_DISCOVERY        = 0x1010;
const unsigned int ARP_IP_ADDR_OFFSET   = 0x1400;
const unsigned int ARP_MAC_ADDR_OFFSET  = 0x1800;
const unsigned int ARP_VALID_OFFSET     = 0x1100;
// Local network information registers
const unsigned int IP_ADDR_OFFSET       = 0x0018;
const unsigned int GATEWAY_OFFSET       = 0x001C;
const unsigned int MAC_ADDR_OFFSET      = 0x0010;
const unsigned int NUM_SOCKETS_HW       = 0x0A10;
const unsigned int UDP_TI_OFFSET        = 0x0810;
const unsigned int UDP_TP_OFFSET        = 0x0890;
const unsigned int UDP_MP_OFFSET        = 0x0910;
const unsigned int UDP_VA_OFFSET        = 0x0990;
// constexpr std::size_t mac_address_offset = 0x0010;
// constexpr std::size_t ip_address_offset = 0x0018;
// constexpr std::size_t gateway_offset = 0x001C;
// constexpr std::size_t arp_discovery_offset = 0x1010;
// constexpr std::size_t arp_mac_addr_offset = 0x1800;
// constexpr std::size_t arp_ip_addr_offset = 0x1400;
// constexpr std::size_t arp_valid_offset = 0x1100;
// constexpr std::size_t udp_theirIP_offset = 0x0810;
// constexpr std::size_t udp_theirPort_offset = 0x0890;
// constexpr std::size_t udp_myPort_offset = 0x0910;
// constexpr std::size_t udp_valid_offset = 0x0990;
// constexpr std::size_t udp_number_sockets = 0x0A10;
// constexpr std::size_t udp_in_packets = 0x04D0;
// constexpr std::size_t udp_out_packets = 0x0500;
// constexpr std::size_t udp_app_in_packets = 0x0518;
// constexpr std::size_t udp_app_out_packets = 0x04E8;
// 
// constexpr std::size_t udp_in_bytes = 0x04C8;
// constexpr std::size_t udp_out_bytes = 0x04F8;
// constexpr std::size_t udp_app_in_bytes = 0x0510;
// constexpr std::size_t udp_app_out_bytes = 0x04E0;
// 
// constexpr std::size_t ethhi_out_bytes = 0x0498;
// constexpr std::size_t eth_out_bytes = 0x04B0;

class cmac{
private:
    char ip_name[64];
    bool arp_valid = {0};
    xrt::ip inst;
    xrt::device* belonging_device;
    xrt::uuid* belonging_uuid;

public:
    cmac(xrt::device& belonging_device, xrt::uuid& belonging_uuid, const char* ip_name){
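        // bounded copy so an over-long name cannot overflow ip_name
        strncpy(this->ip_name, ip_name, sizeof(this->ip_name) - 1);
        this->ip_name[sizeof(this->ip_name) - 1] = '\0';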
        this->inst = xrt::ip(belonging_device, belonging_uuid, this->ip_name);
    }

    // get tx status
    unsigned int get_tx_status(){
        this->inst.read_register(0x0200); // redundant read for some reason..
        return this->inst.read_register(0x0200);
    }
    // get rx status
    unsigned int get_rx_status(){
        this->inst.read_register(0x0204); // redundant read for some reason..
        return this->inst.read_register(0x0204);
    }
};

class network_layer{
private:
    char ip_name[64];
    xrt::ip inst;
    xrt::device* belonging_device;
    xrt::uuid* belonging_uuid;
    bool is_self_ip_address_set;
    bool arp_valid[256];
public:
    network_layer(xrt::device& belonging_device, xrt::uuid& belonging_uuid, const char* ip_name){
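        // bounded copy so an over-long name cannot overflow ip_name
        strncpy(this->ip_name, ip_name, sizeof(this->ip_name) - 1);
        this->ip_name[sizeof(this->ip_name) - 1] = '\0';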
        this->inst = xrt::ip(belonging_device, belonging_uuid, this->ip_name);
        is_self_ip_address_set = false;
    }

    // set self ip address
    int set_self_ip(uint32_t ip, uint64_t mac, uint32_t gateway){
        inst.write_register(MAC_ADDR_OFFSET, (uint32_t)(mac & 0xFFFFFFFF));
        inst.write_register(MAC_ADDR_OFFSET + 4, (uint32_t)((mac >> 32) & 0xFFFFFFFF));
        inst.write_register(IP_ADDR_OFFSET, ip);
        inst.write_register(GATEWAY_OFFSET, gateway);
        is_self_ip_address_set = true;
        return 0;
    }    

    // get number of hardware sockets
    int get_hardware_sockets_number(){
        return inst.read_register(NUM_SOCKETS_HW);
    }

    // set socket
    int set_socket(int id, socket_type socket_info ){ 
        if (is_self_ip_address_set == false){
            return -1;
        }
        uint32_t their_ip_offset = UDP_TI_OFFSET + id * 8;
        uint32_t their_port_offset = UDP_TP_OFFSET + id * 8;
        uint32_t my_port_offset = UDP_MP_OFFSET + id * 8;
        uint32_t valid_offset = UDP_VA_OFFSET + id * 8;

        inst.write_register(their_ip_offset, socket_info.their_ip);
        inst.write_register(their_port_offset, socket_info.their_port);
        inst.write_register(my_port_offset, socket_info.my_port);
        if (socket_info.valid){
            inst.write_register(valid_offset, 0xFFFFFFFF);
        }
        else{
            inst.write_register(valid_offset, 0);
        }

        return 0;
    }

    // enable one socket, or disable every socket when id == -1
    int enable_socket(int id){
        if (is_self_ip_address_set == false){
            return -1;
        }
        if (id == -1) {
            int num_of_sockets = get_hardware_sockets_number();
            for (int i = 0; i < num_of_sockets; i++){
                // index by the loop counter i; id is -1 on this path
                uint32_t valid_offset = UDP_VA_OFFSET + i * 8;

                inst.write_register(valid_offset, false);
            }
        }
        else{
            uint32_t valid_offset = UDP_VA_OFFSET + id * 8;

            inst.write_register(valid_offset, true);
        }
        return 0;
    }

    // disable socket
    int disable_socket(int id){
        if (is_self_ip_address_set == false){
            return -1;
        }

        uint32_t valid_offset = UDP_VA_OFFSET + id * 8;

        inst.write_register(valid_offset, false);

        return 0;
    }

    // ARP 
    void arp_discover(){
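        // clear: 64 registers, each packing four one-byte valid flags (256 entries)
        for (int i = 0; i < 64; i++){
            inst.write_register(ARP_VALID_OFFSET + (i << 2), 0);
        }
        // reset the whole software copy, not just the first 64 entries
        for (int i = 0; i < 256; i++){
            arp_valid[i] = false;
        }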

        // trigger arp
        inst.write_register(ARP_DISCOVERY, 0);
        inst.write_register(ARP_DISCOVERY, 1);
        inst.write_register(ARP_DISCOVERY, 0);

        for (int i = 0; i < 64; i++){
            uint32_t val = inst.read_register(ARP_VALID_OFFSET + (i << 2));
            for (int j = 0; j < 4; j++){
                bool valid_entry = val & 1;
                if (valid_entry){
                    printf("ARP valid entry found at %d\n", i * 4 + j);
                    arp_valid[i * 4 + j] = true;
                }
                val = val >> 8;
            }
        }
    }
};

#endif
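
The register arithmetic in the header above can be checked on its own, without an FPGA. The sketch below restates it as plain functions. The offsets and the 8-byte per-socket stride are copied from oct_fpga.hpp; the helper names `socket_reg` and `arp_entries_from_word` are hypothetical and are not part of the original code.

```cpp
#include <cstdint>
#include <vector>

// Offsets copied from oct_fpga.hpp.
constexpr uint32_t UDP_TI_OFFSET = 0x0810; // "their IP" registers
constexpr uint32_t UDP_VA_OFFSET = 0x0990; // per-socket valid flags

// Hypothetical helper: address of a per-socket register.
// The header advances 8 bytes per socket id.
constexpr uint32_t socket_reg(uint32_t base, int id) {
    return base + static_cast<uint32_t>(id) * 8;
}

// Hypothetical helper: decode one 32-bit ARP valid word.
// As in arp_discover(), each word packs four table entries, one per byte,
// and bit 0 of each byte is the valid flag.
// Returns the table indices of the valid entries in this word.
std::vector<int> arp_entries_from_word(int word_index, uint32_t val) {
    std::vector<int> found;
    for (int j = 0; j < 4; j++) {
        if (val & 1u) {
            found.push_back(word_index * 4 + j);
        }
        val >>= 8; // move to the next one-byte entry
    }
    return found;
}
```

For example, `socket_reg(UDP_VA_OFFSET, 3)` is the valid register of socket 3, and a word read from register index 2 covers ARP table entries 8 through 11.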

udp_setup.cpp

#include <iostream>
#include <stdio.h>
#include <stdlib.h>                 // atoi, EXIT_FAILURE
#include <xrt/xrt_device.h>         // bitstream
#include <xrt/xrt_bo.h>             // buffers
#include <xrt/xrt_kernel.h>         // kernels, runs
#include <experimental/xrt_ip.h>    // IP direct control
#include "oct_fpga.hpp"

// Default addresses (not used below; the values written come from the config file)
// My IP address: 192.168.1.2
// Their IP address: 192.168.1.1
// GATEWAY: 192.168.1.255
#define MY_IP_ADDR 0xC0A80102
#define THEIR_IP_ADDR 0xC0A80101
#define IP_GATEWAY 0xC0A801FF

// #define DEBUG

// udp_setup <bit_container.xclbin> [config_file] [cmac_id]

int main(int argc, char* argv[]){

    char* xclbinFilename;

    unsigned int my_ip;

    int cmac_id = 0;

    if (argc < 2){
        printf("Usage: %s <XCLBIN File> [config file] [cmac id]\n", argv[0]);
        return EXIT_FAILURE;
    }
    else{
        xclbinFilename = argv[1];
        printf("Using FPGA binary file specified through the command line: %s \n", xclbinFilename);
    }

    FILE* conf_file;
    if (argc > 2){
        conf_file = fopen(argv[2], "r");
    }
    else{
        conf_file = fopen("net_config.ini", "r");
    }

    if (argc > 3){
        cmac_id = atoi(argv[3]);
    }
    printf("Setting up CMAC %d\n", cmac_id);

    if (conf_file == NULL){
        printf("Cannot read conf file!\n");
        return -1;
    }

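    unsigned char* mp = (unsigned char*) (&my_ip);
    // first line of the config file: local IPv4 address (byte layout assumes a little-endian host)
    if (fscanf(conf_file, "%hhu.%hhu.%hhu.%hhu", mp+3, mp+2, mp+1, mp) != 4){
        printf("Cannot parse the local IP address from the conf file!\n");
        fclose(conf_file);
        return -1;
    }
    printf("Get user specified IP address: %x\n", my_ip);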

    socket_type sockets[16] = {};

    for (int i = 0; i < 16; i++){
        unsigned int their_ip;
        unsigned short their_port;
        unsigned short my_port;
        unsigned char* p = (unsigned char*) (&their_ip);
        // accept the line only if all six fields were parsed
        if (6 == fscanf(conf_file, "%hu:%hhu.%hhu.%hhu.%hhu:%hu",&my_port, p+3, p+2, p+1, p, &their_port)){
            printf("Get socket %02d connection %hhu.%hhu.%hhu.%hhu:%hu <->  %hhu.%hhu.%hhu.%hhu:%hu\n", i, *(mp+3), *(mp+2), *(mp+1), *(mp+0), my_port, *(p+3), *(p+2), *(p+1), *(p+0), their_port);
            sockets[i].their_port = their_port;
            sockets[i].their_ip = their_ip;
            sockets[i].my_port = my_port;
            sockets[i].valid = true;
        }
        else{
            break;
        }
    }
    fclose(conf_file);

    long mac_address = (0xf0f1f2f3f4f5 & 0xFFFFFFFFF00) + (my_ip & 0xFF);
    unsigned int gateway = my_ip | 0xFF;
    printf("Using gateway: %x\n", gateway);
    printf("Using MAC: %lx\n", mac_address);

#ifndef DEBUG
    // Load xclbin 
    printf("Loading %s \n", xclbinFilename);

    unsigned int device_id = 0; // by default
    auto device = xrt::device(device_id);

    std::cout << "Device name:  " <<  device.get_info<xrt::info::device::name>() << std::endl;
    std::cout << "Device bdf:   " <<  device.get_info<xrt::info::device::bdf>() << std::endl;

    xrt::uuid overlay_uuid = device.load_xclbin(xclbinFilename);

    char cmac_ip_name[64];
    char network_ip_name[64];

    if (cmac_id == 0){
        sprintf(cmac_ip_name, "cmac_0");
        sprintf(network_ip_name, "networklayer:{networklayer_0}");
    }
    else{
        sprintf(cmac_ip_name, "cmac_1");
        sprintf(network_ip_name, "networklayer:{networklayer_1}");
    }

    cmac cmac_inst = cmac(device, overlay_uuid, cmac_ip_name);

    network_layer network_inst = network_layer(device, overlay_uuid, network_ip_name);

    network_inst.set_self_ip(my_ip, mac_address, gateway);

    network_inst.enable_socket(-1); // disable all sockets

    int num_of_sockets = network_inst.get_hardware_sockets_number();

    printf("There are %d hardware sockets.\n", num_of_sockets);

    // define user sockets 

    for (int i = 0; i < 16; i++){
        network_inst.set_socket(i, sockets[i]);
    }

    network_inst.arp_discover();

    unsigned int tx_status = 0;
    unsigned int rx_status = 0;
    tx_status = cmac_inst.get_tx_status();
    rx_status = cmac_inst.get_rx_status();

    printf("TX status %d\n", tx_status);
    printf("RX status %d\n", rx_status);

    if (rx_status & 0x01){
        printf("Link is active!\n");
    }
    else{
        printf("Link is not active!\n");
    }
#endif

    return 0;

}
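
udp_setup.cpp derives the MAC address and the gateway from the configured IP instead of reading them from the file. A minimal sketch of that arithmetic follows, with the constants reproduced from the listing; the function names `derive_mac` and `derive_gateway` are hypothetical. Note that the mask as written has nine F digits (44 bits), so it clears the top nibble of the base address as well as the low byte. The thread does not say whether that is intended. The gateway rule `ip | 0xFF` assumes a /24 network.

```cpp
#include <cstdint>

// Base MAC and mask reproduced from udp_setup.cpp.
constexpr uint64_t BASE_MAC = 0xf0f1f2f3f4f5ULL;
constexpr uint64_t MAC_MASK = 0xFFFFFFFFF00ULL;

// Hypothetical helper: the low byte of the MAC follows the low byte of the IP.
uint64_t derive_mac(uint32_t ip) {
    return (BASE_MAC & MAC_MASK) + (ip & 0xFFu);
}

// Hypothetical helper: force the last octet of the IP to 255.
uint32_t derive_gateway(uint32_t ip) {
    return ip | 0xFFu;
}
```

With the example address 192.168.144.10 (0xC0A8900A), the gateway comes out as 192.168.144.255 and the MAC ends in 0x0A.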

example configuration file:

192.168.144.10
50000:192.168.144.30:60000
50001:192.168.144.30:60001
50002:192.168.144.20:60002
50003:192.168.144.20:60003
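
The file format above is one local IPv4 address followed by up to 16 lines of `my_port:their_ip:their_port`. Below is a minimal sketch of parsing one socket line. Unlike udp_setup.cpp, it packs the address with shifts rather than writing through a byte pointer, so the result does not depend on host byte order. The names `SocketLine` and `parse_socket_line` are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>

struct SocketLine {
    uint16_t my_port;
    uint32_t their_ip;   // first octet in the most significant byte
    uint16_t their_port;
};

// Parses "my_port:a.b.c.d:their_port". Returns false on malformed input.
bool parse_socket_line(const char* line, SocketLine& out) {
    unsigned my_port, a, b, c, d, their_port;
    int n = std::sscanf(line, "%u:%u.%u.%u.%u:%u",
                        &my_port, &a, &b, &c, &d, &their_port);
    if (n != 6) {
        return false; // not all six fields were present
    }
    if (a > 255 || b > 255 || c > 255 || d > 255 ||
        my_port > 65535 || their_port > 65535) {
        return false; // a field is out of range
    }
    out.my_port    = static_cast<uint16_t>(my_port);
    out.their_ip   = (a << 24) | (b << 16) | (c << 8) | d;
    out.their_port = static_cast<uint16_t>(their_port);
    return true;
}
```

The first line of the file (the bare local address) fails this parser by design, so it has to be read separately, as udp_setup.cpp does.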

ngdxzy commented 1 year ago

I just checked the xilinx_xup_project and, somewhat to my surprise, it seems we are using the same offset addresses. Maybe the official repository has been updated? I ran my code with the 2023.1 version recently and it worked well, so we may not need to worry about the offset addresses.

surangamh commented 1 year ago

@ngdxzy You are right. The register offsets were changed in the 2021.2 update, and they haven't been changed since then. Based on your code, it appears you switched to using the XRT native API to write and read registers instead of OpenCL, if I understand correctly.
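
The register access pattern mentioned here (`write_register` / `read_register` on an `xrt::ip` object, as in oct_fpga.hpp) can be exercised without hardware by templating on the IP type. This is only a sketch: `FakeIp` is a hypothetical stand-in that mimics those two calls, and `trigger_arp` is a hypothetical name for the 0, 1, 0 pulse that `arp_discover()` performs. `ARP_DISCOVERY` is copied from the header.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

constexpr uint32_t ARP_DISCOVERY = 0x1010; // from oct_fpga.hpp

// Hypothetical stand-in for xrt::ip. It records every write, and a read
// returns the last value written to that offset (0 if never written).
struct FakeIp {
    std::map<uint32_t, uint32_t> regs;
    std::vector<std::pair<uint32_t, uint32_t>> log;

    void write_register(uint32_t offset, uint32_t value) {
        regs[offset] = value;
        log.emplace_back(offset, value);
    }
    uint32_t read_register(uint32_t offset) const {
        auto it = regs.find(offset);
        return it == regs.end() ? 0u : it->second;
    }
};

// Pulse the ARP discovery register 0 -> 1 -> 0.
// Works with any type that offers write_register(offset, value).
template <typename Ip>
void trigger_arp(Ip& ip) {
    ip.write_register(ARP_DISCOVERY, 0);
    ip.write_register(ARP_DISCOVERY, 1);
    ip.write_register(ARP_DISCOVERY, 0);
}
```

On a real system the same template would be called with an `xrt::ip` constructed as in the header; the fake only makes the write sequence visible for inspection.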

ngdxzy commented 1 year ago

@surangamh Yes, I am not very familiar with OpenCL. The Xilinx XRT native API is more straightforward to me.

surangamh commented 1 year ago

@ngdxzy I don't think OpenCL works with the latest version of XRT. @hecmay Please be sure to use XRT native API as shown in @ngdxzy's code when you access registers.

surangamh commented 1 year ago

@hecmay I've created a new branch called '2023.1' that works well with Vitis 2023.1. Feel free to test it out.