fpgasystems / Coyote

Framework providing operating system abstractions and a range of shared networking (RDMA, TCP/IP) and memory services to common modern heterogeneous platforms.
MIT License

set data_width=8,network hls error #21

Closed crizy closed 1 year ago

crizy commented 1 year ago

After setting set(DATA_WIDTH 8 CACHE STRING "Data width"), the HLS integration of the network stack reports an error when I make the shell. The error message is as follows:

(1) ERROR: [HLS 200-70] Compilation errors found:
In file included from hw/services/network/hls/ip_handler/ip_handler.cpp:1:
hw/services/network/hls/ip_handler/ip_handler.cpp:539:2: error: no matching function for call to 'ip_handler_compute_ipv4_checksum'
    ip_handler_compute_ipv4_checksum(ipDataMetaFifo, ipDataCheckFifo, iph_subSumsFifoOut);
    ^~~~~~~~
(2) hw/services/network/hls/ip_handler/ip_handler.cpp:656:2: note: in instantiation of function template specialization 'ip_handler<64>' requested here
    ip_handler(s_axis_raw,
(3) hw/services/network/hls/ip_handler/../ipv4/ipv4.hpp:361:6: note: candidate function not viable: no known conversion from 'hls::stream<net_axis<64> >' to 'hls::stream<net_axis<512> > &' for 1st argument;
    void ip_handler_compute_ipv4_checksum( hls::stream<net_axis<512> >& dataIn,
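The note in (3) says the only candidate takes a stream of net_axis<512>, while ip_handler<64> passes a stream of net_axis<64>. A minimal sketch of that class of error, using hypothetical stand-in types (net_axis_sketch, stream_sketch) instead of the real HLS headers, so it is an illustration of the mismatch and not Coyote's code:

```cpp
#include <cstddef>
#include <queue>

// Hypothetical stand-ins for net_axis<W> and hls::stream<T>; not the HLS types.
template <int W>
struct net_axis_sketch {
    unsigned long long data;  // payload placeholder, not a real W-bit word
    bool last;
};

template <typename T>
using stream_sketch = std::queue<T>;

// Fixed-width stage: binds only to 512-bit streams. Passing a
// stream_sketch<net_axis_sketch<64>> here fails with "no matching function".
inline std::size_t count_words_fixed(stream_sketch<net_axis_sketch<512>>& in) {
    std::size_t n = 0;
    while (!in.empty()) { in.pop(); ++n; }
    return n;
}

// Width-generic stage: WIDTH is deduced from the argument, so 64-bit and
// 512-bit streams both bind.
template <int WIDTH>
std::size_t count_words_generic(stream_sketch<net_axis_sketch<WIDTH>>& in) {
    std::size_t n = 0;
    while (!in.empty()) { in.pop(); ++n; }
    return n;
}
```

Calling count_words_fixed with a 64-bit stream would reproduce the "no known conversion" note, whereas count_words_generic accepts either width. Whether a width-generic checksum stage exists in the stack is something to check in the sources.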

d-kor commented 1 year ago

Are you explicitly setting this DATA_WIDTH parameter? This is not one of the parameters that are exported to higher levels. Coyote currently supports only boards that are able to run at 100+G, which in turn implies that only a DATA_WIDTH of 64 (bytes) is supported.

crizy commented 1 year ago

Thank you very much for your reply. Yes, I am changing the DATA_WIDTH parameter from 64 to 8 at the bottom level. I have no boards that support 100G networking, only 10G boards, so I want to try it on a 10G board.

d-kor commented 1 year ago

We dropped support for older boards with 10G interfaces a long time ago, especially as Coyote mostly targets data center deployments, but it shouldn't take a lot of effort to bring this back. In theory only a simple width conversion is necessary. What setup are you using?

crizy commented 1 year ago

Thank you very much for your reply. I am trying to build a network application. You are right: I was looking through some of the scripts in the project and found that I need to modify some of them, which I am now trying to do.

crizy commented 1 year ago

@d-kor hi, I have now changed some scripts and found that hw/services/network can generate the corresponding 10G HLS IP core. The changes are as follows:

(1) In lines 50~51 of Coyote/hw/CMakeLists.txt:
set(DATA_WIDTH 8 CACHE STRING "Data width")
set(CLOCK_PERIOD 6.4 CACHE STRING "Clock period.")

(2) In lines 15~20 of Coyote/hw/services/network/CMakeLists.txt:
set(NETWORK_BANDWIDTH 10 CACHE STRING "Network bandwidth.")
# set bandwidth
set(NETWORK_INTERFACE 10 CACHE STRING "Network interface.")
set(DATA_WIDTH 8 CACHE STRING "Data width")
set(CLOCK_PERIOD 6.4 CACHE STRING "Clock period.")

Added at line 36:
add_subdirectory(hls/ethernet_frame_padding)

(3) In lines 534~535 of Coyote/hw/services/network/hls/ip_handler/ip_handler.cpp:
    //ip_handler_rshiftWordByOctet<net_axis, WIDTH, 1>(((ETH_HEADER_SIZE%WIDTH)/8), ipv4ShiftFifo, ipDataFifo);
    //ip_handler_rshiftWordByOctet<net_axis, WIDTH, 3>(((ETH_HEADER_SIZE%WIDTH)/8), ipv6ShiftFifo, ipv6DataFifo);
    rshiftWordByOctet<net_axis, WIDTH, 1>(((ETH_HEADER_SIZE%WIDTH)/8), ipv4ShiftFifo, ipDataFifo);
    rshiftWordByOctet<net_axis, WIDTH, 3>(((ETH_HEADER_SIZE%WIDTH)/8), ipv6ShiftFifo, ipv6DataFifo);

and in line 539:
    //ip_handler_compute_ipv4_checksum(ipDataMetaFifo, ipDataCheckFifo, iph_subSumsFifoOut);
    compute_ipv4_checksum(ipDataMetaFifo, ipDataCheckFifo, iph_subSumsFifoOut);

(4) In line 128 of Coyote/hw/scripts/wr_hdl/template_gen/lynx_pkg.txt:
parameter integer AXI_NET_BITS = 64;

I don't know whether the network HLS IP is generated correctly after these changes, or whether the interface data width of the network protocol stack is now 64 bits throughout.
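As a sanity check on the numbers above, the pairing of DATA_WIDTH 8 (bytes) with CLOCK_PERIOD 6.4 (ns) works out to a raw bus rate of 10 Gbit/s. A small sketch of that arithmetic follows; the helper names are made up for illustration and the calculation ignores protocol overhead:

```cpp
// Raw bus throughput in Gbit/s. One bit per nanosecond equals 1 Gbit/s,
// so the rate is (bytes per beat * 8 bits) / clock period in ns.
constexpr double bus_gbps(int data_width_bytes, double clock_period_ns) {
    return data_width_bytes * 8.0 / clock_period_ns;
}

// Longest clock period (in ns) at which a bus of the given width can
// still carry the target rate in Gbit/s.
constexpr double max_period_ns(int data_width_bytes, double target_gbps) {
    return data_width_bytes * 8.0 / target_gbps;
}
```

With these helpers, bus_gbps(8, 6.4) is about 10, and max_period_ns(64, 100.0) is about 5.12, which is only the arithmetic bound for a 64-byte bus at 100 Gbit/s and not a statement about the clock Coyote actually uses.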

d-kor commented 1 year ago

You will need to convert this to a 64-byte bus at some point. Ideally, you should change network_module.sv to sink/source a 64-byte bus. For this you can use AXI4-Stream data width converters, which can be instantiated as IP cores in your design.
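To make the width conversion concrete, here is a behavioral sketch of the 8-byte to 64-byte direction in plain C++. It is not the data width converter IP and not HLS code; Beat64, Beat512, and upsize are hypothetical names, and the keep/last handling is an assumption about how a packet-aware upsizer would typically behave:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-byte beat coming from the 10G side.
struct Beat64 {
    std::uint64_t data;
    std::uint8_t keep;  // one bit per valid byte
    bool last;          // marks the final beat of a packet
};

// Hypothetical 64-byte beat going towards the network stack.
struct Beat512 {
    std::array<std::uint64_t, 8> data{};
    std::uint64_t keep = 0;  // one bit per valid byte, lane 0 in the low bits
    bool last = false;
};

// Pack up to eight narrow beats into one wide beat. A wide beat is emitted
// when all eight lanes are full or when a narrow beat carries last.
// Trailing beats of an unfinished packet (no last seen) are not emitted.
std::vector<Beat512> upsize(const std::vector<Beat64>& in) {
    std::vector<Beat512> out;
    Beat512 cur;
    std::size_t lane = 0;
    for (const Beat64& b : in) {
        cur.data[lane] = b.data;
        cur.keep |= static_cast<std::uint64_t>(b.keep) << (8 * lane);
        ++lane;
        if (b.last || lane == 8) {
            cur.last = b.last;
            out.push_back(cur);
            cur = Beat512{};
            lane = 0;
        }
    }
    return out;
}
```

For example, a nine-beat packet yields two wide beats: the first with all 64 keep bits set and last cleared, the second with only lane 0 populated and last set. The opposite direction (64-byte to 8-byte) would split each wide beat back into lanes and move last to the final valid lane.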

crizy commented 1 year ago

Hi @d-kor, as you suggested, I added 512-to-64 and 64-to-512 data width conversion IPs to the network_module.sv module (10G IP and interface). I then built the perf_rdma_host and perf_rdma_card examples, and a bit file was generated successfully. How can I test these two examples?

crizy commented 1 year ago

Hi @d-kor, when I make examples/perf_rdma, the following errors are reported:

sw/src/cService.cpp:257:75: error: expected primary-expression before ‘(’ token
    el.second->scheduleTask(std::unique_ptr(new cTask(int32_t 0, int32_t 0, uint32_t 1, taskIter->second, msg)));
    ^
/home/ubuntu/Coyote_lasttest/pro_2022/Coyote_bit/sw/src/cService.cpp:257:80: error: expected type-specifier before ‘cTask’
    el.second->scheduleTask(std::unique_ptr(new cTask(int32_t 0, int32_t 0, uint32_t 1, taskIter->second, msg)));

d-kor commented 1 year ago

Which C++ version are you using? It looks like you are stuck on an older version. As stated in the guide, you should have at least C++17 installed.
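Note that the compiler version and the language standard it compiles for are separate things: GCC 7 defaults to gnu++14 and GCC 11 to gnu++17 unless the build passes -std itself. A quick way to see which standard a given compiler invocation actually uses is to inspect the __cplusplus macro. The helper names below are made up for illustration:

```cpp
#include <string>

// Map a value of the __cplusplus macro to a readable standard name.
inline std::string cpp_standard_name(long v) {
    if (v >= 202002L) return "C++20 or newer";
    if (v >= 201703L) return "C++17";
    if (v >= 201402L) return "C++14";
    if (v >= 201103L) return "C++11";
    return "pre-C++11";
}

// True when this translation unit is being compiled as C++17 or newer.
inline bool has_cpp17() { return __cplusplus >= 201703L; }
```

Printing cpp_standard_name(__cplusplus) from a small test program compiled with the same compiler and flags as the software build shows whether the C++17 requirement is met.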

crizy commented 1 year ago

Hi @d-kor, thank you very much for your response. After I replaced gcc 7.5 with gcc 11, the perf_rdma_sw build succeeded. I would like to ask how to perform board-level testing and verification for the RDMA examples. What do the build_perf_rdma_card_hw and build_perf_rdma_host_hw examples represent?

crizy commented 1 year ago

Hi @d-kor, at present I have encountered several problems:

(1) When I build the HW RDMA host perf example, the project finally generates a bit file, but there are critical warnings:
inst_roce_stack/rocev2_inst has unconnected pin s_axis_rx_data_TSTRB[0]~[63]
inst_roce_stack/rocev2_inst has unconnected pin s_axis_mem_read_dataTSTRB[28]~[63]

(2) When I check the generated RoCE stack HLS output, the axis rx/tx data signals of the RoCE IP have an additional TSTRB signal, but there is no TSTRB signal at the network stack top level of the project.

(3) I tried to run the HLS csim of RoCE, but the simulation failed.

(4) When I make the driver with gcc 11, coyote_drv.ko is generated, but with the following warnings:
warning: ‘init_module’ specifies less restrictive attribute than its target ‘coyote_init’: ‘cold’ [-Wmissing-attributes]
warning: ‘cleanup_module’ specifies less restrictive attribute than its target ‘coyote_exit’: ‘cold’ [-Wmissing-attributes]

When I make the driver with gcc 7.5, coyote_drv.ko is generated with no warnings.

This seems to be related to the Linux kernel. My system kernel version is /usr/src/linux-headers-4.18.0-15-generic and the kernel's gcc version is 7.3.0. Should I use that version of gcc to compile the driver?
d-kor commented 1 year ago

(1) and (2) These warnings are fine, the strobe interface is not used. (3) Not really sure what fails here, will have to check, we simulated RoCE fine on our side. (4) This is kernel version related, the warnings shouldn't be critical.

The two examples are used to measure the throughput and latency performance of base RDMA operations (READ and WRITE). To run these benchmarks you will need at least two machines connected over some form of 100G (or in your case 10G) link.

crizy commented 1 year ago

Hi @d-kor, thank you very much for your answer.

(1) The steps of my RoCE simulation with Vitis 2022.1 are as follows:
cd hls/rocev2 && mkdir build && cd build
cmake .. -DFPGA_PART=
make csim/csim.rocev2
The following error is reported:
test_ib_transport.cpp:75:5: error: no matching function for call to ‘ib_transport_protocol<DATA_WIDTH, 1>....

(2) About board-level testing and verification for the RDMA examples: as you said, the idea is to form a network loop for testing. Assume there are two hosts, host0 and host1:
a. host0 + FPGA (with the bit file of build_perf_rdma_host_hw or build_perf_rdma_card_hw) <-> 100/10G optical network cable <-> NIC with RDMA (or NIC without RDMA + software RDMA stack) + host1
b. host0 + FPGA (with the bit file of build_perf_rdma_host_hw) <-> 100/10G optical network cable <-> FPGA (with the bit file of build_perf_rdma_card_hw) + build_perf_rdma_card_sw + host1
Which of the two setups is right?

crizy commented 1 year ago

Hi @d-kor, when I run the perf_tcp application, the following error is reported:
build_perf_tcp_sw$ sudo ./main
terminate called after throwing an instance of 'std::runtime_error'
what(): Local IP address not provided
Aborted