Implementation of ZynqNet in ZCU102

Hello, I'm trying to implement ZynqNet on the Zynq UltraScale+ MPSoC. I have successfully generated the IP module of fpga_top in HLS and synthesized the project in Vivado, taking into account the requirements for the AXI ports (32 bits) and the clock frequency (100 MHz). I am using PetaLinux to run the application generated in SDK (.elf), together with the files indata.bin and weights.bin.

This is where the issue appears: I am having the same problem as in #43. The result at the output is NaN. I have read the registers for the AXI port width (read and write channels) and the clock frequency in the FPGA while PetaLinux is running, and everything is correct: the ports are set to 32 bits and the clock frequency to 100 MHz. I have added an ILA to the block design in Vivado to capture the data on the AXI port, and all the reads are correct. On the write channel, the first data of the first layer are correct (compared with the HLS model), but at some point the data become only NaN. In some cases all the data are NaN, including the locations where the results of the first layer's convolution are stored.

Am I missing some important detail for implementing ZynqNet, or should I take something else into account because the Zynq UltraScale+ MPSoC is a 64-bit device? Thanks a lot.
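To pin down where the output first turns into NaN, it can help to dump the output buffer to a file and compare it with a dump from the HLS model on the host. Below is a minimal sketch of such a check. It assumes both dumps are raw little-endian float32 files; the helper names (`read_floats`, `first_nan`, `first_mismatch`) and the tolerance are my own choices for illustration and are not part of ZynqNet.

```python
import math
import struct


def read_floats(path):
    """Read a raw little-endian float32 dump into a list (assumed dump format)."""
    with open(path, "rb") as f:
        raw = f.read()
    count = len(raw) // 4  # ignore trailing bytes that do not form a full float
    return list(struct.unpack("<%df" % count, raw[:count * 4]))


def first_nan(values):
    """Return the index of the first NaN, or None if there is none."""
    for i, v in enumerate(values):
        if math.isnan(v):
            return i
    return None


def first_mismatch(actual, expected, tol=1e-4):
    """Return the first index where actual differs from expected, or None.

    A NaN on only one side counts as a mismatch. The tolerance is relative
    for large values and absolute for values below 1.0 (hypothetical choice).
    """
    for i, (a, e) in enumerate(zip(actual, expected)):
        if math.isnan(a) != math.isnan(e):
            return i
        if not math.isnan(a) and abs(a - e) > tol * max(1.0, abs(e)):
            return i
    return None


if __name__ == "__main__":
    # Synthetic data standing in for the HLS reference and the board output.
    expected = [0.5, 1.25, -3.0, 2.0]
    actual = [0.5, 1.25, float("nan"), float("nan")]
    print("first NaN at index:", first_nan(actual))
    print("first mismatch at index:", first_mismatch(actual, expected))
```

Multiplying the reported index by 4 gives the byte offset into the buffer, which can then be matched against the write addresses captured with the ILA.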

The execution time of each convolutional layer is between 10 and 50 ms, and the global pool takes 77 ms.

