dgschwend / zynqnet

Master Thesis "ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network"
GNU General Public License v3.0

Questions about the results of output on FPGA #43

Open lishen565 opened 6 years ago

lishen565 commented 6 years ago

Hi, I've run ZynqNet on a Xilinx ZC706 successfully. However, the top-5 result puzzles me, even though I've set the AXI HP port to 32 bits. Have you seen this before? Thanks a lot. [image]

achiless123 commented 6 years ago

@lishen565 Hi lishen565, I'm also interested in using this net on a Zynq board, but I've run into a lot of difficulties. Maybe we can communicate with each other; I really need your help. (Maybe in Chinese? I'm not good at English.)

lishen565 commented 6 years ago

Hi, I now doubt whether ZynqNet is really running successfully on the Xilinx ZC706. I checked the code and I'm not sure the FPGA has finished its computation. Do I need to regenerate a new fsbl.elf based on the exported hardware in Xilinx SDK? Or is fsbl.elf related to the bitstream in some way?

dgschwend commented 6 years ago

Hi! That looks almost like random data being interpreted as floats... Regarding the FPGA having finished its computation: it seems like the layers take ~140 ms; that's the actual runtime between the start signal being asserted and the busy signal going away, so there is something happening... Can you compare the contents of SHARED_DRAM before and after the run? Maybe initialize it to zero and then check that all layers produce some results?
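
The before/after comparison suggested here can be sketched offline, assuming the two SHARED_DRAM snapshots have been saved as raw byte dumps of equal length and that the floats are little-endian 32-bit values. The helper names `changed_words` and `as_floats` are made up for this illustration and are not part of the ZynqNet code.

```python
import struct

def changed_words(before: bytes, after: bytes, word_size: int = 4):
    """Return the indices of words that differ between two equal-length dumps."""
    if len(before) != len(after):
        raise ValueError("dumps must have the same length")
    return [
        i // word_size
        for i in range(0, len(before) - word_size + 1, word_size)
        if before[i:i + word_size] != after[i:i + word_size]
    ]

def as_floats(raw: bytes):
    """Interpret raw bytes as little-endian 32-bit floats (assumed layout)."""
    count = len(raw) // 4
    return list(struct.unpack("<%df" % count, raw[:count * 4]))

# Hypothetical data: a zero-initialised 16-byte region in which the
# accelerator wrote the float 1.5 into word 2.
before = bytes(16)
after = bytes(8) + struct.pack("<f", 1.5) + bytes(4)
print(changed_words(before, after))  # [2]
print(as_floats(after))              # [0.0, 0.0, 1.5, 0.0]
```

If a layer's output region shows no changed words after a run on a zero-initialised buffer, that layer probably never wrote its results.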

dgschwend commented 6 years ago

Regarding the FSBL: I think the AXI width is set up there, so you would need to generate a customized FSBL for the bitstream. Or set the AXI width via the appropriate registers from the running Linux. How are you doing it now?

lishen565 commented 6 years ago

Hi, I've compared the contents of SHARED_DRAM_DATA in Xilinx SDK at three points:
1) after calling allocate_DRAM_memory (the function that first initializes SHARED_DRAM_DATA);
2) after calling copy_input_image_to_DRAM (the function that puts indata.bin into SHARED_DRAM_DATA);
3) after calling copy_results_from_DRAM.

They differ, as you guessed. I changed the iteration count from 100 to 1 in order to get the output quickly, and I set breakpoints in SDK at the places where I compared the contents of SHARED_DRAM_DATA. The result is now as below: [image]

Then I ran it from the command line in a serial terminal, and the output differs from the SDK run: [image]

So I guess my change of the AXI_HP port from 64 bit to 32 bit in the Vivado block design didn't take effect. Maybe I should generate a customized FSBL for the bitstream. By the way, could you tell me which command can configure the AXI_HP port registers while PetaLinux is running on the ZC706, without a customized FSBL?

dgschwend commented 6 years ago

See e.g. https://github.com/dgschwend/zynqnet/issues/2#issuecomment-266667085

dgschwend commented 6 years ago

The configuration of the AXI bus width might also be possible using the /sys/class/xdevcfg driver. We recently used that to configure the clock rate in another Zynq project. I'll attach the corresponding files. YMMV... [attachment: setup_fpga.txt]

dgschwend commented 6 years ago

Just found how to change the AXI bus width from running Linux (without recompiling FSBL): https://github.com/RedPitaya/RedPitaya/issues/89#issuecomment-267846075

lishen565 commented 6 years ago

Hi, I've debugged ZynqNet in Xilinx SDK and found something puzzling in the memory view. The original indata.bin in memory is shown below: [image] After calling copy_results_from_DRAM(results, ch_out), "results" in memory looks like this: [image] It seems that the FPGA did nothing with the upper 32 bits of the data. The 32-bit S_AXI_HP0 port width is set as below: [image] Do you have any idea how to solve this problem? Thanks.
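
The symptom described here (only one half of each 64-bit word being touched) can be checked mechanically on two memory dumps. The sketch below assumes little-endian layout, so the "upper" 32 bits of each 64-bit beat sit at byte offset +4; `lane_activity` is a hypothetical helper name, not something from the ZynqNet sources.

```python
def lane_activity(before: bytes, after: bytes):
    """Count changed 32-bit words in the low and high half of each 64-bit beat."""
    if len(before) != len(after) or len(before) % 8:
        raise ValueError("dumps must be equal length and a multiple of 8 bytes")
    low = high = 0
    for off in range(0, len(before), 8):
        if before[off:off + 4] != after[off:off + 4]:
            low += 1
        if before[off + 4:off + 8] != after[off + 4:off + 8]:
            high += 1
    return low, high

# Hypothetical dump: four 64-bit beats in which only the low word was written.
before = bytes(32)
after = b"".join(bytes([n + 1, 0, 0, 0]) + bytes(4) for n in range(4))
print(lane_activity(before, after))  # (4, 0)
```

A result such as `(N, 0)` over a large region would be consistent with a 32-bit master talking to a port that is still configured for 64-bit transfers, although other causes are possible.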

dgschwend commented 6 years ago

Looks like a 32b/64b AXI width problem to me, too. Have you re-synthesized, exported to the Xilinx SDK, and built the FSBL after setting the AXI width to 32 bit? And are you booting this new FSBL + kernel now? I would recommend you try to read the corresponding registers from the running Linux (see my latest post) to see if it's set to 32b. And if needed, you can set the AXI width from the running Linux.

lishen565 commented 6 years ago

Hi, I have used the mmap function in cpu_top.c to configure the AXI_HP port from 64 bit to 32 bit, and I set the FPGA computing iteration count from 100 to 1. But the results are not the same every time; they fluctuate around 88.38. Could you tell me whether that is correct, and why it varies? One run takes about 5000+ ms on the ZC706, whose fpga_clk may be configured to 100 MHz (I'm not very sure about my config). How long does it take on your Zynqbox, and how fast is your fpga_clk? If I want to set the fpga_clk the same way I configured the AXI_HP port, which register address should I write? It was mentioned in [image]. Thanks a lot.
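
For readers who want to try the mmap approach, here is a minimal Python sketch of the read-modify-write involved. The addresses 0xF8008000 and 0xF8008014 are the ones reported further down this thread; treating bit 0 as the 32-bit enable is an assumption from memory of the Zynq-7000 TRM (UG585) and should be checked there. The constant and function names are illustrative only and do not come from cpu_top.c.

```python
import mmap
import os
import struct

AFI0_BASE = 0xF8008000   # page containing both registers named in this thread
RDCHAN_CTRL = 0x00       # 0xF8008000 (read channel control)
WRCHAN_CTRL = 0x14       # 0xF8008014 (write channel control)
EN_32BIT = 0x1           # ASSUMPTION: bit 0 selects 32-bit mode, verify in UG585

def read_reg(mem, offset):
    """Read one little-endian 32-bit value from a buffer or mapping."""
    return struct.unpack_from("<I", mem, offset)[0]

def write_reg(mem, offset, value):
    """Write one little-endian 32-bit value into a buffer or mapping."""
    struct.pack_into("<I", mem, offset, value)

def enable_32bit(mem):
    """Set the assumed enable bit in both registers and return the read-back values."""
    for offset in (RDCHAN_CTRL, WRCHAN_CTRL):
        write_reg(mem, offset, read_reg(mem, offset) | EN_32BIT)
    return read_reg(mem, RDCHAN_CTRL), read_reg(mem, WRCHAN_CTRL)

def open_afi0():
    """Map the register page from /dev/mem (needs root on the target board)."""
    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    try:
        return mmap.mmap(fd, mmap.PAGESIZE, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE, offset=AFI0_BASE)
    finally:
        os.close(fd)

# Dry run on an ordinary buffer, so the bit manipulation can be checked
# without hardware. On the board one would call enable_32bit(open_afi0()).
fake = bytearray(0x20)
print(enable_32bit(fake))  # (1, 1)
```

Python does not guarantee that each register is accessed as a single 32-bit bus transaction; a C version using a `volatile uint32_t` pointer into the mapping gives that guarantee and is probably the safer choice for real hardware.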

dgschwend commented 6 years ago

Congratulations, seems like it's almost running.

The result should be deterministic and should not fluctuate. It might have to do with timing... For what clock speed did you synthesize the design? Did you reach timing closure?

I don't know the register address for the clock (FCLK0?). But you can certainly find it in one of the Zynq datasheets.
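
Since the output is expected to be deterministic, a quick way to quantify the problem is to collect the result vector from several runs and compare them. This is only a sketch: `compare_runs` is a hypothetical helper, and the sample values other than the 88.38 reported above are invented.

```python
import math

def compare_runs(runs, tol=0.0):
    """Compare result vectors from repeated runs against the first run."""
    reference = runs[0]
    has_nan = any(math.isnan(x) for run in runs for x in run)
    max_diff = 0.0
    for run in runs[1:]:
        for a, b in zip(reference, run):
            if not (math.isnan(a) or math.isnan(b)):
                max_diff = max(max_diff, abs(a - b))
    return {
        "has_nan": has_nan,
        "max_diff": max_diff,
        "deterministic": not has_nan and max_diff <= tol,
    }

# Two invented runs whose first score drifts, as described in this thread.
report = compare_runs([[88.38, 1.0], [88.41, 1.0]])
print(report["deterministic"])  # False
```

Any nonzero `max_diff` between runs on identical input points at the hardware side (timing, clocking, bus configuration) rather than at the network itself.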

lishen565 commented 6 years ago

Hi, I've set FCLK_CLK0 to 100 MHz; one run then takes about 5000+ ms (the for loop runs 1 iteration, not 100). [image] I've searched the Zynq datasheet, but found no address information for FCLK_CLK0. Another question: we have run the ZynqNet model with Caffe on a Raspberry Pi, which takes only about 0.3 s, but the code in _FIRMWARE takes about 30 s on the same Raspberry Pi. Could you tell me why the difference is so large on the same hardware platform?

lishen565 commented 6 years ago

If I set FCLK_CLK0 to 200 MHz, Vivado fails in implementation. The error is about the slack, just as mentioned in #2.

dgschwend commented 6 years ago

The register addresses are here: http://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf But it's not very easy to set the clocks via registers (see e.g. http://eliaskousk.teamdac.com/entry/fsbl-changes-needed-or-not-week-6-of-gsoc-2016). Maybe you should rather try to set the clock via /sys/class/xdevcfg (the device config driver) as I mentioned in https://github.com/dgschwend/zynqnet/issues/43#comment-380746633.

dgschwend commented 6 years ago

Regarding execution speed of the C code: this is an HLS description of an FPGA design (plus an executable test-bench), and not C code optimized for runtime. There is no efficient use of the cache, no parallelization, no vectorization, ...

lishen565 commented 6 years ago

Hi, I have tried setting the FPGA0_CLK_CTRL register and found that the FPGA clock might be 50 MHz after PetaLinux boots (even though I had set it to 100 MHz in my Vivado block design), and ZynqNet ran in about 5000+ ms (with the 100 MHz setting). After setting it to 200 MHz via the register, a run takes about 2100+ ms, though the result is NaN.

Another puzzle still concerns enabling 32-bit mode (64 bit -> 32 bit) by setting 0xf8008000 and 0xf8008014. I configure the registers at the entry of cpu_top.c using mmap. After the mmap and setting the enable bit (I read it back after setting it and verified that the new value is right), I found that it does not always work: the first few times the result is always NaN. Once a result is correct, subsequent results are correct. Do you know why this happens? Thanks.
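
To see which frequency a given FPGA0_CLK_CTRL value corresponds to, the register can be decoded into its two divisors. The sketch below assumes the field layout as I recall it from UG585 (DIVISOR0 in bits 13:8, DIVISOR1 in bits 25:20) and a 1000 MHz source PLL; both are assumptions to verify for the actual board, and the function names are illustrative.

```python
DIV0_SHIFT, DIV1_SHIFT, DIV_MASK = 8, 20, 0x3F  # ASSUMED field layout, check UG585

def fclk_mhz(reg_value: int, pll_mhz: float) -> float:
    """Decode the two divisors and return the resulting clock in MHz."""
    divisor0 = (reg_value >> DIV0_SHIFT) & DIV_MASK
    divisor1 = (reg_value >> DIV1_SHIFT) & DIV_MASK
    if divisor0 == 0 or divisor1 == 0:
        raise ValueError("a divisor of zero is not a valid setting")
    return pll_mhz / divisor0 / divisor1

def with_divisors(reg_value: int, divisor0: int, divisor1: int) -> int:
    """Return reg_value with both divisor fields replaced, other bits kept."""
    cleared = reg_value & ~((DIV_MASK << DIV0_SHIFT) | (DIV_MASK << DIV1_SHIFT))
    return cleared | (divisor0 << DIV0_SHIFT) | (divisor1 << DIV1_SHIFT)

PLL_MHZ = 1000.0  # ASSUMPTION: source PLL frequency, board and boot dependent

print(fclk_mhz(0x00400500, PLL_MHZ))          # 50.0  (divisors 5 and 4)
print(fclk_mhz(0x00100A00, PLL_MHZ))          # 100.0 (divisors 10 and 1)
print(hex(with_divisors(0x00400500, 5, 1)))   # 0x100500, i.e. 200 MHz
```

Reading the register back and decoding it this way is one way to confirm whether the clock really is at 50 MHz after boot, independent of what the Vivado block design requested. Note that a clock faster than the one the design met timing for can by itself produce garbage such as NaN.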

Gengyuling commented 6 years ago

@lishen565 Excuse me, I first ran into the same problem you described, but when I set 0xf8008000 and 0xf8008014, the result became NaN. Do you know the reason? By the way, I had set my project to 100 MHz in my Vivado block design. How can I fix this? Thank you very much.

PSlearner commented 6 years ago

Hello, did you get your ZynqNet project running? My result is also NaN, and I have no idea how to debug it.

wangj346 commented 6 years ago

Resetting the clock frequency can help, but I still cannot get it fully working. Maybe it is a bug in the code.

PSlearner commented 6 years ago

I use the devmem command to write the register to reset the clock frequency, but I find that the register cannot be written successfully.
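
One thing worth ruling out, though it is not confirmed anywhere in this thread, is the SLCR write protection: on Zynq-7000 the SLCR registers ignore writes while locked, and are unlocked by writing the key 0xDF0D to SLCR_UNLOCK (0xF8000008). Whether the SLCR is actually locked on a running PetaLinux system depends on the kernel, so this may not be the cause. The sketch below only builds the command sequence; the FPGA0_CLK_CTRL address 0xF8000170 and the example value are assumptions to check against UG585, and the helper names are hypothetical.

```python
SLCR_BASE = 0xF8000000
SLCR_LOCK = SLCR_BASE + 0x004
SLCR_UNLOCK = SLCR_BASE + 0x008
FPGA0_CLK_CTRL = SLCR_BASE + 0x170   # ASSUMED offset, verify in UG585
UNLOCK_KEY = 0xDF0D
LOCK_KEY = 0x767B

def clock_write_sequence(new_value: int):
    """Return (address, value) pairs: unlock, write the clock register, re-lock."""
    return [
        (SLCR_UNLOCK, UNLOCK_KEY),
        (FPGA0_CLK_CTRL, new_value),
        (SLCR_LOCK, LOCK_KEY),
    ]

def as_devmem_commands(sequence):
    """Format the pairs as busybox-style 'devmem ADDRESS WIDTH VALUE' lines."""
    return ["devmem 0x%08X 32 0x%08X" % (addr, value) for addr, value in sequence]

# 0x00100500 is only an example value, not a recommended setting.
for line in as_devmem_commands(clock_write_sequence(0x00100500)):
    print(line)
```

After the sequence, reading FPGA0_CLK_CTRL back with `devmem` shows whether the write took effect.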

wangj346 commented 6 years ago

I ran into the same problem; I advise you to ask the author.

PSlearner commented 6 years ago

OK. Thank you very much

wangj346 commented 6 years ago

I heard that some people have adjusted the clock frequency successfully using PetaLinux, which has a command along the lines of "set clock". You can try it. Good luck.

PSlearner commented 6 years ago

I used PetaLinux to package the Linux kernel. I will give it a try!

PXT846765038 commented 5 years ago

Did you get it running in the end? With the CPU I get correct results, but with the FPGA everything is NaN.

PSlearner commented 5 years ago

In the end my results were also all NaN. I suspect the problem is not the clock or the HP port width.

wangj346 commented 5 years ago

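As I recall, I adjusted the signals in the Vivado project, as well as the clock frequency, and that eliminated the NaNs. But it was a long time ago and I don't remember it clearly; you will need to work it out yourself.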

ALEX5874 commented 3 years ago

Hi, I am working on improving ZynqNet and have run into some of the same questions you had. Can you give me some advice?