andre1araujo / YOLO-on-PYNQ-Z2

This repository contains all the necessary material to implement a YOLOv3 object detection algorithm on the PYNQ-Z2 FPGA. There is a step-by-step tutorial associated so everyone can do it.
https://andre-araujo.gitbook.io/yolo-on-pynq-z2/
Apache License 2.0

Mobaxterm issue #10

Closed MaRcOsss1 closed 1 month ago

MaRcOsss1 commented 4 months ago

[screenshot]

Why am I seeing this?

Also, when I tried to log in, it showed this: [screenshot]

After entering the password, though, I ended up at this window: [screenshot]

root@pynqz2_dpu1:~# ls
install.sh  pkgs  samples
root@pynqz2_dpu1:~# cd zynq7020_dnndk_v3.1
-sh: cd: zynq7020_dnndk_v3.1: No such file or directory
root@pynqz2_dpu1:~# ./install.sh
-sh: ./install.sh: Permission denied

andre1araujo commented 4 months ago

Hi! It looks like the SD card image you are using does not include the DNNDK drivers. They live in the zynq7020_dnndk_v3.1 folder, which I don't see in your listing. You can download it from this Google Drive. Then drag the folder to the left side of MobaXterm (the file browser) and it should show up on the board, ready to be accessed. If permission is denied, run chmod +x install.sh inside that folder and try again. Good luck!

P.S.: I mentioned in the tutorial that you might need to enter the password "root" twice, as MobaXterm for some reason doesn't accept it the first time.
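
The "Permission denied" message in the session above usually means that install.sh is missing its execute bit. A minimal sketch of a check-and-fix, assuming a POSIX shell on the board; ensure_executable is a hypothetical helper name and is not part of DNNDK:

```shell
#!/bin/sh
# Hypothetical helper: make a script executable if it is not already.
# Returns 1 if the file does not exist or chmod fails.
ensure_executable() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        chmod +x "$script" || return 1
    fi
    return 0
}

# Example on the board (folder name taken from the thread):
#   cd zynq7020_dnndk_v3.1
#   ensure_executable ./install.sh && ./install.sh
```

Running chmod +x on the script itself, rather than on the folder, is what clears this particular error.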

MaRcOsss1 commented 4 months ago

Okay, I did that.

[screenshot]

This seems all right, right? Should I ignore "/etc/modules: No such file or directory"?

andre1araujo commented 4 months ago

I am not sure. Ignore it for now and see whether it comes up again at a later stage. In any case, I really don't think it will be a problem.

MaRcOsss1 commented 4 months ago

[screenshot]

Umm, how do I install it?

root@pynqz2_dpu1:~# cd yolo_pynqz2
root@pynqz2_dpu1:~/yolo_pynqz2# make
make: Warning: File 'Makefile' has modification time 1200 s in the future
g++ -c -O2 -Wall -Wpointer-arith -std=c++11 -ffast-math -mcpu=cortex-a9 -mfloat-abi=hard -mfpu=neon programs/yolo_image.cpp -o objects/yolo_image.o
make: g++: Command not found
make: *** [Makefile:40: objects/yolo_image.o] Error 127

MaRcOsss1 commented 4 months ago

I have looked for every possible solution I could find. I even downloaded the plugin from the official MobaXterm website: [screenshot]

and placed it in the right folder: [screenshot]

development.mxt3 is the plugin file for GCC and G++, but it's still not working.

andre1araujo commented 4 months ago

Look, the problem seems to be that you don't have the correct SD card image. Try the one I made available, as it already has all the necessary libraries. gcc should be one of those packages, so if it is missing, the SD card image is not the correct one. Download it here: https://drive.google.com/file/d/1ETyM51KSWX_h1DVq9ptHPIux89oTrNPy/view?usp=drive_link and then flash it onto the micro SD card. Good luck!
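
"Command not found" with Error 127 means that make could not find g++ on the board's PATH. A quick way to confirm whether an SD card image ships the build toolchain is sketched below, assuming a POSIX shell; missing_tools is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print every tool from the argument list that is
# not on PATH. Returns 0 if all are present, 1 if any are missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return $rc
}

# Example on the board:
#   missing_tools make gcc g++ || echo "this image lacks the build toolchain"
```

Note that the check has to run on the board itself. A compiler plugin installed into MobaXterm on the PC only affects MobaXterm's local terminal, not the remote session.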

MaRcOsss1 commented 4 months ago

Certainly, I will do that and inform you accordingly. By the way, I have a question regarding the implementation of ML, specifically YOLO, on PYNQ (Z1/Z2). I have reviewed numerous projects and documents, but many of them use a different approach than yours. For instance, some have created an IP in Vivado HLS for YOLO, loaded it onto the board, and then used Jupyter to run the model. While I don't fully understand that process, I am curious about the rationale behind your approach. Could you please explain the reasoning and any potential advantages of your method? And also can you explain the flow used in this https://iopscience.iop.org/article/10.1088/1742-6596/2405/1/012011/pdf and this https://github.com/dhm2013724/yolov2_xilinx_fpga, if you know about it. Thank you.


andre1araujo commented 4 months ago

Thank you! I also added you at the end of the GitHub page as a thank-you for your help on this project! As for how I did this project, it was based on wutianze's work. I was assigned this project and, after months, I found his work and tried to replicate it. As someone with basically no experience, I had to contact him about some parts, look for other similar implementations, and so on. What I am basically saying is that even I don't fully understand the flow, haha. In general, it goes like this:

Sorry for the long explanation. Both of the approaches you linked use something called HLS, which stands for High-Level Synthesis. I have never worked in that area, but as I understand it, you basically program the FPGA using high-level code such as C++. So you write a C++ program for the neural network you want and then mark the parts of the program that can run in parallel, a bit like multithreading. This approach is typically not as optimized as writing your own neural network in VHDL or Verilog, but it is surely a lot easier and faster. Their process is not as different from mine as it might seem:

I think HLS is a broad area of its own and might be really complicated. I don't have that experience, so I can't tell for sure. The process is a little different from what I did, but much the same at some points, haha. The DPU solution from Xilinx seems to be the easiest for inexperienced people like me. It is generally a lot easier on boards with Zynq UltraScale+ chips, because Vitis AI and the DPU are now much more optimized and easier to use. The challenge for the PYNQ-Z2 and other Zynq-7000 family chips is that Vitis AI and the DPU no longer support them, which is why I used DNNDK, which is an older version. My master's thesis might be to create a better and more accessible solution for these boards, as Xilinx seems to have forgotten them when it comes to CNN acceleration. My approach will not be HLS but RTL, which means using hardware description languages like VHDL and Verilog. Essentially, I would do the same as what the authors of the paper did with HLS, but more optimized. But we will see.

Sorry about the long text, but I hope you now understand the different approaches a little better. There are no good (accessible) solutions yet for deploying real-time object detection on these boards.

Good luck!

MaRcOsss1 commented 4 months ago

Haha, I don't think there was any need to include my name, since you had already completed the project and I was just learning. I had many doubts along the way, and you helped me each time. I could say I was the Andre to your Wutianze. I had been working on this project for a long time; when I started, I didn't even know how to use Vivado and generate IP. Over time, I learned to use HLS and Verilog/Verilog HDL to create IPs for image resizing and other tasks.

Initially, I couldn't figure out the flow for implementing object identification on an FPGA. I knew I needed to create an IP and a model, and that I could use Jupyter to access it, but I wasn't sure how to proceed. I talked to several seniors, and most had written Verilog code or used the HLS method, writing YOLO v2 code in C++ because it was available online and widely used.

My goal was to implement any model on an FPGA, so I was exploring different flows. I encountered a method that involved converting the trained model into a file the hardware understands, essentially HDL, and then calling an API from Jupyter to run the model. I wasn't sure how this worked or if it would work at all. What I don't understand is how a trained model is deployed on the board. The IP is just for the neural network, but how the model weights get loaded onto the board is something I still haven't figured out in the HLS or RTL flows.

Then, I came across your GitHub and read the complete documentation, which gave me some clarity. It seems both flows (yours and the HLS one) aim to achieve the same purpose. The difference might be that, in your flow, you are creating an OS for the entire system, focusing on object detection. In the HLS flow, it seems like one has to write code for every task on Jupyter to run the model.

When you mention RTL, do you mean writing the neural network using RTL?

Regards Aryan Shah

andre1araujo commented 4 months ago

Don't worry, you helped me validate the tutorial's instructions, and that is really helpful. I didn't know you had some experience with HLS, but that is good to know because I might bother you about it one day, haha. Yes, I meant writing the neural network in RTL. It seems like a suicide mission, but let's try, haha. There is a lot of work in this field, and I will try to implement the most commonly used architectures and see where it goes. I haven't finished the research, but it seems that CNN accelerators all follow similar architectures. Because Xilinx never brought the DPU back to the Zynq-7000 chips, I am going to try to build an object detector with newer optimization techniques and see if I can achieve decent real-time performance on this low-cost board.

MaRcOsss1 commented 4 months ago

I have only a basic understanding of HLS, and I definitely can't write a TensorFlow architecture with it. But feel free to ask, I'll be happy to help where I can. I really don't like Verilog; it's just not my thing, so writing neural network code in RTL is out of the question for me. I'm curious about the methods used in newer boards. I have no knowledge of them yet, but I'm open to exploring. Are there newer or better optimization techniques available for the Zynq 7000 SoC or any other board, other than RTL?

Also, in the YOLO test image you posted here https://andre-araujo.gitbook.io/yolo-on-pynq-z2/deployment-on-pynq-z2/execute-yolov3, do the numbers on the boxes represent accuracy or some other metric?

Regards Aryan Shah


andre1araujo commented 4 months ago

The optimization techniques normally used to compress neural networks are quantization and pruning. DNNDK did not use pruning; they explicitly said it was not supported in that version. I suspect they now use pruning in Vitis AI. There are some frameworks for accelerating neural networks on the Zynq-7000 chips, like FINN (https://github.com/Xilinx/finn) or Brevitas (https://github.com/Xilinx/brevitas), both from Xilinx. They are easier to work with, but I suspect the results will be slightly worse. But you never know!

About the numbers above the boxes on the detections: they represent the confidence score of each detection, that is, how certain the model is about it. 0.99 means 99% confidence. I hope this helps!
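
Per-detection scores like these are typically compared against a threshold so that weak detections are discarded before the boxes are drawn. A minimal sketch, assuming a hypothetical text format with one "label score" pair per line; this is not the actual output format of the tutorial's program, and filter_detections is an illustrative name:

```shell
#!/bin/sh
# Hypothetical helper: read "label score" lines on stdin and keep only
# those whose score is at or above the threshold given as $1.
filter_detections() {
    awk -v t="$1" '$2 + 0 >= t + 0 { print $1, $2 }'
}

# Example:
#   printf 'person 0.99\ndog 0.42\n' | filter_detections 0.5
```

Raising the threshold gives fewer but more reliable boxes; lowering it keeps more candidates at the cost of more false positives.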

MaRcOsss1 commented 3 months ago

Hey Andre, deployment on the PYNQ-Z2 board has been successful. However, we have to repeat these steps https://andre-araujo.gitbook.io/yolo-on-pynq-z2/dpu-implementation/implementing-the-dpu-on-a-sd-card-image#sd-card-image every time we power off the board. For example, once I have copied the image files to the SD card and set up the USB connection using PuTTY, everything works fine, even in MobaXterm. But if, after a few runs, I power off the board, disconnect everything, and try to connect again with PuTTY or MobaXterm, it doesn't work. I have to format the SD card again, repeat the steps mentioned here https://andre-araujo.gitbook.io/yolo-on-pynq-z2/dpu-implementation/implementing-the-dpu-on-a-sd-card-image#sd-card-image, and set up the whole system again; then it works. Does that make sense?

Is that normal? Do I have to do that again and again every time I want to perform the task?

Regards Aryan Shah


andre1araujo commented 3 months ago

Hello Aryan. No, that is not normal behaviour at all. The image should work every time. I don't see why that problem is happening, but I suspect that, due to a poor description in the tutorial, you are not executing these commands each time you power up the board:

cd zynq7020_dnndk_v3.1
./install.sh

If that is the case, let me know. If you already run those commands every time you restart the board and the problem persists, let me know so we can try to figure it out. Good luck!

MaRcOsss1 commented 3 months ago

The problem is that I am not able to establish a connection after powering it up again. PuTTY glitches and I can't reconnect in MobaXterm either, so I can't even get far enough to run those commands.


andre1araujo commented 3 months ago

That is really strange. I have never seen that behaviour before. Can you confirm that the green DONE LED lights up? There is probably something wrong with the SD card image. Can you try it with a different micro SD card? If you can see which LEDs light up when the board crashes, that may help identify the cause. I will have to look it up.

MaRcOsss1 commented 3 months ago

Can you just tell me what "zynq7020_dnndk_v3.1" is and where it came from? Like how did you generate that file? I am using the file from your github itself.


andre1araujo commented 3 months ago

Hi! Sorry for the late response. The zynq7020_dnndk_v3.1 folder comes from the DNNDK package you downloaded for the second chapter of the tutorial. The package should contain folders for the supported devices, one of which is "ZedBoard". I used that folder because that board has the same chip as the PYNQ-Z2, and then renamed it to "zynq7020_dnndk_v3.1". It contains the libraries for DNNDK. The problem you have might be related to something in the SD card image build process, or to the problem you mentioned in this issue. I can try to look into it, but it will be easier for you to run experiments. Sorry about that. I will think about the problem as soon as I can and then make a suggestion. Good luck!

andre1araujo commented 1 month ago

Hi! I will be closing this issue due to inactivity. If you need further assistance make sure to open a new one or just contact me. Thank you for understanding! :)