Xilinx / Vitis-AI-Tutorials

MIT License

Problem testing Design_Tutorials/07-yolov4-tutorial's YOLOv4 #28

Open rzese opened 2 years ago

rzese commented 2 years ago

Dear all, I am trying to follow the instructions in Design_Tutorials/07-yolov4-tutorial to run the network on my board (DPUCZDX8G_ISA0_B4096_MAX_BG2). Up to step 2.4 (Model Deployment) everything seems to work: I can evaluate the network with the tf_eval_yolov4_coco_2017.py script (the results are not good, but I get no errors), and the quantization and compilation steps finish correctly. The problem appears when I try to run the network on the board. Specifically, in the program test_jpeg_yolov4, execution seems to stall when the image is passed to the network as input. I have read the code that runs the network; here is the function:

// Entrance of jpeg demo
template <typename FactoryMethod, typename ProcessResult>
int main_for_jpeg_demo(int argc, char *argv[],
                       const FactoryMethod &factory_method,
                       const ProcessResult &process_result, int start_pos = 1) {
  if (argc <= 1) {
    usage_jpeg(argv[0]);
    exit(1);
  }
  auto model = factory_method();
  for (int i = start_pos; i < argc; ++i) {
    auto image_file_name = std::string{argv[i]};
    auto image = cv::imread(image_file_name);
    if (image.empty()) {
      LOG(FATAL) << "cannot load " << image_file_name << std::endl;
      abort();
    }
    auto result = model->run(image);
    image = process_result(image, result, true);
    auto out_file =
        image_file_name.substr(0, image_file_name.size() - 4) + "_result.jpg";
    cv::imwrite(out_file, image);
    LOG_IF(INFO, ENV_PARAM(DEBUG_DEMO)) << "result image write to " << out_file;
  }
  LOG_IF(INFO, ENV_PARAM(DEBUG_DEMO)) << "BYEBYE";
  return 0;
}

When it reaches auto result = model->run(image); the program seems to enter an infinite loop. I waited more than 24 hours to see whether the call would eventually produce results, but the program never reaches the next statement (image = process_result(image, result, true);).
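One way to narrow down a stall like this is to put a watchdog around the blocking call, so the program reports a timeout instead of waiting indefinitely. The following is only a minimal sketch in standard C++17: `run_with_timeout` is a hypothetical helper, not part of the Vitis AI Library, and it assumes the wrapped call returns a movable, non-void value.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <optional>
#include <thread>
#include <type_traits>
#include <utility>

// Hypothetical helper (not a Vitis AI Library API): runs `fn` on a detached
// worker thread and waits at most `timeout` for its result.
// Returns std::nullopt if the call has not finished in time. The worker is
// NOT cancelled; it keeps running (or hanging) in the background.
template <typename Fn>
auto run_with_timeout(Fn fn, std::chrono::milliseconds timeout)
    -> std::optional<std::invoke_result_t<Fn>> {
  using Result = std::invoke_result_t<Fn>;
  assert(timeout.count() >= 0);  // a negative timeout makes no sense here

  // packaged_task ties the callable to a future we can wait on with a limit.
  std::packaged_task<Result()> task(std::move(fn));
  std::future<Result> future = task.get_future();

  // Detach so that a stuck call cannot block this function from returning.
  std::thread(std::move(task)).detach();

  if (future.wait_for(timeout) != std::future_status::ready) {
    return std::nullopt;  // still running after `timeout`
  }
  return future.get();  // finished in time: hand back the result
}
```

In the demo above, the call could be wrapped as `run_with_timeout([&] { return model->run(image); }, std::chrono::seconds(30))`, with the 30 seconds being an arbitrary choice. If it returns `std::nullopt`, the program should log the failure and exit rather than continue, because the detached worker still holds references to `model` and `image`.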

What can cause this problem? Has anyone already experienced similar problems?

Many thanks

nhphuong91 commented 2 years ago

@rzese Hi, how did you get past step 2.1? It says to run a shell script named setup_environment.sh at the very beginning, but I couldn't find it anywhere.

rzese commented 2 years ago

Hi @nhphuong91, yes, the file setup_environment.sh does not exist. I simply created a virtual environment with the same version of python included in the vitis-ai docker and installed all the packages contained in the requirements.txt file.

nhphuong91 commented 2 years ago

@rzese Thanks! The requirements.txt you mention is located in the keras-YOLOv3-model-set repo, right? Or is it somewhere else?

rzese commented 2 years ago

Not a problem! Yes, the requirements.txt is the one located in keras-YOLOv3-model-set.

QULV-wyl commented 2 years ago

Hello, I also want to run the test_jpeg_yolov4 command in 2.4 Model Deployment, but I don't know how to pass the picture to the network as input. I would appreciate it if you could tell me the command format for this project.

nhphuong91 commented 2 years ago

@QULV-wyl you can read the readme file (located in the same directory) for sample commands. The general form is ./test_... <name of model folder located in /usr/share/vitis_ai_library/> <image file>

QULV-wyl commented 2 years ago

It works! I see it! Thank you very much!

bhargavin1872008 commented 1 year ago

When installing the requirements.txt of keras-YOLOv3-model-set, I get an error for coremltools. It shows something like "couldn't find a version that satisfies the requirement tensorflow<=1.14 and tensorflow >=1.5 (from tfcoremltools -r requirements.txt) (from versions: 2.2.0, 2.2.1, 2.2.2, ... 2.7.0rc0, 2.7.0rc1 ...)". Can someone help me with this? Also, I have a question: can we use Ubuntu 20.04, CUDA 11.7, and cuDNN 8.4.0 for this project, or does it only work with Ubuntu 18.04 and CUDA 10.0? Please help me with this; I am short on time.

nhphuong91 commented 1 year ago

Please follow up in the conversation at #22.

bhargavin1872008 commented 1 year ago

Hello, did you use Docker Desktop or Docker Engine? And what did you do in place of running setup_environment.sh?

bhargavin1872008 commented 1 year ago

Also, which version of TensorFlow is suitable for the project: 1.5 or a 2.x version? I will be waiting for your reply.

nhphuong91 commented 1 year ago

@bhargavin1872008 I use Docker Engine. However, for the first part, the model training, I created a Python virtual environment and trained outside the Docker container. It is possible to do this inside the vitis-ai Docker container too, but be sure to keep its virtual environment separate from the existing one. As for the TF version, the latest TF 1.x works fine for me.

NguyenNgocAnh2610 commented 4 months ago

@nhphuong91 Can you help me with the quantization step? I did it but got an error

nhphuong91 commented 4 months ago

@NguyenNgocAnh2610 Can you please describe your error in as much detail as possible? Which Vitis version and which environment are you using? How did you set everything up? Sometimes issues appear because the repository owner updated the content, in which case you would have to switch back to an older commit.

NguyenNgocAnh2610 commented 4 months ago

@nhphuong91 I am trying to run YOLOv4 on the AXU2CGB board. I followed the instructions up to the compile step, where it reports an error saying the DPU cannot be identified. I have already corrected the path to the JSON file, but it still does not work. Could you give me your email address so we can discuss this more easily?

nhphuong91 commented 4 months ago

@NguyenNgocAnh2610 You can message me at nhp12345@gmail.com. In a public space like GitHub, though, we should communicate in English, so that people who run into the same issue later can refer to our discussion.