Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

examples/caffe/ssd-detect not working #9

Closed. matlinsas closed this issue 4 years ago.

matlinsas commented 4 years ago

Hi, I failed to import the vai library in the CPU container: xdock:5000/vitis-ai 1.0.0-cpu.

from vai.dpuv1.tools.compile.bin.xfdnn_compiler_caffe import CaffeFrontend as xfdnnCompiler
Traceback (most recent call last):
  File "", line 1, in
ModuleNotFoundError: No module named 'vai'

The following is the $PYTHONPATH information:

/workspace/alveo/overlaybins/setup.sh

Using VAI_ALVEO_ROOT

/workspace/alveo


Using LD_LIBRARY_PATH

/opt/xilinx/xrt/lib:

Using LIBXDNN_PATH

/lib/libxfdnn.so


PYTHONPATH

/workspace/alveo:/workspace/alveo/apps/yolo:/workspace/alveo/apps/yolo/nms:/workspace/alveo/xfmlp/python:/opt/xilinx/xrt/python:/workspace/alveo:/workspace/alveo/apps/yolo:/workspace/alveo/apps/yolo/nms:/workspace/alveo/xfmlp/python:/opt/vitis_ai/compiler


Verifying XILINX_XRT

XILINX_XRT : /opt/xilinx/xrt
PATH : /opt/xilinx/xrt/bin:/opt/xilinx/xrt/bin:/opt/vitis_ai/conda/bin:/opt/vitis_ai/utility:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
LD_LIBRARY_PATH : /opt/xilinx/xrt/lib:/opt/xilinx/xrt/lib:
PYTHONPATH : /opt/xilinx/xrt/python:/workspace/alveo:/workspace/alveo/apps/yolo:/workspace/alveo/apps/yolo/nms:/workspace/alveo/xfmlp/python:/opt/xilinx/xrt/python:/workspace/alveo:/workspace/alveo/apps/yolo:/workspace/alveo/apps/yolo/nms:/workspace/alveo/xfmlp/python:/opt/vitis_ai/compiler

Thanks, Bean

matlinsas commented 4 years ago

Also, Jupyter is not installed in this Docker image.

/workspace/alveo$ jupyter notebook --no-browser --ip=0.0.0.0 --NotebookApp.token='' --NotebookApp.password=''
bash: jupyter: command not found

matlinsas commented 4 years ago

After activating the caffe conda environment with the following command:

source activate vitis-ai-caffe

I can import the above library in Python.
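A quick way to confirm which interpreter is active and whether it can see the vai package is sketched below. The helper name module_available is hypothetical (it is not part of Vitis AI), and the sketch uses only the Python standard library. It reports whether a module can be located, without importing it.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the current interpreter can locate module `name`.

    Hypothetical helper for illustration; it does not import the module.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names whose parent package is missing,
        # or for modules with a broken __spec__.
        return False


if __name__ == "__main__":
    # Shows which Python is running (base conda vs. vitis-ai-caffe).
    print("interpreter:", sys.executable)
    print("vai importable:", module_available("vai"))
```

Running this before and after `source activate vitis-ai-caffe` should show the interpreter path change, which would be consistent with the import only working inside that environment.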

But when I try the caffe example in the following directory: /workspace/alveo/examples/caffe/ssd-detect

python run_ssd.py --prototxt /opt/models/caffe/inception_v2_ssd/inception_v2_ssd_train.prototxt --caffemodel /opt/models/caffe/inception_v2_ssd/inception_v2_ssd.caffemodel --prepare

It reports the following:

I1206 05:13:37.123778 694 layer_factory.hpp:123] Creating layer data
I1206 05:13:37.123811 694 net.cpp:140] Creating Layer data
I1206 05:13:37.123821 694 net.cpp:455] data -> data
I1206 05:13:37.123843 694 net.cpp:455] data -> label
I1206 05:13:37.123865 694 image_data_layer.cpp:87] Opening file /opt/ml-suite/examples/caffe/ssd-detect/calib.txt
I1206 05:13:37.123926 694 image_data_layer.cpp:102] A total of 0 images.
GenerateCode: work/compiler
Weights: quantize_results/deploy.caffemodel
PngFile: None
ConcatStrategy: None
Strategy: all
ScheduleFile: None
DDR: 256
DSP: 96
Verbose: False
FromTF: True
Memory: 9
Byte per Pixel: 1
Phase: TEST
RankDir: BT
Start compiling quantize_results/deploy.prototxt

BUILDING DATA FLOW GRAPH

Traceback (most recent call last):
  File "run_ssd.py", line 244, in
    Compile()
  File "run_ssd.py", line 51, in Compile
    compiler.compile()
  File "/opt/vitis_ai/conda/envs/vitis-ai-caffe/lib/python3.6/site-packages/vai/dpuv1/tools/compile/bin/xfdnn_compiler_caffe.py", line 188, in compile
    S = changeinplace.read_file(self.args.networkfile)
  File "/opt/vitis_ai/conda/envs/vitis-ai-caffe/lib/python3.6/site-packages/vai/dpuv1/tools/compile/optimizations/changeinplace.py", line 41, in read_file
    F = open(filename, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'quantize_results/deploy.prototxt'

It seems that the quantize step didn't generate any results in quantize_results. Any suggestions?

Thanks, Bean.
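The log above reports "A total of 0 images" for calib.txt, so one thing worth checking is whether that list file exists and points at real images. The sketch below assumes the usual Caffe ImageData list format (one "path label" entry per line, paths without spaces). The function name count_calibration_images and its image_root parameter are hypothetical, not part of run_ssd.py.

```python
from pathlib import Path


def count_calibration_images(list_file: str, image_root: str = "") -> tuple:
    """Count entries in a Caffe ImageData-style list file.

    Returns (listed, found): the number of non-empty lines, and how many
    of the referenced image paths exist on disk. A missing list file
    yields (0, 0). Hypothetical helper for illustration only.
    """
    path = Path(list_file)
    if not path.is_file():
        return (0, 0)
    listed = found = 0
    for line in path.read_text().splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        listed += 1
        # The first field is the image path; absolute paths ignore image_root.
        if (Path(image_root) / parts[0]).is_file():
            found += 1
    return (listed, found)


if __name__ == "__main__":
    # Path taken from the log above.
    print(count_calibration_images("/opt/ml-suite/examples/caffe/ssd-detect/calib.txt"))
```

A result of (0, 0) would suggest the list file is missing or empty at the path the prototxt refers to, which would fit the quantizer producing no output.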

wilderfield commented 4 years ago

Hi @matlinsas,

Regarding your most recent post: we are looking into it. There is a problem with ssd-detect; we need to make some minor fixes to the code and polish the instructions. We apologize for the issue. We should have it fixed by the end of the week.

@satyakee1 is working on this.

tarun28jain commented 4 years ago

In replace_mluser.py, in the substitute function, the file handle needs to be closed after the values are written so that the changes are actually flushed to the file. Then run "python replace_mluser.py --modelsdir models" again after setting $VAI_ALVEO_ROOT in the environment, and the prepare command will proceed. After that, it asks for the overlays directory "/workspace/alveo/overlays/xdnnv3", which is not present. It is actually located at "/opt/xilinx/overlaybins/xdnnv3" and needs to be copied to the expected location for the prepare command to succeed.
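A minimal sketch of both steps follows. It is not the actual body of replace_mluser.py: the substitute signature shown here and the copy_overlays helper are assumptions made for illustration. The point is that a with block closes the file handle, so buffered writes reach disk before the next step reads the file.

```python
import os
import shutil


def substitute(path, old, new):
    """Replace `old` with `new` in a text file; return the replacement count.

    Hypothetical stand-in for the function in replace_mluser.py.
    """
    with open(path, "r") as f:
        text = f.read()
    # The with block closes the handle on exit, flushing the write to disk.
    with open(path, "w") as f:
        f.write(text.replace(old, new))
    return text.count(old)


def copy_overlays(src, dst):
    """Copy the overlay directory to where the prepare step looks for it.

    Returns False without copying if `dst` already exists.
    """
    if os.path.isdir(dst):
        return False
    shutil.copytree(src, dst)  # also creates missing parent directories
    return True
```

With the paths from the comment above, the copy would be copy_overlays("/opt/xilinx/overlaybins/xdnnv3", "/workspace/alveo/overlays/xdnnv3").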

tarun28jain commented 4 years ago

I am facing an issue with the next command, when running inference on the entire dataset.

My card validates successfully:

(vitis-ai-caffe) root@ubuntu:/workspace/alveo/examples/caffe/ssd-detect# /opt/xilinx/xrt/bin/xbutil validate
INFO: Found 1 cards

INFO: Validating card[0]: xilinx_u200_xdma_201830_2
INFO: == Starting AUX power connector check:
INFO: == AUX power connector check PASSED
INFO: == Starting PCIE link check:
INFO: == PCIE link check PASSED
INFO: == Starting verify kernel test:
INFO: == verify kernel test PASSED
INFO: == Starting DMA test:
Buffer Size: 256 MB
Host -> PCIe -> FPGA write bandwidth = 10438.6 MB/s
Host <- PCIe <- FPGA read bandwidth = 12147.1 MB/s
INFO: == DMA test PASSED
INFO: == Starting device memory bandwidth test:
............
Maximum throughput: 52428 MB/s
INFO: == device memory bandwidth test PASSED
INFO: == Starting PCIE peer-to-peer test:
P2P BAR is not enabled. Skipping validation
INFO: == PCIE peer-to-peer test SKIPPED
INFO: == Starting memory-to-memory DMA test:
bank0 -> bank1 M2M bandwidth: 11975.5 MB/s
bank0 -> bank2 M2M bandwidth: 12082.9 MB/s
bank0 -> bank3 M2M bandwidth: 12081.7 MB/s
bank1 -> bank2 M2M bandwidth: 12082.3 MB/s
bank1 -> bank3 M2M bandwidth: 12082.3 MB/s
bank2 -> bank3 M2M bandwidth: 12081.2 MB/s
INFO: == memory-to-memory DMA test PASSED
INFO: Card[0] validated successfully.

But when running inference, I get this error:

Speaking to Butler
Response from Butler is:
errCode: errCode: 18
errCode String: XCLBIN_DOWNLOAD_ERROR
myHandle: 0
valid: 1

Butler connection failed, exiting after too many errors
python: /usr/include/boost/thread/pthread/mutex.hpp:111: boost::mutex::~mutex(): Assertion `!res' failed.
Aborted (core dumped)

Any suggestions on how to proceed or what to check?

wilderfield commented 4 years ago

This should be fixed. Please open a new issue if you are still experiencing problems.