LucaFos commented 3 years ago

Feature request

Description

Proceeding with https://discourse.ros.org/t/unified-ml-inference-in-autoware-a-proposal/14058 plan, we need an implementation for a generic inference pipeline using TVM.

Implementation considerations

The pipeline will be generic, the preprocessing and postprocessing stages will need to be implemented by the user.

Additional information

A validation test will also be provided, running the pipeline on yolo_v2_tiny.

LiyouZhou commented 3 years ago

All dendpency of Autoware need to be supplied as a rosdep package. hence TVM_Runtime needed to be added. Ideally, we want to build a single binary that can support multiple hardware configurations.

As part of this work, we need to do the following validation.

[ ] Build a TVM binary with CUDA and Vulcan backends turned on.
[ ] copy this binary into a system that lacks libcuda or libvulkan,
[ ] build a tvm example using CPU only inference. Make sure it still works.
[ ] copy this binary into a system that lacks libvulkan,
[ ] build a tvm example using CUDA inference. Make sure it still works.
[ ] copy this binary into a system that lacks libcuda,
[ ] Tweak TVM Cli to support vulkan target
[ ] build a tvm example using Vulkan inference. Make sure it still works.

@JWhitleyWork can you check to see if we are on the same page?

JWhitleyWork commented 3 years ago

@LiyouZhou Yes, this looks correct to me. I'm still working on the ROS package. Please see this TVM config file which enables CUDA, Vulkan, OpenGL, and OpenCL. This will currently only work on ROS Foxy / Ubuntu Focal because Vulkan is not available in Ubuntu Bionic. If using in Bionic, set USE_VULKAN to OFF.

ambroise-arm commented 3 years ago

@JWhitleyWork Using modelzoo to build the libtvm runtime shared object and to compile yolo_v2_tiny, I get to the following results.

build tvm binary with cuda: done
copy to a systen without cuda: done
build a tvm example using CPU only: error

$ make 
[ 50%] Linking CXX executable example_pipeline
/usr/bin/ld: warning: libcudart.so.10.1, needed by //usr/local/lib/libtvm_runtime.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libcuda.so.1, needed by //usr/local/lib/libtvm_runtime.so, not found (try using -rpath or -rpath-link)
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaEventCreate@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cuLaunchKernel'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaGetErrorString@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaMemcpyAsync@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaStreamCreate@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaStreamSynchronize@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaSetDevice@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaFreeHost@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cuModuleGetFunction'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaMemcpy@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaStreamWaitEvent@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaGetDevice@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaFree@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaEventDestroy@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cuModuleUnload'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cuMemsetD32_v2'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cuModuleLoadData'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cuGetErrorName'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaMemcpyPeerAsync@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaDeviceGetAttribute@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaMallocHost@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaStreamDestroy@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaMalloc@libcudart.so.10.1'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cuDeviceGetName'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cuModuleGetGlobal_v2'
//usr/local/lib/libtvm_runtime.so: undefined reference to `cudaEventRecord@libcudart.so.10.1'
collect2: error: ld returned 1 exit status
CMakeFiles/example_pipeline.dir/build.make:135: recipe for target 'example_pipeline' failed
make[2]: *** [example_pipeline] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/example_pipeline.dir/all' failed
make[1]: *** [CMakeFiles/example_pipeline.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

LiyouZhou commented 3 years ago

@JWhitleyWork it looks like we still need to build multiple configurations for this package. Is this supported in the ROS distribution model?

If CUDA is enabled, tvm v0.7 is preferred as it has more complete support. v0.7 has api change from v0.6 so if you don't mind, can we start with 0.7 so we don't have to go through the upgrade process once committed. See this for a v0.7 installation script.

JWhitleyWork commented 3 years ago

@LiyouZhou We can certainly do this. So, I would create 3 different branches each with a different config.cmake. I would set the value to USE_BLAS to openblas but otherwise only change the following for each configuration:

CPU only (no changes to default config.cmake other than USE_BLAS)
CUDA only (changing USE_CUDA to ON)
Vulkan Only (changing USE_VULKAN to ON)

Each of these would generate a separate package with a different name such as tvm_vendor_cpu, tvm_vendor_cuda, and tvm_vendor_vulkan. The binaries are pretty small after compilation so I don't think this would be problematic. Does this work for you? Any additional targets you require?

LiyouZhou commented 3 years ago

@JWhitleyWork I think this is very good to start with.

JWhitleyWork commented 3 years ago

@LiyouZhou @LucaFos @ambroise-arm I'm working on the vendor package at https://github.com/autowarefoundation/tvm_vendor. I have installations for melodic, dashing, and foxy working. The build needs additional testing in the form of:

Create a ROS workspace with just tvm_vendor and tvm_util (from #13) instead of a from-source install of TVM.
Build the workspace and make sure that tvm_util builds correctly against tvm_vendor.
Run a pipeline that depends on tvm_vendor instead of a from-source install of TVM.

Additional note: You will need to prefix includes with tvm_vendor. e.g. #include <tvm_vendor/tvm/runtime...>

LiyouZhou commented 3 years ago

@ambroise-arm will be doing this testing in the next couple of days.

JWhitleyWork commented 3 years ago

@ambroise-arm tvm_vendor now includes the dlpack headers. You'll need to prefix the includes with tvm_vendor also. e.g. #include <tvm_vendor/dlpack/dlpack.h>.

ambroise-arm commented 3 years ago

@JWhitleyWork Tested on melodic.

I got to a working pipeline with the following steps:

as you said, prefix what needs to be with "tvm_vendor/"
add tvm_vendor as a dependency of tvm_utility
dmlc needs to be exposed in tvm_vendor, the same way dlpack and tvm/runtime are
the includes of dlpack (and probably dmlc) in tvm/runtime don't work without adding ${tvm_vendor_INCLUDE_DIRS}/tvm_vendor to the include_directories of tvm_utility. There might be a better way to fix that problem.
the CXX_STANDARD of the tvm_utility test packages needs to be changed from 11 to 14 because of TVM
pipeline.hpp needs to be updated. TVMArray * has to be replaced by TVMArrayHandle because of an API change from TVM 0.6 to 0.7
(that one took me longer to figure out) http://wiki.ros.org/catkin/CMakeLists.txt#catkin_package.28.29 says "[catkin_package()] must be called before declaring any targets with add_library()", which is not the case in tvm_vendor and causes the build to fail at link time

JWhitleyWork commented 3 years ago

@ambroise-arm Thanks for the detailed feedback! I've done the following to tvm_vendor:

Added commands to CMakeLists.txt to build and install libdmlc.a
Added commands to expose 3rdparty/dmlc-core/include/dmlc
Set the CXX_STANDARD to C++14 for both ROS1 and ROS2
Moved the call to catkin_package() before the add_library() calls

Please re-test and let me know if you run into anything else. Once you've verified that everything works as expected, I'll circle back around to this PR and make a PR to your repo for some required formatting changes in CMakeLists.txt and package.xml.

ambroise-arm commented 3 years ago

@JWhitleyWork tvm_vendor just needs:

   install(DIRECTORY ${SOURCE_DIR}/3rdparty/dmlc-core/include/dmlc
-    DESTINATION ${CATKIN_PACKAGE_INCLUDE_DESTINATION}/dmlc
+    DESTINATION ${CATKIN_PACKAGE_INCLUDE_DESTINATION}
   )

Otherwise it works fine.

JWhitleyWork commented 3 years ago

@ambroise-arm Thanks for the fix! Applied and tested locally for installation location. Additionally, I added the subfolder include/tvm_vendor to the repo and now have everything symlinked there when not using an install space in the ROS1 workspace. Please pull and check again to make sure everything still works in your configuration. After that, please update #13 with the dependency on tvm_vendor and I'll take a look at it. I'll also go ahead and add tvm_vendor to autoware.ai.repos and dependencies.repos in this repo.

JWhitleyWork commented 3 years ago

I'll also go ahead and add tvm_vendor to autoware.ai.repos and dependencies.repos in this repo.

Scratch that. It adds almost 30 minutes to the build which is not acceptable. We'll have to wait until tvm_vendor is released on the build farm (which I'll start right after you verify that it's good-to-go).

ambroise-arm commented 3 years ago

@JWhitleyWork It works!

ambroise-arm commented 3 years ago

13 has been updated with the `tvm_vendor` dependency and the TVM version change

JWhitleyWork commented 3 years ago

Excellent! I'll go ahead and get the release process started for Melodic. We'll work on Dashing and Foxy once we get to the Autoware.Auto example.

JWhitleyWork commented 3 years ago

https://github.com/ros/rosdistro/pull/27403

Last Melodic sync was 11/5 so it might be a couple of weeks before it gets into the release repo.

JWhitleyWork commented 3 years ago

@ambroise-arm @LiyouZhou @LucaFos Some quick updates:

There was a problem regarding versions of LLVM available in different Debian distributions that is being resolved by ros/rosdistro#27460. Once this is merged, I will need to update tvm_vendor and re-release to the build farm.
Regarding

(that one took me longer to figure out) http://wiki.ros.org/catkin/CMakeLists.txt#catkin_package.28.29 says "[catkin_package()] must be called before declaring any targets with add_library()", which is not the case in tvm_vendor and causes the build to fail at link time

Moving the add_library() calls to before catkin_package() causes an immediate failure in catkin_make when tvm_utility and tvm_vendor are in the same workspace. However, having catkin_package() before the add_library() calls works just fine. I'm wondering if this is diffferent because the added libraries are IMPORTED targets.

ambroise-arm commented 3 years ago

@JWhitleyWork I think you inverted the words "catkin_package" and "add_library" in your last message.

So far I had only tested building with colcon. My understanding is that catkin_make has been deprecated in 1.10 and support removed in 1.11. Why do we want to have it compile this package?

I am pretty sure that catkin_package() needs to be before add_library(). If the build failure with catkin_make has to be fixed, it needs to be done another way. Reverting to add_library() before catkin_package() allows tvm_utility to build with catkin_make, but catkin_make run_tests fails the same way I was seeing colcon build --packages-up-to tvm_utility fail (with a link-time error because the library isn't found).

And also about catkin_make, it seems tvm_utility was designed with colcon in mind. Even when built correctly, tvm_utility doesn't pass catkin_make run_tests because the files required for testing are not where expected.

autowarefoundation / autoware_ai_common

Implement a TVM inference pipeline #14

Feature request

Description

Implementation considerations

Additional information

13 has been updated with the `tvm_vendor` dependency and the TVM version change

autowarefoundation / autoware_ai_common

Implement a TVM inference pipeline #14

Feature request

Description

Implementation considerations

Additional information

13 has been updated with the tvm_vendor dependency and the TVM version change

13 has been updated with the `tvm_vendor` dependency and the TVM version change