pytorch / tvm

TVM integration into PyTorch
tests error #155

Closed MaxBareiss closed 4 years ago

MaxBareiss commented 4 years ago

I am using Ubuntu 16.04, PyTorch 1.4.0, and the latest master of pytorch-tvm. It builds correctly, but when I run the tests, importing the library fails:

running pytest
running egg_info
writing dependency_links to torch_tvm.egg-info/dependency_links.txt
writing torch_tvm.egg-info/PKG-INFO
writing top-level names to torch_tvm.egg-info/top_level.txt
reading manifest file 'torch_tvm.egg-info/SOURCES.txt'
writing manifest file 'torch_tvm.egg-info/SOURCES.txt'
running build_ext
running cmake_build
[  8%] Built target tvm_runtime
[ 81%] Built target tvm
[ 82%] Built target tvm_topi
[ 94%] Built target nnvm_compiler
[100%] Built target _torch_tvm
copying build/lib.linux-x86_64-3.5/torch_tvm/_torch_tvm.cpython-35m-x86_64-linux-gnu.so -> torch_tvm
============================= test session starts ==============================
platform linux -- Python 3.5.2, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /home/username/git/torch-tvm, inifile: setup.cfg
collected 0 items / 3 errors

==================================== ERRORS ====================================
_________________ ERROR collecting test/test_core.py _________________
ImportError while importing test module '/home/username/git/torch-tvm/test/test_core.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
test/test_core.py:2: in <module>
    from test.util import TVMTest
test/util.py:12: in <module>
    import torch_tvm
torch_tvm/__init__.py:9: in <module>
    from ._torch_tvm import *
E   ImportError: /home/username/git/torch-tvm/torch_tvm/_torch_tvm.cpython-35m-x86_64-linux-gnu.so: undefined symbol: _ZTIN3tvm4NodeE
_________________ ERROR collecting test/test_models.py _________________
ImportError while importing test module '/home/username/git/torch-tvm/test/test_models.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
test/test_models.py:5: in <module>
    from skimage import io, transform
/usr/local/lib/python3.5/dist-packages/skimage/__init__.py:158: in <module>
    from .util.dtype import *
/usr/local/lib/python3.5/dist-packages/skimage/util/__init__.py:7: in <module>
    from .arraycrop import crop
/usr/local/lib/python3.5/dist-packages/skimage/util/arraycrop.py:8: in <module>
    from numpy.lib.arraypad import _validate_lengths
E   ImportError: cannot import name '_validate_lengths'
_________________ ERROR collecting test/test_operators.py _________________
ImportError while importing test module '/home/username/git/torch-tvm/test/test_operators.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
test/test_operators.py:2: in <module>
    from test.util import TVMTest
test/util.py:12: in <module>
    import torch_tvm
torch_tvm/__init__.py:9: in <module>
    from ._torch_tvm import *
E   ImportError: /home/username/git/torch-tvm/torch_tvm/_torch_tvm.cpython-35m-x86_64-linux-gnu.so: undefined symbol: _ZTIN3tvm4NodeE
!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!
=========================== 3 error in 0.39 seconds ============================

This is a different undefined symbol than other issues (#149, #93).
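For context, the mangled name can be decoded by hand: under the Itanium C++ ABI, `_ZTI` marks a typeinfo object and `N...E` wraps a nested name, so `_ZTIN3tvm4NodeE` is the typeinfo for `tvm::Node`. The following is only a minimal sketch that handles this one simple shape (a special-name prefix plus length-prefixed identifiers); the helper name `demangle_simple` is made up for illustration, and a real tool such as `c++filt` should be preferred for anything more complex.

```python
import re

# Prefixes for a few Itanium C++ ABI "special names" (not an exhaustive list).
SPECIAL_PREFIXES = {
    "_ZTI": "typeinfo for ",
    "_ZTS": "typeinfo name for ",
    "_ZTV": "vtable for ",
}


def demangle_simple(symbol):
    """Decode names shaped like <prefix>N<len><id>...E; return None otherwise.

    Hypothetical helper: templates, substitutions, and function
    signatures are deliberately not handled.
    """
    for prefix, label in SPECIAL_PREFIXES.items():
        if not (symbol.startswith(prefix + "N") and symbol.endswith("E")):
            continue
        body = symbol[len(prefix) + 1:-1]  # strip prefix, 'N', and trailing 'E'
        parts = []
        pos = 0
        while pos < len(body):
            match = re.match(r"\d+", body[pos:])
            if match is None:
                return None  # not a plain length-prefixed identifier
            length = int(match.group())
            start = pos + match.end()
            if length == 0 or start + length > len(body):
                return None  # malformed length
            parts.append(body[start:start + length])
            pos = start + length
        return label + "::".join(parts)
    return None


print(demangle_simple("_ZTIN3tvm4NodeE"))  # typeinfo for tvm::Node
```

A missing typeinfo symbol like this usually means the extension was compiled against headers that declare `tvm::Node`, but the `libtvm.so` loaded at import time does not provide it.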

jjohnson-arm commented 4 years ago

That symbol is defined in libtvm.so (checked with readelf --syms). So either the loader can't find that library, or the extension hasn't been linked against it properly, which is what happened in #149 for libc10.so and other libraries. Does the target_link_libraries line in CMakeLists.txt include tvm? (See my suggested patch in #149.)

MaxBareiss commented 4 years ago

My pytorch did not have libtorch_cpu.so or libtorch_cuda.so (not sure why), so below is the patch I used:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index a6108e6..b656518 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -32,7 +32,7 @@ add_subdirectory(${TVM_DIR})

 pybind11_add_module(_torch_tvm SHARED ${TORCH_TVM_SRCS})
 target_link_libraries(_torch_tvm PUBLIC
-  torch pybind11 tvm tvm_topi)
+  torch c10 torch_python pybind11 tvm tvm_topi)

 target_include_directories(_torch_tvm PUBLIC
     ${CMAKE_CURRENT_SOURCE_DIR}

This fix was already applied when I first reported the issue (I forgot to mention it), and it did not fix the problem. Here is the output of the relevant commands:

$ readelf -Ws build/tvm/libtvm.so | grep _ZTIN3tvm4NodeE
  7483: 00000000045e81d8    40 OBJECT  WEAK   DEFAULT   21 _ZTIN3tvm4NodeE
 49707: 00000000045e81d8    40 OBJECT  WEAK   DEFAULT   21 _ZTIN3tvm4NodeE

$ readelf -Ws build/_torch_tvm.cpython-35m-x86_64-linux-gnu.so | grep _ZTIN3tvm4NodeE
   315: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  UND _ZTIN3tvm4NodeE
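
The comparison above (symbol defined in one library, `UND` in the other) can be scripted. This is a rough sketch, assuming the usual eight-column `readelf -Ws` layout (Num, Value, Size, Type, Bind, Vis, Ndx, Name); the function name `symbol_status` is hypothetical, and the sample rows are the ones pasted in this comment.

```python
def symbol_status(readelf_output, symbol):
    """Classify `symbol` as 'defined', 'undefined', or 'absent'.

    `readelf_output` is the text printed by `readelf -Ws <library>`.
    """
    status = "absent"
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows have at least 8 columns; the name is the 8th.
        if len(fields) < 8:
            continue
        name = fields[7].split("@")[0]  # drop any @VERSION suffix
        if name != symbol:
            continue
        if fields[6] != "UND":  # Ndx column: a section index means defined
            return "defined"
        status = "undefined"
    return status


libtvm_rows = """\
  7483: 00000000045e81d8    40 OBJECT  WEAK   DEFAULT   21 _ZTIN3tvm4NodeE
 49707: 00000000045e81d8    40 OBJECT  WEAK   DEFAULT   21 _ZTIN3tvm4NodeE
"""
torch_tvm_rows = """\
   315: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  UND _ZTIN3tvm4NodeE
"""

print(symbol_status(libtvm_rows, "_ZTIN3tvm4NodeE"))     # defined
print(symbol_status(torch_tvm_rows, "_ZTIN3tvm4NodeE"))  # undefined
```

Note that this only shows the freshly built `build/tvm/libtvm.so` defines the symbol; it says nothing about which `libtvm.so` the dynamic loader actually picks up when the tests run.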

jjohnson-arm commented 4 years ago

To double-check the linking is working you can try:

$ objdump -p _torch_tvm.cpython-36m-x86_64-linux-gnu.so | grep NEEDED
  NEEDED               libc10.so
  NEEDED               libtorch_cpu.so
  NEEDED               libtorch_python.so
  NEEDED               libtvm.so
  NEEDED               libLLVM-6.0.so.1
  NEEDED               libstdc++.so.6
  NEEDED               libm.so.6
  NEEDED               libgcc_s.so.1
  NEEDED               libc.so.6
  NEEDED               ld-linux-x86-64.so.2

You should see a similar list that includes libtvm.so. If you do, then for some reason the tests can't find the library at run time. I build in a Python 3.6 virtual environment (I also had some success with an Anaconda environment, but haven't tried that recently). Running strace python setup.py test, the log shows that libtvm.so was found in the Python environment at py3.6env-tvm/lib/python3.6/site-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/libtvm.so. So maybe the python setup.py install step hasn't worked?
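
The NEEDED check can also be automated. Below is a small sketch, assuming the `objdump -p` output format shown above (one `NEEDED <library>` pair per line); `needed_libraries` is a made-up helper name, and the sample text is a shortened copy of the listing in this comment.

```python
def needed_libraries(objdump_output):
    """Return the shared libraries listed as NEEDED in `objdump -p` output."""
    needed = []
    for line in objdump_output.splitlines():
        fields = line.split()
        # Dynamic-section rows look like: NEEDED  libtvm.so
        if len(fields) == 2 and fields[0] == "NEEDED":
            needed.append(fields[1])
    return needed


sample = """\
  NEEDED               libc10.so
  NEEDED               libtorch_cpu.so
  NEEDED               libtorch_python.so
  NEEDED               libtvm.so
  NEEDED               libstdc++.so.6
"""

libs = needed_libraries(sample)
print("libtvm.so" in libs)  # True
```

If `libtvm.so` is in the list, the link step worked, and the remaining question is which copy of the library gets resolved at import time.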

MaxBareiss commented 4 years ago

This issue occurred because my installed tvm was v0.6.0 from upstream master, whereas this repo requires the bundled facebookexperimental version of tvm to be installed. Uninstalling the upstream tvm and following the tvm install instructions for the bundled tvm solved the issue.
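
For anyone hitting the same mismatch, one quick check is to see which `tvm` package Python would import. This is only a sketch using the standard library's `importlib.util.find_spec`; the helper name `module_origin` is hypothetical, and the check covers the Python package location only, not which `libtvm.so` the dynamic loader resolves.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # module is not installed in this environment
    return spec.origin


# If this points at an upstream tvm egg rather than the copy built from the
# tvm directory bundled with this repo, the two versions are likely mixed up.
print(module_origin("tvm"))
```

If the printed path belongs to an upstream install, uninstalling it and installing the bundled tvm, as described above, should make the path change accordingly.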