Open Mousius opened 2 years ago
When enabling PyTorch and ONNX, I spotted a few more instances of these libgomp-related issues, so I'm adding new tests to the list of skipped tests on AArch64 for further investigation. In the meantime, this guarantees that the other tests don't regress.
The error message looks like this:
xgboost.core.XGBoostError: XGBoost Library (libxgboost.so) could not be loaded.
Likely causes:
* OpenMP runtime is not installed (vcomp140.dll or libgomp-1.dll for Windows, libomp.dylib for Mac OSX, libgomp.so for Linux and other UNIX-like OSes). Mac OSX users: Run `brew install libomp` to install OpenMP runtime.
* You are running 32-bit Python on a 64-bit OS
Error message(s): ['/usr/local/lib/python3.7/dist-packages/xgboost/lib/../../xgboost.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block']
Or another version:
    def test_guess_frontend_pytorch():
        # some CI environments wont offer pytorch, so skip in case it is not present
>       pytest.importorskip("torch")

tests/python/driver/tvmc/test_frontends.py:79:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.7/dist-packages/torch/__init__.py:198: in <module>
    _load_global_deps()
/usr/local/lib/python3.7/dist-packages/torch/__init__.py:151: in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <CDLL '/usr/local/lib/python3.7/dist-packages/torch/lib/libtorch_global_deps.so', handle 0 at 0xffff1c3e0bd0>
name = '/usr/local/lib/python3.7/dist-packages/torch/lib/libtorch_global_deps.so'
mode = 256, handle = None, use_errno = False, use_last_error = False

    def __init__(self, name, mode=DEFAULT_MODE, handle=None,
                 use_errno=False,
                 use_last_error=False):
        self._name = name
        flags = self._func_flags_
        if use_errno:
            flags |= _FUNCFLAG_USE_ERRNO
        if use_last_error:
            flags |= _FUNCFLAG_USE_LASTERROR
        if _sys.platform.startswith("aix"):
            """When the name contains ".a(" and ends with ")",
               e.g., "libFOO.a(libFOO.so)" - this is taken to be an
               archive(member) syntax for dlopen(), and the mode is adjusted.
               Otherwise, name is presented to dlopen() as a file argument.
            """
            if name and name.endswith(")") and ".a(" in name:
                mode |= ( _os.RTLD_MEMBER | _os.RTLD_NOW )

        class _FuncPtr(_CFuncPtr):
            _flags_ = flags
            _restype_ = self._func_restype_
        self._FuncPtr = _FuncPtr

        if handle is None:
>           self._handle = _dlopen(self._name, mode)
E           OSError: /usr/local/lib/python3.7/dist-packages/torch/lib/libgomp-d22c30c5.so.1: cannot allocate memory in static TLS block

/usr/lib/python3.7/ctypes/__init__.py:364: OSError
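Both failures end with the same loader message, so a small probe can tell this libgomp problem apart from an ordinary missing library. The following is only a sketch: `STATIC_TLS_MARKER`, `is_static_tls_error` and `probe_library` are hypothetical names made up for illustration, and the only real API used is `ctypes.CDLL`, the same call that fails in the traceback above.

```python
import ctypes

# Substring taken from the error messages quoted in this issue.
STATIC_TLS_MARKER = "cannot allocate memory in static TLS block"


def is_static_tls_error(exc):
    """Return True if exc looks like the static TLS failure reported here."""
    return isinstance(exc, OSError) and STATIC_TLS_MARKER in str(exc)


def probe_library(path):
    """Try to load a shared library and classify the outcome.

    Returns "ok", "static_tls" or "load_error" (hypothetical labels).
    """
    try:
        ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
    except OSError as exc:
        return "static_tls" if is_static_tls_error(exc) else "load_error"
    return "ok"
```

Running `probe_library` on the `libtorch_global_deps.so` path from the traceback in an affected container would presumably return `"static_tls"`, but that is an assumption and has not been verified here.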
While investigating these, I realised that no environment with torch installed is running the integration tests (see https://github.com/apache/tvm/issues/12529), which is also a cause for concern and should be fixed.
Just submitted https://github.com/apache/tvm/pull/12554 with the new tests that need skipping, now that I'm testing the environments with Torch installed.
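For readers unfamiliar with architecture-specific skips, the idea can be sketched as below. This is not the mechanism used in the PR above; `is_aarch64` and `AARCH64_SKIP_REASON` are hypothetical names, and the pytest usage appears only in a comment so the snippet runs without pytest installed.

```python
import platform

# Hypothetical reason string; a real one would point at this tracking issue.
AARCH64_SKIP_REASON = "libgomp static TLS failure on AArch64, pending investigation"


def is_aarch64(machine=None):
    """Return True when the given (or current) machine name is 64-bit Arm."""
    if machine is None:
        machine = platform.machine()
    return machine.lower() in ("aarch64", "arm64")


# With pytest available, an affected test could then be marked like this:
#
#   @pytest.mark.skipif(is_aarch64(), reason=AARCH64_SKIP_REASON)
#   def test_guess_frontend_pytorch():
#       ...
```

Keeping the check in one helper means the skips can be removed in a single place once the underlying libgomp issue is resolved.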
This issue tracks progress on enabling tests on AArch64.
As part of enabling more tests in the AArch64 container, a number of tests had to be skipped and need to be fixed.
See also: https://github.com/apache/tvm/pull/10677 / https://github.com/apache/tvm/pull/10564
Potential Schedule Issues
xgboost issues
Unsure