Open zou3519 opened 2 years ago
Looks like a linker compatibility problem (i.e. one C++ runtime does not know how to talk to another one, or how to parse unwind instructions).
Though it works for me on Ubuntu-18.04, by running the following commands:
$ conda create -n py38-torch112-cpu python=3.8
$ conda activate py38-torch112-cpu
$ python3 -mpip install --pre torch==1.12 -f https://download.pytorch.org/whl/test/cpu/torch_test.html
$ pip install functorch-0.2.0-cp38-cp38-linux_x86_64.whl
$ python
Python 3.8.13 (default, Mar 28 2022, 11:38:47)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> from functorch import vmap
>>> x=torch.rand(2, 3, 5)
<stdin>:1: UserWarning: Failed to initialize NumPy: numpy.core.multiarray failed to import (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:68.)
>>> vmap(lambda x: x, out_dims=3)(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/fsx/users/nshulga/conda/envs/py38-torch112-cpu/lib/python3.8/site-packages/functorch/_src/vmap.py", line 366, in wrapped
return _unwrap_batched(batched_outputs, out_dims, vmap_level, batch_size, func)
File "/fsx/users/nshulga/conda/envs/py38-torch112-cpu/lib/python3.8/site-packages/functorch/_src/vmap.py", line 165, in _unwrap_batched
flat_outputs = [
File "/fsx/users/nshulga/conda/envs/py38-torch112-cpu/lib/python3.8/site-packages/functorch/_src/vmap.py", line 166, in <listcomp>
_remove_batch_dim(batched_output, vmap_level, batch_size, out_dim)
IndexError: Dimension out of range (expected to be in range of [-3, 2], but got 3)
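For reference, the range in that message follows from the output having 3 dimensions: valid values are -3 through 2, so out_dims=3 is rejected. Here is a minimal pure-Python sketch of that bounds check. The helper name wrap_dim is hypothetical and this is not PyTorch's actual implementation; it assumes ndim >= 1 and only mirrors the message format shown in the traceback above.

```python
def wrap_dim(dim: int, ndim: int) -> int:
    """Normalize a possibly negative dimension index for an ndim-dimensional tensor.

    Hypothetical helper for illustration only; assumes ndim >= 1.
    """
    lo, hi = -ndim, ndim - 1
    if not lo <= dim <= hi:
        raise IndexError(
            f"Dimension out of range (expected to be in range of [{lo}, {hi}], but got {dim})"
        )
    # Negative indices count from the end, as with Python sequences.
    return dim + ndim if dim < 0 else dim


if __name__ == "__main__":
    print(wrap_dim(-1, 3))  # last dimension of a 3-d tensor
    try:
        wrap_dim(3, 3)  # same situation as out_dims=3 on a 3-d output
    except IndexError as err:
        print(err)
```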
There's also a related problem (which was discussed offline): the PyTorch cu102 binaries don't include the _ZNSt19basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev symbol, but the PyTorch cpu/cu113/cu116 binaries do. On some systems, libstdc++.so.6 doesn't actually include this symbol, which leads to a "symbol missing" error on import functorch.
Just to clarify:
$ c++filt _ZNSt19basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev
std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >::basic_ostringstream()
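One way to check whether a given machine's libstdc++ exports this symbol is to look it up through ctypes. This is a rough sketch, not a definitive diagnostic: the helper name has_symbol is hypothetical, it inspects whichever libstdc++ the default search finds (which may differ from the copy a conda environment loads), and the underlying dlsym lookup also searches the library's dependencies.

```python
import ctypes
import ctypes.util
from typing import Optional

# Mangled name of std::basic_ostringstream<char>'s default constructor.
SYMBOL = "_ZNSt19basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev"


def has_symbol(lib_name: str, symbol: str) -> Optional[bool]:
    """Return True/False if the library loads, or None if it cannot be found or loaded."""
    path = ctypes.util.find_library(lib_name)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    try:
        lib[symbol]  # symbol lookup; raises AttributeError when missing
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    print(has_symbol("stdc++", SYMBOL))
```

A result of False on the affected machine and True on a CI machine would be consistent with the import error described above.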
This only happens on one of my machines. It does not happen in our CI machines. Could just be a me-problem.
Repro:
Produces:
I would expect the error message to look like the following: