Describe the bug:
I am trying to build torchdistx from source following the instructions in the README. I am running:
pip install --upgrade -r requirements.txt -r use-cpu.txt
cmake -DTORCHDIST_INSTALL_STANDALONE=ON -B build
cmake --build build # <- This errors out
When running cmake --build build, I see the following error:
[ 12%] Building CXX object src/cc/torchdistx/CMakeFiles/torchdistx.dir/deferred_init.cc.o
[ 25%] Building CXX object src/cc/torchdistx/CMakeFiles/torchdistx.dir/fake.cc.o
[ 37%] Building CXX object src/cc/torchdistx/CMakeFiles/torchdistx.dir/stack_utils.cc.o
[ 50%] Linking CXX shared library libtorchdistx.so
[ 50%] Built target torchdistx
[ 62%] Building CXX object src/python/torchdistx/_C/CMakeFiles/torchdistx-py.dir/deferred_init.cc.o
/home/ubuntu/repos/torchdistx/src/python/torchdistx/_C/deferred_init.cc:24:14: error: ‘torch::TypeError’ has not been declared
using torch::TypeError;
^~~~~~~~~
/home/ubuntu/repos/torchdistx/src/python/torchdistx/_C/deferred_init.cc: In function ‘pybind11::object torchdistx::python::{anonymous}::materializeVariable(const pybind11::object&)’:
/home/ubuntu/repos/torchdistx/src/python/torchdistx/_C/deferred_init.cc:64:11: error: ‘TypeError’ was not declared in this scope
throw TypeError{"`var` has to be a `Variable`, but got `%s`.", Py_TYPE(naked_var)->tp_name};
^~~~~~~~~
/home/ubuntu/repos/torchdistx/src/python/torchdistx/_C/deferred_init.cc:64:11: note: suggested alternatives:
In file included from /opt/conda/envs/alpa/lib/python3.9/site-packages/torch/include/c10/core/Device.h:5:0,
from /opt/conda/envs/alpa/lib/python3.9/site-packages/torch/include/ATen/core/TensorBody.h:11,
from /opt/conda/envs/alpa/lib/python3.9/site-packages/torch/include/ATen/core/Tensor.h:3,
from /opt/conda/envs/alpa/lib/python3.9/site-packages/torch/include/ATen/Tensor.h:3,
from /home/ubuntu/repos/torchdistx/src/python/torchdistx/_C/deferred_init.cc:9:
/opt/conda/envs/alpa/lib/python3.9/site-packages/torch/include/c10/util/Exception.h:246:15: note: ‘c10::TypeError’
class C10_API TypeError : public Error {
^~~~~~~~~
/opt/conda/envs/alpa/lib/python3.9/site-packages/torch/include/c10/util/Exception.h:246:15: note: ‘c10::TypeError’
/opt/conda/envs/alpa/lib/python3.9/site-packages/torch/include/c10/util/Exception.h:246:15: note: ‘c10::TypeError’
make[2]: *** [src/python/torchdistx/_C/CMakeFiles/torchdistx-py.dir/build.make:76: src/python/torchdistx/_C/CMakeFiles/torchdistx-py.dir/deferred_init.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:914: src/python/torchdistx/_C/CMakeFiles/torchdistx-py.dir/all] Error 2
make: *** [Makefile:146: all] Error 2
Please let me know whether I am doing something wrong here, or whether torchdistx is not meant to support newer versions of PyTorch. If it is the latter, is there another way to use the deferred_init or fake_tensor APIs in PyTorch?
Describe how to reproduce:
pip install --upgrade -r requirements.txt -r use-cpu.txt
cmake -DTORCHDIST_INSTALL_STANDALONE=ON -B build
cmake --build build # <- This errors out
Environment:
OS: Ubuntu 20.04
torchdistx: main branch
Additional context:
The build works with PyTorch 1.12 and 1.13 but not with PyTorch 2.0. I am trying to get Alpa, which uses torchdistx, to work with PyTorch 2.0. Right now, Alpa works with PyTorch 1.12 and 1.13 (the latter with a minor change) but not with 2.0.
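For anyone hitting the same error, a small pre-build check can flag an untested PyTorch version before cmake runs. The sketch below is only an illustration. `KNOWN_GOOD_SERIES`, `major_minor`, and `is_known_good` are hypothetical names that are not part of torchdistx, and the version list reflects only what is reported in this issue, not an official support matrix.

```python
import re
from typing import Tuple

# Assumption: the only series reported to build in this issue (1.12 and 1.13).
# This is not an official torchdistx compatibility list.
KNOWN_GOOD_SERIES = {(1, 12), (1, 13)}


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.13.1+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def is_known_good(version: str) -> bool:
    """Return True if the version belongs to a series reported to build."""
    return major_minor(version) in KNOWN_GOOD_SERIES
```

With PyTorch installed, calling `is_known_good(torch.__version__)` before running cmake would return False for a 2.0 install, which matches the failure described above.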