muni-corn closed this 8 months ago
sorry; this may be unrelated to nix. i was able to get a backtrace and it seems related to AMD and HIP:
#0 0x00007fffa5417085 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#1 0x00007fffa541ae47 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#2 0x00007fffa5427ce9 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#3 0x00007fffa53a6009 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#4 0x00007fffa53a61a0 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#5 0x00007fffa528c6fe in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#6 0x00007fffa5313941 in hipMemcpyWithStream () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#7 0x00007fffa6f632ba in c10::hip::memcpy_and_sync(void*, void*, long, hipMemcpyKind, ihipStream_t*) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_hip.so
#8 0x00007fffa6f4f749 in at::native::copy_kernel_cuda(at::TensorIterator&, bool) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_hip.so
#9 0x00007fffcddcd6ea in at::native::copy_impl(at::Tensor&, at::Tensor const&, bool) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#10 0x00007fffcddcea61 in at::native::copy_(at::Tensor&, at::Tensor const&, bool) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#11 0x00007fffce8f7896 in at::_ops::copy_::call(at::Tensor&, at::Tensor const&, bool) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#12 0x00007fffce0b5919 in at::native::_to_copy(at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) ()
from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#13 0x00007fffcec0497a in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>), &at::(anonymous namespace)::(anonymous namespace)::wrapper___to_copy>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat> > >, at::Tensor (at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#14 0x00007fffce50572d in at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
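the frames above all sit under a tensor copy (`_to_copy` → `copy_` → `copy_kernel_cuda` → `hipMemcpyWithStream`), so a much smaller script might hit the same crash without invokeai in the picture. a minimal sketch, assuming the crash is in the copy itself and not in something invokeai sets up first; `try_gpu_copy` is a made-up helper name, not part of torch or invokeai:

```python
import importlib.util


def try_gpu_copy() -> str:
    """Attempt one CPU -> GPU -> CPU tensor copy and report what happened."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    # ROCm builds of PyTorch expose HIP devices through the torch.cuda API.
    if not torch.cuda.is_available():
        return "no GPU visible to torch"

    x = torch.ones(4)   # small tensor on the CPU
    y = x.to("cuda")    # copy to the GPU
    z = y.cpu()         # copy back to the CPU (waits for the GPU to finish)
    return "copy ok" if torch.equal(x, z) else "copy returned wrong data"


if __name__ == "__main__":
    print(try_gpu_copy())
```

if this dies with the same address boundary error before printing anything, the problem is likely reproducible with torch alone; if it prints a result, the crash probably depends on something else in the invokeai run.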
Yeah, the AMD stuff is very buggy, and there's not much we can do about that on our end: we only build that code; we don't write it or patch it ourselves. If there are any patches we can apply, let me know, and we can do that.
hi! i'm trying to get invokeai running on my setup but i'm running into an address boundary error.
i have an AMD RX 7600 GPU (gfx1102). let me know what other information i can provide to help!