gehring closed this issue 3 years ago
I ended up poking around a bit more, and I'm reasonably certain this is a bug and not just an uninformative error. The issue seems to depend on chunk_length: it seems like only chunk_length = 3 leads to a crash. Additionally, changing the order of the data also affects this issue. Placing the larger array first does not lead to a crash, while any other ordering does. Hopefully this info is somewhat helpful!
Agreed; this looks like a real bug.
I ran a backtrace on this failure:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7c3e537 in __GI_abort () at abort.c:79
#2 0x00007fffc04a0304 in tensorflow::internal::LogMessageFatal::~LogMessageFatal() () from /home/ebrevdo/.local/lib/python3.8/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#3 0x00007fffb121353a in tensorflow::tensor::Concat(absl::lts_2020_02_25::Span<tensorflow::Tensor const> const&, tensorflow::Tensor*) ()
from /home/ebrevdo/.local/lib/python3.8/site-packages/tensorflow/python/../libtensorflow_framework.so.2
#4 0x00007fff8d5703e1 in deepmind::reverb::Writer::Finish(bool) () from /home/ebrevdo/.local/lib/python3.8/site-packages/reverb/libpybind.so
#5 0x00007fff8d56fc13 in deepmind::reverb::Writer::Append(std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >) () from /home/ebrevdo/.local/lib/python3.8/site-packages/reverb/libpybind.so
#6 0x00007fff8d528c7f in pybind11::cpp_function::cpp_function<absl::lts_2020_02_25::Status, deepmind::reverb::Writer, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::call_guard<pybind11::gil_scoped_release> >(absl::lts_2020_02_25::Status (deepmind::reverb::Writer::*)(std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::call_guard<pybind11::gil_scoped_release> const&)::{lambda(deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >)#1}::operator()(deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >) const () from /home/ebrevdo/.local/lib/python3.8/site-packages/reverb/libpybind.so
#7 0x00007fff8d528b40 in absl::lts_2020_02_25::Status pybind11::detail::argument_loader<deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> > >::call_impl<absl::lts_2020_02_25::Status, pybind11::cpp_function::cpp_function<absl::lts_2020_02_25::Status, deepmind::reverb::Writer, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::call_guard<pybind11::gil_scoped_release> >(absl::lts_2020_02_25::Status (deepmind::reverb::Writer::*)(std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::call_guard<pybind11::gil_scoped_release> const&)::{lambda(deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >)#1}&, 0ul, 1ul, pybind11::gil_scoped_release>(pybind11::cpp_function::cpp_function<absl::lts_2020_02_25::Status, deepmind::reverb::Writer, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::call_guard<pybind11::gil_scoped_release> >(absl::lts_2020_02_25::Status (deepmind::reverb::Writer::*)(std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::call_guard<pybind11::gil_scoped_release> const&)::{lambda(deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >)#1}&, std::integer_sequence<unsigned long, 0ul, 1ul>, pybind11::gil_scoped_release&&) ()
from /home/ebrevdo/.local/lib/python3.8/site-packages/reverb/libpybind.so
#8 0x00007fff8d525af7 in pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<absl::lts_2020_02_25::Status, deepmind::reverb::Writer, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::call_guard<pybind11::gil_scoped_release> >(absl::lts_2020_02_25::Status (deepmind::reverb::Writer::*)(std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::call_guard<pybind11::gil_scoped_release> const&)::{lambda(deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >)#1}, absl::lts_2020_02_25::Status, deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::call_guard<pybind11::gil_scoped_release> >(pybind11::cpp_function::initialize<absl::lts_2020_02_25::Status, deepmind::reverb::Writer, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::call_guard<pybind11::gil_scoped_release> >(absl::lts_2020_02_25::Status (deepmind::reverb::Writer::*)(std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::call_guard<pybind11::gil_scoped_release> const&)::{lambda(deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >)#1}&&, absl::lts_2020_02_25::Status (*)(deepmind::reverb::Writer*, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::call_guard<pybind11::gil_scoped_release> const&)::{lambda(pybind11::detail::function_call&)#1}::operator()(pybind11::detail::function_call&) const () from 
/home/ebrevdo/.local/lib/python3.8/site-packages/reverb/libpybind.so
#9 0x00007fff8d517408 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) () from /home/ebrevdo/.local/lib/python3.8/site-packages/reverb/libpybind.so
This comes from Writer::Finish calling tf::tensor::Concat without proper validation.
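To make the missing check concrete, here is a minimal sketch in Python (not Reverb's actual code) of the kind of validation that could run before a first-axis concat. It assumes the usual precondition for such a concat: all tensors share a dtype and their shapes agree except in dimension 0. The function name check_concat_compatible and its arguments are hypothetical.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def check_concat_compatible(shapes: Sequence[Shape], dtypes: Sequence[str]) -> None:
    """Raise ValueError if the described tensors cannot be concatenated on axis 0.

    Hypothetical helper: assumes the precondition is "same dtype, same rank,
    and identical shapes except in dimension 0".
    """
    if not shapes:
        raise ValueError("cannot concatenate an empty list of tensors")
    if len(shapes) != len(dtypes):
        raise ValueError("shapes and dtypes must have the same length")

    ref_shape, ref_dtype = shapes[0], dtypes[0]
    if len(ref_shape) == 0:
        raise ValueError("scalars have no first dimension to concatenate along")

    for i, (shape, dtype) in enumerate(zip(shapes, dtypes)):
        if dtype != ref_dtype:
            raise ValueError(
                f"tensor {i} has dtype {dtype}, expected {ref_dtype}"
            )
        # Compare rank and every dimension after the first.
        if len(shape) != len(ref_shape) or shape[1:] != ref_shape[1:]:
            raise ValueError(
                f"tensor {i} has shape {shape}, which is incompatible with {ref_shape}"
            )
```

With a check like this, a mismatch would surface as an ordinary error that can be turned into a non-OK status, rather than a fatal log message that aborts the process.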
Reverb crashes with signal 6 (SIGABRT) when writing arrays with varying sizes. Here is a short notebook for reproducing this issue. I've reproduced it on a standard Colab runtime and on my local system, on both version 0.1.0 and nightly. Here is the core code and the resulting error I get when running on my local system (this kills the Colab runtime, so it provides limited feedback there).
I'm not sure if this use case is meant to be supported by Reverb, but, if not, some more informative errors might be warranted. Is there any documentation that covers Reverb's assumptions with regard to the shape and dtype of data being written to a given table (if there are any)?
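In the meantime, a client-side guard could turn the abort into a regular Python exception. The sketch below is only an illustration under stated assumptions: ShapeCheckedWriter is a hypothetical wrapper, each step is assumed to be a flat list of array-like objects exposing .shape and .dtype (NumPy arrays do), and the wrapped object only needs an append(step) method.

```python
class ShapeCheckedWriter:
    """Hypothetical wrapper that validates each step before forwarding it.

    The first appended step fixes the expected (shape, dtype) signature;
    any later step that differs raises ValueError instead of reaching the
    underlying writer.
    """

    def __init__(self, writer):
        self._writer = writer
        self._signature = None  # set from the first appended step

    @staticmethod
    def _describe(step):
        # Assumes a flat list of objects with .shape and .dtype attributes.
        return [(tuple(x.shape), str(x.dtype)) for x in step]

    def append(self, step):
        signature = self._describe(step)
        if self._signature is None:
            self._signature = signature
        elif signature != self._signature:
            raise ValueError(
                f"step signature {signature} does not match "
                f"the first step's signature {self._signature}"
            )
        return self._writer.append(step)
```

This is stricter than necessary if varying sizes are meant to be supported, so it is a stopgap for getting a readable error rather than a fix.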