Closed jwbonner closed 1 year ago
I'm not able to duplicate your error using that example code. Do you have a more complete example?
This is the specific place we're trying to publish an array, though all of the other test programs we've tried have exhibited the same issue:
I'm still not able to reproduce even when tweaking your example. Does it happen immediately or is there something that needs to trigger it? Or maybe it only happens on raspbian... I'm testing on Linux.
No, it happens whenever we call set. All of our testing has been with the server connected. Also a correction, this is running on an Orange Pi with Ubuntu, so the Linux arm64 build.
Yeah, it doesn't happen for me when I call set, so there must be something else to this.
Could it be the wrong Python version? I can double check what we were running (I don't have access to the device right now), but I think it was 3.10. The same issue happened when running on a Le Potato with Armbian (I can check versions on that too if it's useful).
Are there any other logs or tests that would be useful to debug this? For example, I could do a verbose pip install and see if it was doing anything unusual (rebuilding things it shouldn't or something like that).
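Since the Python version and platform both came up as possible factors, a short script along these lines could gather the relevant details in one place. This is only a sketch: the helper name collect_env_info is made up for illustration, and it assumes the package is installed under the distribution name pyntcore.

```python
import platform
from importlib import metadata


def collect_env_info(dist_name="pyntcore"):
    """Return a dict describing the interpreter, platform and installed package."""
    try:
        dist_version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        dist_version = None  # package is not installed in this environment
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "machine": platform.machine(),   # e.g. aarch64 on a 64-bit ARM board
        "system": platform.system(),
        "libc": "-".join(platform.libc_ver()),
        dist_name: dist_version,
    }


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

Pasting the output into the issue alongside the traceback would make it easier to compare the failing boards with a working setup.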
I think if you could identify what is throwing the bad_alloc that would be good. Might be some uninitialized variable on the C++ side. This might help you do that with gdb: https://stackoverflow.com/questions/6835728/how-to-break-when-a-specific-exception-type-is-thrown-in-gdb
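For anyone unfamiliar with gdb, the approach from that link can be wrapped in a small helper that builds the command line. This is a sketch under assumptions: gdb is installed on the device, repro.py is a placeholder name for the failing script, and the plain "catch throw" catchpoint stops at the first C++ exception of any type, which may or may not be the bad_alloc (the linked answer covers narrowing it to a specific type).

```python
import shlex
import sys


def gdb_catch_throw_command(script, python=sys.executable):
    """Build a gdb command line that stops at the first C++ throw and prints a backtrace."""
    return [
        "gdb",
        "-batch",               # exit once the -ex commands have run
        "-ex", "catch throw",   # stop whenever a C++ exception is thrown
        "-ex", "run",
        "-ex", "bt",            # print the backtrace at the catch point
        "--args", python, script,
    ]


if __name__ == "__main__":
    # repro.py is a placeholder for the script that triggers the error
    print(shlex.join(gdb_catch_throw_command("repro.py")))
```

The printed command can be copied into a shell on the device, or passed to subprocess.run as a list.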
The Python version is 3.10.6. I don't know much about gdb, but here's the backtrace from the exception:
#0 0x0000007ff66d2cdc in __cxa_throw () from /lib/aarch64-linux-gnu/libstdc++.so.6
#1 0x0000007ff66d3318 in operator new(unsigned long) () from /lib/aarch64-linux-gnu/libstdc++.so.6
#2 0x0000007ff61aab60 in nt::Value::MakeDoubleArray(std::span<double const, 18446744073709551615ul>, long) ()
from /home/orangepi/.local/lib/python3.10/site-packages/ntcore/lib/libntcore.so
#3 0x0000007ff61cdb44 in nt::SetDoubleArray(unsigned int, std::span<double const, 18446744073709551615ul>, long) ()
from /home/orangepi/.local/lib/python3.10/site-packages/ntcore/lib/libntcore.so
#4 0x0000007ff5ee8b10 in pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<void, nt::DoubleArrayPublisher, std::span<double const, 18446744073709551615ul>, long, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v, pybind11::call_guard<pybind11::gil_scoped_release>, pybind11::doc>(void (nt::DoubleArrayPublisher::*)(std::span<double const, 18446744073709551615ul>, long), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&, pybind11::call_guard<pybind11::gil_scoped_release> const&, pybind11::doc const&)::{lambda(nt::DoubleArrayPublisher*, std::span<double const, 18446744073709551615ul>, long)#1}, void, nt::DoubleArrayPublisher*, std::span<double const, 18446744073709551615ul>, long, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v, pybind11::call_guard<pybind11::gil_scoped_release>, pybind11::doc>(pybind11::cpp_function::initialize<void, nt::DoubleArrayPublisher, std::span<double const, 18446744073709551615ul>, long, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v, pybind11::call_guard<pybind11::gil_scoped_release>, pybind11::doc>(void (nt::DoubleArrayPublisher::*)(std::span<double const, 18446744073709551615ul>, long), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&, pybind11::call_guard<pybind11::gil_scoped_release> const&, pybind11::doc const&)::{lambda(nt::DoubleArrayPublisher*, std::span<double const, 18446744073709551615ul>, long)#1}&&, void (*)(nt::DoubleArrayPublisher*, std::span<double const, 18446744073709551615ul>, long), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&, pybind11::call_guard<pybind11::gil_scoped_release> const&, pybind11::doc 
const&)::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const [clone .isra.0] () from /home/orangepi/.local/lib/python3.10/site-packages/ntcore/_ntcore.cpython-310-aarch64-linux-gnu.so
#5 0x0000007ff5ec14c0 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) ()
from /home/orangepi/.local/lib/python3.10/site-packages/ntcore/_ntcore.cpython-310-aarch64-linux-gnu.so
#6 0x00000055556536f4 in ?? ()
#7 0x000000555564a0a0 in _PyObject_MakeTpCall ()
#8 0x0000005555662f9c in ?? ()
#9 0x0000005555640bf0 in _PyEval_EvalFrameDefault ()
#10 0x0000005555739a80 in ?? ()
#11 0x0000005555739904 in PyEval_EvalCode ()
#12 0x000000555576f1ec in ?? ()
#13 0x00000055557668d8 in ?? ()
#14 0x000000555576ee9c in ?? ()
#15 0x000000555576e004 in _PyRun_SimpleFileObject ()
#16 0x000000555576dca4 in _PyRun_AnyFileObject ()
#17 0x000000555575c9b0 in Py_RunMain ()
#18 0x000000555572ab08 in Py_BytesMain ()
#19 0x0000007ff7d173fc in ?? () from /lib/aarch64-linux-gnu/libc.so.6
#20 0x0000007ff7d174cc in __libc_start_main () from /lib/aarch64-linux-gnu/libc.so.6
#21 0x000000555572a9f0 in _start ()
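A note for anyone reading the backtrace: the number 18446744073709551615 in std::span<double const, 18446744073709551615ul> is not a corrupted size. It is the largest 64-bit unsigned value, which is what std::dynamic_extent is defined as, so it only marks a span whose length is known at run time. A quick check of the arithmetic (the helper name is made up for illustration):

```python
# std::dynamic_extent is the maximum value of std::size_t; on a 64-bit
# platform such as aarch64 that is 2**64 - 1.
DYNAMIC_EXTENT_64 = 2**64 - 1


def is_dynamic_extent(extent):
    """True if the template argument is the 64-bit dynamic-extent marker."""
    return extent == DYNAMIC_EXTENT_64


if __name__ == "__main__":
    print(is_dynamic_extent(18446744073709551615))
```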
That's a good start! There's no smoking gun yet, but I'll dig a bit more.
Good news! I upgraded my odroid-c2 to Ubuntu 22.04 and was able to duplicate your error. Won't have time to dig into it until maybe very late tonight.
So Peter found out that GCC changed the ABI for std::span between GCC 10 and 11. My Ubuntu 22.04 has GCC 11, and I'm guessing yours does too, but the WPILib artifacts are built with GCC 10.
Currently we only publish Python 3.9 aarch64 artifacts; I'm going to roll out 3.8-3.11 artifacts as well.
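To make the failure mode more concrete, here is a purely illustrative sketch of how two sides disagreeing about how a (pointer, length) pair is passed can turn into an enormous allocation request. The exact nature of the GCC change is not described in this thread, so the layout, the field order, and the example address below are all assumptions chosen for illustration, not a description of what libstdc++ actually does.

```python
import struct


def pack_span(pointer, length):
    """Caller's view (assumed): a span is two 64-bit fields, pointer then length."""
    return struct.pack("<QQ", pointer, length)


def read_length_mismatched(raw):
    """Hypothetical callee that disagrees about the layout and
    reads the first field as the length."""
    (length,) = struct.unpack_from("<Q", raw, 0)
    return length


# Example values only: an address-like number and a 3-element array.
raw = pack_span(0x7FF61AAB60, 3)
bogus_length = read_length_mismatched(raw)
bytes_requested = bogus_length * 8  # 8 bytes per double

if __name__ == "__main__":
    print(f"callee sees length {bogus_length}, asks for {bytes_requested} bytes")
```

A request of that size cannot be satisfied on a small board, which would be consistent with operator new throwing std::bad_alloc inside MakeDoubleArray as seen in the backtrace.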
I'm heading to bed, but I've started the deploy process. Hopefully by the end of the night there should be some aarch64 artifacts at https://tortall.net/~robotpy/wheels/2023/raspbian/. You should be able to:
pip install --find-links https://tortall.net/~robotpy/wheels/2023/raspbian/ pyntcore==2023.2.1.2
And that will install a python 3.10 wheel, and it shouldn't have the problem you reported. I haven't built the wheel locally, so I haven't tested this either... but it seems pretty likely that this will fix it.
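pip only picks a prebuilt wheel when its tag matches the running interpreter, so it can help to confirm which tag the device's Python expects before and after installing. A minimal sketch (the helper name cpython_tag is made up for illustration):

```python
import sys
import sysconfig


def cpython_tag(version_info=sys.version_info):
    """Build the 'cpXY' tag used in CPython wheel filenames, e.g. cp310."""
    return f"cp{version_info[0]}{version_info[1]}"


if __name__ == "__main__":
    print("interpreter tag:", cpython_tag())
    # EXT_SUFFIX is the filename suffix this interpreter expects for
    # compiled extension modules (compare with the _ntcore...so file
    # name in the backtrace earlier in the thread).
    print("extension suffix:", sysconfig.get_config_var("EXT_SUFFIX"))
```

On the Python 3.10.6 install reported above this should print cp310, which is the tag to look for among the wheel filenames on the index.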
Tried it on my odroid, that fixed it.
Problem description
When setting an array value (int[], float[], double[], boolean[], string[], or raw) using a publisher or entry, a MemoryError: std::bad_alloc is thrown. This does not occur for other types. Only tested on Raspbian.
Operating System
Raspbian
Installed Python Packages
Reproducible example code