iRobotEducation / irobot-edu-python-sdk

Python SDK for iRobot Edu robots (Root or Create 3)
BSD 3-Clause "New" or "Revised" License

Bluetooth very inconsistent in making connections #24

Closed shamlian closed 11 months ago

shamlian commented 12 months ago

Discussed in https://github.com/iRobotEducation/create3_docs/discussions/435

Originally posted by **tribelhb** August 22, 2023

### How are you connecting to your Create 3?
Bluetooth (Python SDK)

### Computer(s) Model(s) and Operating System(s)
M1 MacBook Pro Ventura 13.1

### Which version of ROS 2 is installed on your computer?
None

### Which firmware version is installed on your robot?
G.5.3

### Which RMW is your robot running?
I don't know

### What is the Adapter Board's USB/BLE Toggle currently switched to?
Bluetooth (default)

### Describe your question.
I renamed the bluetooth interface via the wifi hotspot to iRobot107 as I need to be able to specify which robot to connect to. Using the python SDK square example (16) and the play frequencies (19) I am occasionally able to get the code to connect, but it seems to get stuck on the disconnect. The SDK examples are very simple, so there seems to be a major issue in the reliability of these connections. I updated python from 3.9 to 3.11 and no effect. Any help would be appreciated.

I also tried plugging into the USB port and switching over to that form, but no serial device appeared under /dev Has anyone had better luck connecting over USB on linux? I will be mounting a raspberry pi for camera streaming, so it can easily be plugged into that port...

shamlian commented 12 months ago

The comments didn't migrate, so I'll put them here. Me:

Hi! Could you tell me a few more things about how you are using the robot? I think there might be some confusion about how the interfaces work. If you could provide a terminal log of what you're trying to do when you connect to BLE, and the error messages you get, that could be helpful. To be clear, are you running Linux on an M1 MacBook Pro? As far as plugging the robot directly into your Mac, that will likely not work, as the robot is a USB Host, and likely your computer is, as well. The USB-C connection on the robot is not a tty serial device.

Have you tried python.irobot.com? That maintains a continuous BLE connection, as opposed to attempting to disconnect and reconnect from example to example; knowing that would help me to help you debug this.

shamlian commented 12 months ago

tribelhb

Hello, thanks for the speedy reply. I have tried connecting from macOS on the M1 Mac, running Python from the terminal to connect over Bluetooth. I also tried the USB-C connection, but you are correct: it doesn’t appear.

Separately, I tried on a Raspberry Pi 4 running Ubuntu 23. It gave a clearer error log when running the Python code from the terminal (attached). (I tried the USB-C route there as well, to no effect.)

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/robot.py", line 282, in play
    self._loop.run_until_complete(self._main())
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/robot.py", line 156, in _main
    await self._backend.connect()
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/backend/bluetooth_desktop.py", line 43, in connect
    devices = await BleakScanner.discover(service_uuids=[self.ROOT_ID_SERVICE, self.UART_SERVICE])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/bleak/__init__.py", line 216, in discover
    async with cls(**kwargs) as scanner:
  File "/usr/local/lib/python3.11/dist-packages/bleak/__init__.py", line 126, in __aenter__
    await self._backend.start()
  File "/usr/local/lib/python3.11/dist-packages/bleak/backends/bluezdbus/scanner.py", line 191, in start
    self._stop = await manager.active_scan(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/bleak/backends/bluezdbus/manager.py", line 368, in active_scan
    assert_reply(reply)
  File "/usr/local/lib/python3.11/dist-packages/bleak/backends/bluezdbus/utils.py", line 20, in assert_reply
    raise BleakDBusError(reply.error_name, reply.body)
bleak.exc.BleakDBusError: [org.bluez.Error.InProgress] Operation already in progress

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pi/demo.py", line 18, in <module>
    robot.play()
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/robot.py", line 295, in play
    self._loop.run_until_complete(self._finished())
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/robot.py", line 107, in _finished
    await self._backend.disconnect()
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/backend/bluetooth_desktop.py", line 66, in disconnect
    await self._client.disconnect()
          ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'disconnect'
***@***.***:~$ python3 demo.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/robot.py", line 282, in play
    self._loop.run_until_complete(self._main())
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/robot.py", line 156, in _main
    await self._backend.connect()
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/backend/bluetooth_desktop.py", line 43, in connect
    devices = await BleakScanner.discover(service_uuids=[self.ROOT_ID_SERVICE, self.UART_SERVICE])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/bleak/__init__.py", line 216, in discover
    async with cls(**kwargs) as scanner:
  File "/usr/local/lib/python3.11/dist-packages/bleak/__init__.py", line 126, in __aenter__
    await self._backend.start()
  File "/usr/local/lib/python3.11/dist-packages/bleak/backends/bluezdbus/scanner.py", line 191, in start
    self._stop = await manager.active_scan(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/bleak/backends/bluezdbus/manager.py", line 368, in active_scan
    assert_reply(reply)
  File "/usr/local/lib/python3.11/dist-packages/bleak/backends/bluezdbus/utils.py", line 20, in assert_reply
    raise BleakDBusError(reply.error_name, reply.body)
bleak.exc.BleakDBusError: [org.bluez.Error.InProgress] Operation already in progress

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pi/demo.py", line 18, in <module>
    robot.play()
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/robot.py", line 295, in play
    self._loop.run_until_complete(self._finished())
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/robot.py", line 107, in _finished
    await self._backend.disconnect()
  File "/usr/local/lib/python3.11/dist-packages/irobot_edu_sdk/backend/bluetooth_desktop.py", line 66, in disconnect
    await self._client.disconnect()
          ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'disconnect'

Running from the web browser is probably not a possible solution because we are going to be running OpenCV and need the robot state machine code to be in the same code base. (Although maybe a socket connection to the browser is possible? I haven’t done that before).
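
The second exception in the log above (`'NoneType' object has no attribute 'disconnect'`) appears to be a follow-on error: the scan fails before a client object exists, and the cleanup path then calls `disconnect()` on `None`. Below is a minimal sketch of guarding that cleanup. `SketchBackend` and `run_once` are hypothetical stand-ins written for illustration, not the SDK's classes or its actual fix.

```python
import asyncio


class SketchBackend:
    """Hypothetical stand-in for a BLE backend (not the SDK's class)."""

    def __init__(self, connect_fails):
        self._client = None          # stays None until connect() succeeds
        self._connect_fails = connect_fails
        self.disconnected = False

    async def connect(self):
        if self._connect_fails:
            # Simulates the scan error seen in the first traceback
            raise RuntimeError("Operation already in progress")
        self._client = object()

    async def disconnect(self):
        # Guard: nothing to tear down if connect() never created a client
        if self._client is None:
            return
        self._client = None
        self.disconnected = True


async def run_once(backend):
    try:
        await backend.connect()
        return "connected"
    except RuntimeError as err:
        # Only the original connection error is reported
        return f"connect failed: {err}"
    finally:
        # Safe whether or not the connection was ever established
        await backend.disconnect()


if __name__ == "__main__":
    print(asyncio.run(run_once(SketchBackend(connect_fails=True))))
```

With the guard in place, a failed scan reports only the BlueZ error, so the root cause is easier to see.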

shamlian commented 12 months ago

I was asking about whether or not you tried the browser-based Python because I am trying to narrow down whether the problem is about OS disconnect or something else. In particular, I am concerned you may be running into #1 or #2. The difference between running in the browser and running in the shell is that in the browser, the connection is maintained between running different programs, while on the shell, it has to be closed and then reopened. Reconnecting in Linux, there is a known problem documented as #3 (but I had not seen it in OS X). What do things look like in OS X? As far as solving these issues, we have a potential patch that needs testing; if you are game to install a branch from source, we could have you try it. Let me know.

tribelhb commented 12 months ago

I will have to try and see if I can connect via the browser. On macOS the Python code will hang without any errors, and then a similar error message about NoneType not having a .disconnect method only appears after a Ctl-C, but it doesn't trace the code as well as it did on the Raspberry Pi. In the cases where it does connect, it has dropped the connection while commands are being sent. For example, the play tones example played about 8 times before going silent and hanging. I would be very willing to try a patch from source, though I won't be able to do so until Monday.

shamlian commented 12 months ago

Running G.5.2 (you said you were running G.5.3 but we haven't released such a version, yet), I cannot duplicate this problem on either Windows 10 or on an M1 Mac running Monterey (12.6.7). I don't have a Mac running Ventura. I can say that I am running into a different error on my Ubuntu machine which is running BlueZ 5.53; what version of BlueZ do you have (type bluetoothd -v or maybe bluetoothctl version)? I am concerned that this issue is going to be a somewhat gnarly OS-implementation-specific one, but we'll keep investigating.

shamlian commented 12 months ago

I have good news, and I have bad news. The bad news is that it seems there was a bug in BlueZ (clue was in https://github.com/hbldh/bleak/issues/1176#issuecomment-1360725995) which is causing the robot to not be able to connect. The good news is that it is fixed. I updated my laptop to BlueZ 5.66 and all is well. This is a little bit of work but it's not too bad. I did the following. YMMV and I apologize in advance if this screws up the Bluetooth GUI:

# zeroth, remove the current version of BlueZ
sudo apt autoremove bluez

# first, install dependencies
sudo apt install build-essential libreadline-dev libical-dev libdbus-1-dev libudev-dev libglib2.0-dev python3-docutils

# then, fetch, build, and install the newest BlueZ
wget http://www.kernel.org/pub/linux/bluetooth/bluez-5.66.tar.xz
tar xvf bluez-5.66.tar.xz
cd bluez-5.66
./configure
make
sudo make install
sudo systemctl daemon-reload
sudo systemctl unmask bluetooth.service
sudo systemctl restart bluetooth
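
Before rebuilding, it may help to check which BlueZ version is already installed. The following is a minimal sketch that parses the output of `bluetoothd -v` (the command suggested earlier in this thread) and compares it with 5.66. Treating 5.66 as the threshold is an assumption based only on the versions reported here, the helper names are made up for illustration, and `bluetoothd` may not be on `PATH` on every distribution.

```python
import re
import subprocess

# Assumption: 5.66 is the version reported to work in this thread
KNOWN_GOOD = (5, 66)


def parse_bluez_version(text):
    """Extract (major, minor) from output such as '5.66'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def installed_bluez_version():
    """Ask the local bluetoothd for its version (requires BlueZ on PATH)."""
    result = subprocess.run(
        ["bluetoothd", "-v"], capture_output=True, text=True, check=True
    )
    return parse_bluez_version(result.stdout)


def is_known_good(version):
    # Tuple comparison avoids the float pitfall where 5.9 > 5.66
    return version >= KNOWN_GOOD


if __name__ == "__main__":
    version = installed_bluez_version()
    print(version, "ok" if is_known_good(version) else "older than 5.66")
```

Comparing `(major, minor)` tuples rather than floats keeps a hypothetical version like 5.9 ordered before 5.66.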

As far as disconnection, I added a new branch, shamlian/catch_and_kill, which is (admittedly) a bit of a hammer, but it also seems to solve #3. Feel free to give it a shot and let me know if it helps! The PR for this patch is #25; happy to have your review.

tribelhb commented 11 months ago

Thank you for this. Updating BlueZ to 5.66 makes it so that after the program is exited forcefully using Ctl-C, the robot does make a sound to indicate that the disconnection was successful. It's not clear why the Ctl-C is required. I did get the error below on one run. Despite that single trace, I can now connect repeatedly so long as I give it the keyboard interrupt.

(Also, I can confirm the firmware version was a typo and it is G.5.2.)

Connecting to iRobot108 (00:16:A4:D2:1D:AE)
Traceback (most recent call last):
  File "/home/pi/irobot-edu-python-sdk-shamlian-catch_and_kill/demo.py", line 18, in <module>
    robot.play()
  File "/home/pi/irobot-edu-python-sdk-shamlian-catch_and_kill/irobot_edu_sdk/robot.py", line 287, in play
    self._loop.run_until_complete(self._main())
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/pi/irobot-edu-python-sdk-shamlian-catch_and_kill/irobot_edu_sdk/robot.py", line 161, in _main
    await self._backend.connect()
  File "/home/pi/irobot-edu-python-sdk-shamlian-catch_and_kill/irobot_edu_sdk/backend/bluetooth_desktop.py", line 59, in connect
    if await self._client.connect():
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/bleak/__init__.py", line 471, in connect
    return await self._backend.connect(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/bleak/backends/bluezdbus/client.py", line 190, in connect
    assert_reply(reply)
  File "/usr/local/lib/python3.11/dist-packages/bleak/backends/bluezdbus/utils.py", line 20, in assert_reply
    raise BleakDBusError(reply.error_name, reply.body)
bleak.exc.BleakDBusError: [org.bluez.Error.Failed] le-connection-abort-by-local

shamlian commented 11 months ago

Regarding "it's not clear why the Ctl-C is required": to quote from the README on this repo, "This SDK uses a similar format as Learning Level 3 of the iRobot® Coding App (code.irobot.com) and the iRobot® Python Web Playground (python.irobot.com)." The learn-to-code apps maintain a continuous connection with the robots, and use the "play" and "stop" buttons to start and stop code execution. The downloadable SDK doesn't have a separate daemon to keep the connection with the robot alive, so it necessarily has to start and stop that connection at the beginning and end of execution. However, the model of the iRobot Coding app does not have a programmatic way to stop execution within itself, so the program has to be killed with a ^C (as if one were pressing the "stop" button), since all tasks are concurrent and don't know anything about each other's completion. We have discussed adding a stop block; there is a secret (shh, don't tell anyone) stop_program method in utils that might do what you want -- just from .utils import stop_program at the top of your program.
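
To illustrate the point about concurrent tasks that know nothing of each other's completion, here is a minimal, self-contained asyncio sketch of what a "stop block" could look like. It does not use the SDK and is not how `stop_program` is implemented; `blink`, `drive`, and `run_program` are hypothetical names, and the final log entry merely stands in for the real disconnect step.

```python
import asyncio


async def blink(log, stop):
    # A handler that would run forever on its own
    while not stop.is_set():
        log.append("blink")
        await asyncio.sleep(0.01)


async def drive(log, stop):
    # A handler that finishes, then asks the whole program to stop
    for step in range(3):
        log.append(f"drive {step}")
        await asyncio.sleep(0.01)
    stop.set()  # plays the role of a "stop block"


async def run_program():
    log = []
    stop = asyncio.Event()
    tasks = [
        asyncio.create_task(blink(log, stop)),
        asyncio.create_task(drive(log, stop)),
    ]
    await stop.wait()            # without this signal, only ^C would end it
    for task in tasks:
        task.cancel()            # no effect on tasks that already finished
    await asyncio.gather(*tasks, return_exceptions=True)
    log.append("disconnect")     # placeholder for closing the BLE link
    return log


if __name__ == "__main__":
    print(asyncio.run(run_program()))
```

The shared `asyncio.Event` is the only thing the tasks have in common, which is roughly the piece that is missing when the program can only be ended from the keyboard.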

shamlian commented 11 months ago

Do you need any more help, or can I close this ticket?

tribelhb commented 11 months ago

Thanks for the hint--definitely need that ability to send a disconnect command without hitting ctl-c! I just wanted you to be able to close out with a little more diagnostic info. I did extensive testing and I have it connecting successfully on Ubuntu 23 with BlueZ 5.66 (although there does seem to be a delay of many seconds after disconnecting before it will reconnect). It does not work consistently on Ubuntu 22 LTS with BlueZ 5.64, nor on Raspberry Pi OS with BlueZ 5.55. It is also sporadic on an M1 Mac running macOS 13.1.
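
One way to live with a reconnect delay like the one described above is to retry the connection with growing pauses. This is a minimal sketch, not part of the SDK: `connect_with_retry` is a hypothetical helper, `connect` is any coroutine function supplied by the caller, and the default delays are guesses rather than measured values.

```python
import asyncio


async def connect_with_retry(connect, attempts=6, first_delay=2.0, factor=2.0):
    """Call `connect()` until it succeeds or `attempts` is exhausted."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return await connect()
        except Exception:
            # Broad catch for the sketch; narrow it to the BLE errors you see
            if attempt == attempts:
                raise
            await asyncio.sleep(delay)
            delay *= factor  # wait longer before each new attempt


if __name__ == "__main__":
    async def always_ok():
        return "connected"

    print(asyncio.run(connect_with_retry(always_ok)))
```

With the defaults, the pauses add up to about a minute (2 + 4 + 8 + 16 + 32 seconds) before the last error is re-raised.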

shamlian commented 11 months ago

There is also an improved disconnect() method in 0.4.0, which we plan to release soon. You can check out the pre_0.4.0 branch for the fix.

shamlian commented 11 months ago

Feel free to open an issue for the BlueZ 30 s delay; I have also experienced it and I think I have an idea of why it is happening.

shamlian commented 11 months ago

There is a bug in CoreBluetooth on macOS 13.0 which was not fixed until macOS 13.3. We've done everything we can to work around it, but unfortunately that's something that's a little bit broken in the OS. More info here.

tribelhb commented 11 months ago

Thank you on all counts! I'll make a new thread if anything else crops up as we use it this fall.