Open risa2000 opened 5 years ago
Just this evening I found a possible explanation for the main UART dropping bytes, even with flow control enabled. Assuming the fix works, I'd expect it to resolve at least some of the issues people are seeing.
cool, looking forward to seeing the pull request :)
A patch has just gone into rpi-4.19.y: https://github.com/raspberrypi/linux/commit/65aa6ec0faaa012508489886ac357cbb86cdb9a4 It shouldn't break anything, and I think there's a good chance the data loss is fixed (although there may be a better implementation - this is more of a workaround).
Cool, thanks @pelwell, I can test it! What's the fastest way to run this kernel on HassOS? Or is it easier with Raspbian?
There's likely to be a firmware build by the end of the day that will be available via rpi-update. Or you can build it yourself.
rpi-update contains the potential fix. Can you test it?
Unfortunately I'm abroad until Monday and, as I said, the environment where I can easily reproduce the bug is the Home Assistant distro HassOS, which does not have rpi-update. I have to figure out how to update the firmware there. If someone has ideas...
$HCIATTACH /dev/serial1 bcm43xx 460800 noflow - $BDADDR
Did the change on HassOS with Raspberry 3 and seems to work
Can you please explain how this was done?
Having this exact same issue. I updated from 4.19.86-v7+ to 4.19.97-v7+ (fdb5c37e330e7cb3027ac4fcc5b1cd5f244b351f). The "Frame reassembly failed" issue is still there and keeps spamming my syslog with these messages, but I'll check whether it stops crashing the Bluetooth adapter now.
This problem is not solved yet with the latest firmware! Same issues occur, my bluetooth hardware crashed 8 hours ago.
$HCIATTACH /dev/serial1 bcm43xx 460800 noflow - $BDADDR
Did the change on HassOS with Raspberry 3 and seems to work
Can you please explain how this was done?
For a permanent change you have to modify the read-only HassOS filesystem: https://unix.stackexchange.com/questions/8907/modifying-a-squashfs#8925
For a quicker but temporary solution:
login
ps aux | grep hciattach
and copy the running command
killall hciattach
nohup hciattach /dev/serial1 bcm43xx 460800 noflow - YOURMACADDRESS &
You might have to restart Home Assistant. Make sure not to reboot the system, otherwise the change is lost.
It's disappointing that there appear to be Bluetooth issues beyond the UART data loss, but I had to eliminate that possibility first.
I’ve upgraded to HassOS 3.9 and I seem to be fixed. No failures yet!
I've been on HassOS 3.9 for a few days and I'm still seeing the issue. How long have you gone without error?
I spoke too soon, it literally just went to Unavailable! It actually went for a good 8 hours or so.
Can I do a quick survey of failure scenarios? The following information would be helpful:
Kernel version (uname -a): Linux a0d7b954-ssh 4.19.93-v7 #1 SMP Mon Feb 3 19:47:23 UTC 2020 armv7l Linux
OS version (e.g. Raspbian Buster - see /etc/os-release if you aren't sure): alpine
Relevant non-standard configuration (HostAP, pulseaudio, ofono etc.): N/A
Bluetooth usage (what you are using it for, approximate data rate): Miflora sensor monitoring
WiFi usage (onboard or external, approximate data rate, output of iwconfig wlan0): Not used, eth0 only used
Approximate average time to failure: Since HassOS 3.9 update, around 8 hours. Before this, within minutes.
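For anyone else answering the survey, the machine-readable fields can be collected in one go. This is only a minimal sketch for a POSIX shell: the function name `bt_survey` is made up for illustration, and it assumes `/proc/device-tree/model` and `/etc/os-release` may be missing on some systems, in which case it prints "unknown".

```shell
#!/bin/sh
# Hypothetical helper: print the system details requested in the survey.
bt_survey() {
    # Pi model; the device-tree string is NUL-terminated, so strip the NUL.
    if [ -r /proc/device-tree/model ]; then
        model=$(tr -d '\0' < /proc/device-tree/model)
    else
        model="unknown"
    fi
    echo "Model of Pi: $model"

    echo "Kernel version: $(uname -a)"

    # OS name from /etc/os-release, read in a subshell so no variables leak.
    if [ -r /etc/os-release ]; then
        os=$(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")
    else
        os="unknown"
    fi
    echo "OS version: $os"
}
```

Calling `bt_survey` prints three labelled lines; the remaining fields (configuration, Bluetooth and WiFi usage, time to failure) still need to be filled in by hand.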
I've added Model of Pi to the list - I assume yours is a Zero W, @talondnb?
It's a Pi 3B. Non-plus model.
Model of Pi: 3B+
Kernel version: Linux a0d7b954-ssh 4.19.93-v7 #1 SMP Mon Feb 3 19:47:23 UTC 2020 armv7l Linux
OS version: Alpine 3.11
Relevant non-standard configuration: Running hass.io
Bluetooth usage: Presence detection by tracking nearby Bluetooth devices
WiFi usage: None
Approximate average time to failure: Seems to be completely random. Could be 5 minutes, could be 5 hours.
Thanks. Does "Presence detection by tracking nearby Bluetooth devices" cover regular Bluetooth, BLE or both?
Model: 3B
Kernel: Linux hassio 4.19.93-v7 #1 SMP Sun Jan 12 16:02:44 UTC 2020 armv7l Hassio/OS
OS: HassOS 3.8
Relevant non-standard configuration: Running hass.io, connected with USB CC2531 (zigbee2mqtt.io)
Bluetooth usage: Presence detection of mobile phones. Bluetooth, not BLE (at least I think: "platform: bluetooth_tracker")
WiFi usage: none
Approximate average time to failure: Random. 5min-5h

Model: 3B
Kernel: Linux pi 4.19.97-v7+ #1294 SMP Thu Jan 30 13:15:58 GMT 2020 armv7l GNU/Linux
OS: Raspbian Buster
Relevant non-standard configuration: N/A
Bluetooth usage: Collecting BLE sensor data through EspruinoHub – listens to BLE advertise packets, and sends them to MQTT
WiFi usage: none (turned off via dtoverlay=disable-wifi)
Approximate average time to failure: 30min–12h
Thanks. Does "Presence detection by tracking nearby Bluetooth devices" cover regular Bluetooth, BLE or both?
Just regular Bluetooth
I've got an installation of the hassio Home Assistant, since that seems like it might be a route to reproduce the failures. It booted up in Hebrew, but I've got past that now. Can someone give me a quick guide to configuring the presence detection?
I'm relying on my memory and 2-year old config, anyone, please add/correct.
Add to (existing) /config/configuration.yaml (e.g. to the end of the file):
device_tracker:
  - platform: bluetooth_tracker
    new_device_defaults:
      track_new_devices: true
Restart HA (in UI, Configuration --> Server control --> restart). That's it. Or the hard reboot in CLI "hassio host reboot" (reboot whole computer). I sometimes have to do this, the UI restart just never comes back up.
Optionally, if the device doesn't appear shortly in your UI, you can also create "known_devices.yaml" in the /config folder (note the mac address starts with "BT_"):
myiphone:
  mac: BT_11:22:33:44:55:66
  name: MyPrecious
--> the device tracker should appear in your UI. You can check the state in the UI: Developer tools --> States. There should be a "device_tracker.myiphone" if you created known_devices.yaml. Otherwise the myiphone part can be something else, I can't remember. Possibly the mac address.
I suppose you're familiar with yaml, but anyway the common note, every space matters. Copy the above as is.
Model of Pi: Raspberry Pi 3 Model B Rev 1.2
Kernel version: Linux hassio 4.19.93-v7 #1 SMP Sun Jan 12 16:02:44 UTC 2020 armv7l Hassio/OS
OS version: HassOS 3.8
Relevant non-standard configuration: Running hass.io
Bluetooth usage: Reading Xiaomi BLE Temperature Humidity sensor
WiFi usage: Connected to wifi router
Approximate average time to failure: 2-4 hours more or less

Model of Pi: 3B
Kernel version: Linux pi 4.19.97-v7+
OS version: Raspbian Buster
Relevant non-standard configuration: Running hass.io
Bluetooth usage: Reading Xiaomi BLE Temperature Humidity sensor
WiFi usage: None
Approximate average time to failure: 4-8 hours
Side note: I'm running this non-permanent fix now: "nohup hciattach /dev/serial1 bcm43xx 460800 noflow - YOURMACADDRESS &". I still get errors in my syslog, but not as many, and it hasn't crashed for days.
@johtajajake Thanks for the instructions. Unfortunately my installation (hassos_rpi3-3.9.img) doesn't have a "/config" directory, or anything similar:
# uname -a
Linux hassio 4.19.93-v7 #1 SMP Mon Feb 3 19:47:23 UTC 2020 armv7l Hassio/OS
# cat /etc/os-release
NAME=HassOS
VERSION="3.9 (RaspberryPi 3)"
ID=hassos
VERSION_ID=3.9
PRETTY_NAME="HassOS 3.9"
CPE_NAME=cpe:2.3:o:home_assistant:hassos:3.9:*:production:*:*:*:rpi3:*
HOME_URL=https://hass.io/
VARIANT="HassOS RaspberryPi 3"
VARIANT_ID=rpi3
That's strange. Did you follow the installation instructions in https://www.home-assistant.io/hassio/installation/ or some other way? See #7 in the instructions, even that's assuming there is the /config. Does the UI work? If yes, then the configuration.yaml must be somewhere.
I did follow the instructions, but I think my installation was broken from the start - possibly a bad card write. I thought it was suspicious when the UI came up in Hebrew.
With a clean install of 3.10 the config directory is there and I've got my phone being detected by Bluetooth.
Running Hassio on a 3B+ (which has flow control to the Bluetooth modem) I've found to be reliable (no problems in 24 hours), while on a 3B (with no flow control - this was the last design before we added the GPIO expander, freeing some pins on the SoC) I see the kind of instability that others have reported. I don't understand why Hassio is showing the problem more than Raspbian, but perhaps it is a scheduling issue to do with the kinds of workloads that Hassio requires.
I think Hassio should be modified to only use a baud rate of 460800 on a 3B, as that does make it much more reliable.
$HCIATTACH /dev/serial1 bcm43xx 460800 noflow - $BDADDR
Did the change on HassOS with Raspberry 3 and seems to work
Can you please explain how this was done?
For a permanent change you have to modify the read-only HassOS filesystem: https://unix.stackexchange.com/questions/8907/modifying-a-squashfs#8925
For a quicker but temporary solution:
- make sure to have ssh access to HassOS (not with the plugin)
- login
- ps aux | grep hciattach and copy the running command
- killall hciattach
- run the copied command replacing the baud rate with 460800, prefixing nohup and postfixing with &:
nohup hciattach /dev/serial1 bcm43xx 460800 noflow - YOURMACADDRESS &
You might have to restart Home Assistant. Make sure not to reboot the system, otherwise the change is lost.
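The "copy the running command and replace the baud rate" step can be scripted rather than edited by hand. A minimal sketch, assuming the copied command has the same word order as the one above (baud rate as the fourth word); `rewrite_baud` is a made-up helper name and 921600 in the comment is only an assumed starting value, not something reported in this thread.

```shell
#!/bin/sh
# Hypothetical helper: take a copied hciattach command line and print it
# with the baud rate (assumed to be the 4th word) replaced.
# Usage: rewrite_baud "<command line>" <new baud>
rewrite_baud() {
    printf '%s\n' "$1" | awk -v baud="$2" '{ $4 = baud; print }'
}

# Example (921600 is an assumed original value):
#   rewrite_baud "hciattach /dev/serial1 bcm43xx 921600 noflow - YOURMACADDRESS" 460800
#
# To start the rewritten command detached, after killall hciattach:
#   nohup $(rewrite_baud "$old_cmd" 460800) >/dev/null 2>&1 &
```

Redirecting output to /dev/null in the last line means nohup does not need to create a nohup.out file, which matters when the current directory is not writable.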
This workaround seems stable with the speed limited to 38400: no "Frame reassembly failed" for 2 days, instead of a few hours...
Running Hassio on a 3B+ (which has flow control to the Bluetooth modem) I've found to be reliable (no problems in 24 hours), while on a 3B (with no flow control - this was the last design before we added the GPIO expander, freeing some pins on the SoC) I see the kind of instability that others have reported. I don't understand why Hassio is showing the problem more than Raspbian, but perhaps it is a scheduling issue to do with the kinds of workloads that Hassio requires.
I think Hassio should be modified to only use a baud rate of 460800 on a 3B, as that does make it much more reliable.
I had RPi3B with Hassio and RPi3B+ with Hassbian. I switched them over. Now both have had stable BT for a couple of days. Which is nice. Of course doesn't solve the problem in hassio (I'd like to update hassbian to hassio). But a great workaround! Thanks!
Hello, I have hassio on an RPi3B, and I get an error when running the nohup process:
ha > login
ps aux | grep hciattach
root 7586 0.0 0.0 3180 460 pts/0 S+ 12:19 0:00 grep hciattach
killall hciattach
killall: hciattach: no process killed
nohup hciattach /dev/serial1 bcm43xx 38400 noflow - B8:27:EB:0A:xx:xx &
nohup: can't open '/root/nohup.out': Read-only file system
[1]+ Done(127) nohup hciattach /dev/serial1 bcm43xx 38400 noflow - B8:27:EB:0A:xx:xx
After this I restarted Home Assistant, but Bluetooth still isn't working. Any ideas?
Thanks
Hi, It seems hciattach didn't start. nohup tried to write on /root... only user root can write on /root... Try to restart hciattach from /home/pi and check if the process is running, then restart HA
Hi Roy, how can I get access to modify it? I'm root...
Hello again, I bought a TP-Link UB400 and disabled the RPi3 Bluetooth with: dtoverlay=pi3-miniuart-bt. I restarted the whole system, but the problems continue. Any help, please? Thanks!!
This is happening on my setup as well: RPi 3 Model B with Buster Lite, using onboard Bluetooth. When streaming audio I get the timeout from hciconfig -a, then the Bluetooth peers disconnect (a speaker in my case). Modifying /usr/bin/btuart, for a permanent change, and setting the baud rate to 500000 seems okay (so far). With the 460800 rate my audio streaming quality was poor.
how can I change this parameter?
thanks
Use sudo to start your editor - "sudo nano /usr/bin/btuart", "sudo vi /usr/bin/btuart" etc.
Unless you are running with a read-only rootfs, in which case the answer will depend on the distribution you are running (Raspbian doesn't do this).
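If you would rather not hand-edit the file, the change can be previewed with sed first. A minimal sketch, assuming your /usr/bin/btuart contains a line of the form `$HCIATTACH /dev/serial1 bcm43xx <baud> noflow - $BDADDR` as quoted earlier in this thread; `set_noflow_baud` is a made-up helper name, and it only prints the modified text rather than changing the file.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with the baud rate on every
# "bcm43xx <baud> noflow" line replaced by BAUD. Lines that use
# "flow" instead of "noflow" are left untouched.
# Usage: set_noflow_baud FILE BAUD
set_noflow_baud() {
    sed -E "s/(bcm43xx[[:space:]]+)[0-9]+([[:space:]]+noflow)/\\1$2\\2/" "$1"
}

# Example: preview the result before replacing anything.
#   set_noflow_baud /usr/bin/btuart 460800 > /tmp/btuart.new
#   diff /usr/bin/btuart /tmp/btuart.new
```

Once the diff looks right, the new file can be copied over the original with sudo. Keep a backup, since a later package update may overwrite the change anyway.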
I'm using HASSos
ha > login
sudo vi /usr/bin/btuart
/bin/ash: sudo: not found
:(
How to use HASSos is outside the scope of the issue.
Then I can't do anything?
@lucagiove From a quick search of the home-assistant issues page, you seem to have managed to edit /usr/bin/btuart. Can you explain to @mansig88 how you did that?
thanks @pelwell !!
Take care: the modification of btuart I made last year on my Pi was overwritten by the update from Raspbian 9 to Raspbian 10. It seems that Buster does not yet hold the correct code/config values.
But the modification of /usr/bin/btuart still works, hopefully ;-)
but how can I change on HassOS??
@mansig88 it's a bit advanced and the first update will overwrite the changes, so it's not really worth it. As soon as I have some spare time I'll try to see how I can submit a PR to change it in HassOS.
thank you so much @lucagiove !!!
After the last system upgrade of Raspbian to:
I am observing a problem with bluetooth, which, after some time, stops working. The kernel log shows repeating message
or sometimes also a stack dump (I do not have one right now).
The RPi is 3B the dmesg log is here: dmesg.log
The HCI configuration is:
Now my BT usage is a bit specific. The main role for this RPi is running homeassistant, which means (among others) having: USB flash stick (with f2fs for logging), Z-Wave USB stick for home automation, Zigbee USB stick for home automation, and also using built-in BT for tracking BT hygrothermo devices. Apart from that, I also (ab)use the built-in BT for controlling Valve's lighthouses. Both the hygrothermo devices and the lighthouses use BTLE protocol. HT devices are read-out every 2 minutes, the lighthouses (when running) are talked to every 20 seconds.
When I do not run the lighthouses there is no communication with them and the error seems far less likely to happen. When I run a VR session, and run the lighthouse control, after some time, the BT becomes unresponsive, the errors are logged in the kernel log and the only resolution is a reboot.
Before the last system (and I assume also the firmware) upgrade, the system worked fine in exactly the same configuration for several months.