go-ble / ble

Bluetooth Low Energy for Linux / macOS
BSD 3-Clause "New" or "Revised" License
298 stars 107 forks source link

Stuck in linux/hci.(*HCI).handleLEAdvertisingReport #105

Open s5i opened 1 year ago

s5i commented 1 year ago

Every now and then my binary gets stuck in the following loop: (linux/hci/hci.go:418)

for idx := h.adLast - 1; idx != h.adLast; idx-- {
    if idx == -1 {
        idx = len(h.adHist) - 1
    }
    if h.adHist[idx] == nil {
        break
    }
    if h.adHist[idx].Addr().String() == sr.Addr().String() {
        h.adHist[idx].setScanResponse(sr)
        a = h.adHist[idx]
        break
    }
}

Consider the following conditions:

h.adLast == 0
len(h.adHist) == 1
h.adHist[0] != nil
h.adHist[0].Addr().String() != sr.Addr().String()

Then:

for idx := h.adLast - 1; idx != h.adLast; idx-- {
  // idx == -1
  if idx == -1 { // yup
    idx = len(h.adHist) - 1
    // idx = 0
  }
  if h.adHist[idx] == nil { /* nope */ }
  if h.adHist[idx].Addr().String() == sr.Addr().String() { /* nope */ }
  // idx-- at every iteration; we're back to idx == -1; the loop runs with the same values
}
jovandeginste commented 1 year ago

len(h.adHist) is a fixed value 128 (https://github.com/go-ble/ble/blob/master/linux/hci/gap.go#L32);

jovandeginste commented 1 year ago

I rewrote this code locally as a map instead of a fixed-length-array; if the problem was there, it should now be gone. The code is more readable now...

diff --git a/vendor/github.com/go-ble/ble/linux/hci/gap.go b/vendor/github.com/go-ble/ble/linux/hci/gap.go
index 85ea57f..dc725fc 100644
--- a/vendor/github.com/go-ble/ble/linux/hci/gap.go
+++ b/vendor/github.com/go-ble/ble/linux/hci/gap.go
@@ -29,8 +29,8 @@ func (h *HCI) Scan(allowDup bool) error {
        h.params.scanEnable.FilterDuplicates = 0
    }
    h.params.scanEnable.LEScanEnable = 1
-   h.adHist = make([]*Advertisement, 128)
-   h.adLast = 0
+   h.adHist = map[string]*Advertisement{}
+
    return h.Send(&h.params.scanEnable, nil)
 }

diff --git a/vendor/github.com/go-ble/ble/linux/hci/hci.go b/vendor/github.com/go-ble/ble/linux/hci/hci.go
index d673300..e080479 100644
--- a/vendor/github.com/go-ble/ble/linux/hci/hci.go
+++ b/vendor/github.com/go-ble/ble/linux/hci/hci.go
@@ -90,16 +90,15 @@ type HCI struct {
    addr    net.HardwareAddr
    txPwrLv int

-   // adHist and adLast track the history of past scannable advertising packets.
+   // adHist tracks the history of past scannable advertising packets.
    // Controller delivers AD(Advertising Data) and SR(Scan Response) separately
    // through HCI. Upon receiving an AD, no matter it's scannable or not, we
    // pass a Advertisement (AD only) to advHandler immediately.
    // Upon receiving a SR, we search the AD history for the AD from the same
    // device, and pass the Advertisiement (AD+SR) to advHandler.
-   // The adHist and adLast are allocated in the Scan().
+   // The adHist is allocated in the Scan().
    advHandler ble.AdvHandler
-   adHist     []*Advertisement
-   adLast     int
+   adHist     map[string]*Advertisement

    // Host to Controller Data Flow Control Packet-based Data flow control for LE-U [Vol 2, Part E, 4.1.1]
    // Minimum 27 bytes. 4 bytes of L2CAP Header, and 23 bytes Payload from upper layer (ATT)
@@ -408,30 +407,18 @@ func (h *HCI) handleLEAdvertisingReport(b []byte) error {
            fallthrough
        case evtTypAdvScanInd:
            a = newAdvertisement(e, i)
-           h.adHist[h.adLast] = a
-           h.adLast++
-           if h.adLast == len(h.adHist) {
-               h.adLast = 0
-           }
+           h.adHist[a.Addr().String()] = a
        case evtTypScanRsp:
            sr := newAdvertisement(e, i)
-           for idx := h.adLast - 1; idx != h.adLast; idx-- {
-               if idx == -1 {
-                   idx = len(h.adHist) - 1
-               }
-               if h.adHist[idx] == nil {
-                   break
-               }
-               if h.adHist[idx].Addr().String() == sr.Addr().String() {
-                   h.adHist[idx].setScanResponse(sr)
-                   a = h.adHist[idx]
-                   break
-               }
-           }
+
+           e, found := h.adHist[sr.Addr().String()]
+           if !found {
                // Got a SR without having received an associated AD before?
-           if a == nil {
                return fmt.Errorf("received scan response %s with no associated Advertising Data packet", sr.Addr())
            }
+
+           a = e
+           a.setScanResponse(sr)
        default:
            a = newAdvertisement(e, i)
        }
estutzenberger commented 7 months ago

Thanks for the changes. My main concern with this is that it could grow unbounded. I do not know have a history on the original implementation, but I suspect the bounded list was by choice. To keep with a map requires some sort of LRU that removes the oldest items once some limit is reached. Where as the previous implementation attended to both problems by using a circular buffer of advertisements.

jovandeginste commented 7 months ago

We may add a count of the number of keys as a safeguard?

estutzenberger commented 7 months ago

How is a full map handled after that? Usually I would get rid of the oldest item. Without that knowledge, you'd have to randomly delete. With that knowledge, you have to comb through the list to find it or keep track of the oldest item someway (which is what an LRU does). However, that seems like a lot of complexity for a section of code that should have low latency.

estutzenberger commented 7 months ago

To be clear, I'm not against making changes here.

jovandeginste commented 7 months ago

No I see your point. A solution to the unbounded grow will make it more complicated (it was more complicated before).

On the other hand, this code (with the map) runs for many months now, with little issues...

estutzenberger commented 7 months ago

For some projects the change may be fine. However, for example, we use this library in a product that sees 1000s of advertisements a day from devices which change their advertising data every 15 minutes. Very quickly this map would grow large on a resource constrained edge device. So I propose at least adding a light LRU such as this one to ensure the map doesn't grow unbounded: https://github.com/hashicorp/golang-lru/tree/main/simplelru