wiglenet / wigle-wifi-wardriving

Nethugging client for Android, from wigle.net
https://wigle.net
BSD 3-Clause "New" or "Revised" License
696 stars 210 forks

Record Bluetooth UUID16s #614

Open XenoKovah opened 1 year ago

XenoKovah commented 1 year ago

This is a request to start collecting additional information from Bluetooth Low Energy Advertisements & Bluetooth Classic / EDR Extended Inquiry Responses, in order to better determine which company a device is associated with.

One of the types of information that can be advertised by a device is a UUID16. These can appear in either "complete" (type 3, https://developer.android.com/reference/android/bluetooth/le/ScanRecord#DATA_TYPE_SERVICE_UUIDS_16_BIT_COMPLETE) or "incomplete" / "partial" (type 2, https://developer.android.com/reference/android/bluetooth/le/ScanRecord#DATA_TYPE_SERVICE_UUIDS_16_BIT_PARTIAL) arrays of uint16s.
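To make the layout concrete, here is a minimal Python sketch of pulling UUID16s out of a raw advertisement or EIR payload. It assumes the usual length-type-value layout with little-endian UUID16 values; the function names and the example payload are made up for illustration and are not taken from the app.

```python
# AD type values as listed in the Bluetooth Assigned Numbers document.
AD_TYPE_UUID16_INCOMPLETE = 0x02
AD_TYPE_UUID16_COMPLETE = 0x03


def iter_ad_structures(payload: bytes):
    """Yield (ad_type, data) for each length-type-value entry in the payload."""
    i = 0
    while i < len(payload):
        length = payload[i]
        if length == 0 or i + 1 + length > len(payload):
            break  # zero padding or a truncated entry ends parsing
        yield payload[i + 1], payload[i + 2 : i + 1 + length]
        i += 1 + length


def extract_uuid16s(payload: bytes):
    """Return a list of (ad_type, uuid16) for every type 2 or type 3 entry."""
    found = []
    for ad_type, data in iter_ad_structures(payload):
        if ad_type in (AD_TYPE_UUID16_INCOMPLETE, AD_TYPE_UUID16_COMPLETE):
            # Each UUID16 is two bytes, little-endian; ignore a dangling odd byte.
            for j in range(0, len(data) - 1, 2):
                found.append((ad_type, int.from_bytes(data[j : j + 2], "little")))
    return found


# Hypothetical payload: a flags entry, then an incomplete list holding
# 0xfd63 and 0x1122.
example = bytes.fromhex("020106" "050263fd2211")
print([(t, hex(u)) for t, u in extract_uuid16s(example)])
```

Keeping the AD type next to each UUID16 preserves the complete/incomplete distinction for later analysis.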

While the UUIDs are called "Service Class" UUIDs, usually when they are lower numbers (0x1XXX) and increasing they correspond to services (which can be looked up with https://bitbucket.org/bluetooth-SIG/public/src/main/assigned_numbers/uuids/service_uuids.yaml), but when they are higher numbers (0xFEXX) and decreasing they correspond to company IDs (which can be looked up with https://bitbucket.org/bluetooth-SIG/public/src/main/assigned_numbers/uuids/member_uuids.yaml).
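As a rough sketch of that lookup, the snippet below checks a UUID16 against two small tables. The tables contain only values mentioned in this thread and are hand-copied for illustration; a real implementation would presumably load the full service_uuids.yaml and member_uuids.yaml files instead. The function name is hypothetical.

```python
# Tiny hand-copied subsets, limited to values mentioned in this thread.
SERVICE_UUIDS = {
    0x1122: "Basic Printing",
}
MEMBER_UUIDS = {
    0xFD63: "Fitbit",
    0xFDD2: "Bose",
    0xFE03: "Amazon.com Services, Inc.",
    0xFE26: "Google LLC",
    0xFE2C: "Google LLC",
}


def describe_uuid16(uuid16: int):
    """Return (kind, name), where kind is 'member', 'service', or 'unknown'."""
    if uuid16 in MEMBER_UUIDS:
        return ("member", MEMBER_UUIDS[uuid16])
    if uuid16 in SERVICE_UUIDS:
        return ("service", SERVICE_UUIDS[uuid16])
    return ("unknown", None)


for value in (0xFD63, 0x1122, 0xABCD):
    print(hex(value), describe_uuid16(value))
```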

The below screenshots are taken from looking at HCI logs, collected by the Linux btmon tool, in Wireshark (not raw packet captures.) However, the data is structured equivalently to what would be within a packet. And advertised information is expected to ultimately be exposed to applications within Android.

Here is an example of the Fitbit company ID UUID16 (0xfd63) appearing as part of an "incomplete" (type 2) list: Pasted Graphic

Here is an example of a service (0x1122 - Basic Printing) appearing in an "incomplete" (type 2) list: (for a Tesla, based on the name (^S[0-9a-f]{16}C$)- https://trifinite.org/Downloads/20220916_tempa_presentation_sec-t_public.pdf)

Pasted Graphic 3
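For reference, the name pattern quoted above can be applied like this in Python. The sample names are invented, and the helper name is only illustrative.

```python
import re

# Name pattern quoted above for Tesla vehicles: "S", 16 lowercase hex digits, "C".
TESLA_NAME = re.compile(r"^S[0-9a-f]{16}C$")


def looks_like_tesla(name: str) -> bool:
    """True if the advertised name matches the pattern exactly."""
    return TESLA_NAME.fullmatch(name) is not None


print(looks_like_tesla("S0123456789abcdefC"))
print(looks_like_tesla("LE-Luna"))
```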

Here is an example of the Logitech ID appearing as part of a "complete" (type 3) list:

Pasted Graphic 1

Note how there is another Company ID, for IBM, in the Manufacturer-specific data (type 0xff). Capturing that data will be requested in a different ticket. - #615

Here is another example of a "complete" (type 3) list (with Google appearing): Pasted Graphic 2

Note how there is another Company ID, for Google, in the Service data (type 0x16). Capturing that data will be requested in a different ticket. - #616

It is possible to have both company IDs and service IDs in the same list:

Pasted Graphic 5

NOTE: These type 2 (incomplete) and type 3 (complete) UUID16 lists can also appear in Bluetooth Classic Extended Inquiry Responses (where they are more likely to be part of an actual list).

Example 1 (incomplete list):

Pasted Graphic 8

Example 2 (complete list):

Pasted Graphic 9

I haven't been able to find where that BT Classic information would be in Android.

rksh commented 1 year ago

Initial experiments show these values appearing in less than 2% of scan results in live data. We'll continue to explore.

rksh commented 1 year ago

The constants you list were added in API v33 - Android 13.

XenoKovah commented 1 year ago

> Initial experiments show these values appearing in less than 2% of scan results in live data. We'll continue to explore.

FWIW that seems a bit low. For instance when I look in my database for BLE data I have 141,526 records for (device_bdaddr, bdaddr_random, le_evt_type, device_name) whereas I have 261,759 records for (device_bdaddr, bdaddr_random, le_evt_type, list_type, str_UUID16s).

That said, I'm parsing HCI logs with Wireshark, which erroneously cannot make a distinction between the fields it calls "btcommon.eir_ad.entry.uuid_16" in type 2 vs. 3 vs. 0x16 (and yes, it calls it EIR even for LE). Specifically, even though I may filter packets so that they must have type 2, that doesn't preclude a packet from also having type 0x16 in the same advertisement, which Wireshark will extract as another instance of btcommon.eir_ad.entry.uuid_16. So I may have a little bit of double-counting in my data. But I don't think it's enough to lead to such a major difference from a 2% observation rate. (Edit: deleted the comment about low data count for #616 once I realized I haven't actually been importing that data.)
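To illustrate the double-counting concern, here is a small Python sketch. It assumes the entries have already been extracted as (packet_id, ad_type, uuid16) tuples; that tuple layout and the sample values are hypothetical, not what Wireshark actually exports.

```python
from collections import Counter

UUID16_LIST_TYPES = {0x02, 0x03}  # incomplete and complete UUID16 lists
SERVICE_DATA_16 = 0x16            # service data, which begins with a UUID16


def count_uuid16s(entries, keep_types=UUID16_LIST_TYPES):
    """Count UUID16 observations, keeping only entries whose AD type is allowed."""
    return Counter(u for _pkt, ad_type, u in entries if ad_type in keep_types)


# Hypothetical extracted entries: packet 1 carries 0xfe2c twice, once in an
# incomplete list and once as the leading UUID16 of its service data.
entries = [
    (1, 0x02, 0xFE2C),
    (1, SERVICE_DATA_16, 0xFE2C),
    (2, 0x03, 0xFDD2),
]

naive = Counter(u for _pkt, _type, u in entries)  # ignores the AD type
strict = count_uuid16s(entries)                   # only type 2 and type 3

print(naive[0xFE2C], strict[0xFE2C])
```

Tracking the AD type per entry, rather than per packet, is what removes the extra count.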

XenoKovah commented 1 year ago

> The constants you list were added in API v33 - Android 13.

FWIW I think it's fine to hardcode those numbers, because they come from the BT Assigned Numbers document (https://btprodspecificationrefs.blob.core.windows.net/assigned-numbers/Assigned%20Number%20Types/Assigned_Numbers.pdf) section 2.3. There you can see 0x02 = "Incomplete List of 16-bit Service Class UUIDs" and 0x03 = "Complete List of 16-bit Service Class UUIDs".

And I just found those are also available in https://bitbucket.org/bluetooth-SIG/public/src/main/assigned_numbers/core/ad_types.yaml

rksh commented 1 year ago

In casual testing, the member UUIDs list appears unreliable or confusing against real-world data. On my bench, a pair of Bose NC 700 HP headphones advertises the UUID16s for "Amazon.com Services, Inc." and "Google LLC" in addition to the correct Bose mapping.

0000fdd2-0000-1000-8000-00805f9b34fb, 0000fe26-0000-1000-8000-00805f9b34fb, 0000fe03-0000-1000-8000-00805f9b34fb

(1 is "Bose" (correct), 2 is "Google", 3 is "Amazon.com")
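For anyone comparing these 128-bit strings against the UUID16 tables, the short form is the second half of the first group when the rest matches the Bluetooth Base UUID. A minimal Python sketch, with a hypothetical helper name:

```python
import uuid

# Everything after the first group of the Bluetooth Base UUID.
BASE_SUFFIX = "-0000-1000-8000-00805f9b34fb"


def to_uuid16(text: str):
    """Return the 16-bit short form, or None if this is not a base-derived UUID16."""
    canonical = str(uuid.UUID(text))  # normalizes case and formatting
    if canonical.startswith("0000") and canonical.endswith(BASE_SUFFIX):
        return int(canonical[4:8], 16)
    return None


observed = [
    "0000fdd2-0000-1000-8000-00805f9b34fb",
    "0000fe26-0000-1000-8000-00805f9b34fb",
    "0000fe03-0000-1000-8000-00805f9b34fb",
]
print([hex(to_uuid16(u)) for u in observed])
```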

XenoKovah commented 1 year ago

At first I was going to speculate that perhaps it was using Amazon Sidewalk to allow for location of lost devices, but upon looking at the sales page I see it's advertising "easy access to voice assistants like Alexa and Google Assistant", so that seems like a more likely explanation for those additional UUIDs' appearance.

The following are examples of Bose devices (based on their names and/or BDADDR OUIs) that contain only the single Bose UUID:

Bose AE2 SoundLink
Bose Color II SoundLink
Bose QuietComfort 35
Bose SoundLink Color II
Bose SoundSport 
HR-Bose SoundSport Pulse
LE-Black Diamond (possibly renamed, but there are a lot of them, so I'm not sure)
LE-Bose 2657
LE-Bose AE2 SoundLink
LE-Bose SoundSport Free
LE-Bose SoundWear 
LE-Bose Sport Earbuds
LE-Bose Sport Open Earbuds

etc

Here's some examples that have exactly the 3 you mentioned above:

LE-Bose QC35 II
LE-Bose NC 700 HP
LE-Bose 700

And here's some that have 4, adding an additional Google UUID 0xfe2c:

LE-Bose QC35 II
LE-Black Gold (possibly renamed, but there are a lot of them, so I'm not sure)
LE-Bose QuietComfort 35 Se

(Note that it appears sometimes "LE-Bose QC35 II" has the 3 and sometimes 4. This could be due to a firmware update, different features enabled, different space in advertisement or response packets, etc. IDK. And it also advertises Alexa support, whereas something like Bose Sport Open Earbuds does not.)

But to the point of this ticket, here's some examples of devices with customized names, and randomized BDADDRs, where the only thing that would tell you for sure they were Bose headphones, is the fact that they have the UUIDs (and, kinda that Bose always keeps the "LE-" prefix even when people rename apparently. But that's a very weak heuristic which I'd rather not rely on.)

LE-DatGuy 
LE-Jet
LE-Luna
LE-QuietTime
LE-The Doctor

etc. (And per the above, the presence of 1, 3, or 4 UUIDs, gives a further hint at what models the device may be.)
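As a rough illustration of that idea, the Python sketch below buckets a device by the UUID16 set it advertises, ignoring its name and address. The three UUID16 values are the ones from the bench test earlier in this thread; the bucket labels and the function name are invented, and the grouping is only a heuristic drawn from the device lists above.

```python
BOSE_UUID16 = 0xFDD2
# Google and Amazon UUID16s seen alongside the Bose one in the bench test.
ASSISTANT_UUID16S = {0xFE26, 0xFE03}


def bose_hint(uuid16s):
    """Return a coarse label for a device's UUID16 set, or None if not Bose."""
    seen = set(uuid16s)
    if BOSE_UUID16 not in seen:
        return None
    if seen == {BOSE_UUID16}:
        return "bose-only"
    if ASSISTANT_UUID16S <= seen:
        return f"bose-with-assistants ({len(seen)} UUIDs)"
    return "bose-other"


# A renamed device such as "LE-Luna" would still be labeled from its UUIDs.
print(bose_hint([0xFDD2]))
print(bose_hint([0xFDD2, 0xFE26, 0xFE03]))
```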

rksh commented 1 year ago

Is there any significance implied by the ordering of the service UUIDs?

XenoKovah commented 1 year ago

Not AFAIK currently

XenoKovah commented 1 year ago

FWIW I found a citation now that 0xfe03 is specifically Amazon Alexa: https://developer.amazon.com/en-US/docs/alexa/alexa-gadgets-toolkit/bluetooth-le-settings.html

I've started to find that the reason some companies have and use multiple 16-bit member UUIDs is to advertise company-specific services. E.g. the Google member UUID 0xfe2c is actually their "Fast Pair" service (https://developers.google.com/nearby/fastpair/specifications/characteristics).

bobzilladev commented 1 year ago

This has been a high bang for the buck change, thank you for bringing this up! Local display will be in version 2.80 of the app, and going forward the project would like to locally store and centrally aggregate this data.

XenoKovah commented 1 year ago

Is the plan that once the server-side aggregation of the information is occurring, it will be visible and/or filterable through the "basic search" interface (which is the main thing I use when making screenshots for conference talks)? Or only through the API?

Because I could imagine doing a search where I filter results based on some company name like Nest (to show the improvement compared to the one example in the current talk where I can find based on a name of "N0001" but can't tell if it's a Nest device or not.)

p.s. I also updated my app to 2.8 while I was on travel and definitely liked the fact that now I could see all the various Apple devices called out as such, so that then I could see when there were other interesting things with unknown company names (e.g. "3D Display Technologies Co., Ltd.") showing up.