mozilla / ichnaea

Mozilla Ichnaea
http://location.services.mozilla.com
Apache License 2.0

Guess missing lac/cid for neighboring cells #17

Closed hannosch closed 2 months ago

hannosch commented 11 years ago

On Android we often get "incomplete" cell records for neighboring cells. They usually have only the mcc/mnc and psc fields and lack the lac/cid. While psc isn't unique at all (there are only 512 different values worldwide), it is unique within a certain area, or rather, neighboring cells cannot have the same psc value. So based on the lat/lon and mcc/mnc/psc we should be able to identify the lac/cid, provided we have at least one full record for that cell.
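A minimal sketch of the lookup idea described above, not the code that was merged into ichnaea. The `Cell` tuple, `guess_cell`, and the `max_km` search radius are hypothetical names and values chosen for illustration. The sketch only returns a match when exactly one known cell with the same mcc/mnc/psc lies within the radius, since more than one candidate would make the guess ambiguous.

```python
import math
from typing import NamedTuple


class Cell(NamedTuple):
    """A complete cell record (hypothetical layout, for illustration only)."""
    mcc: int
    mnc: int
    lac: int
    cid: int
    psc: int
    lat: float
    lon: float


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres using the haversine formula."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def guess_cell(known_cells, mcc, mnc, psc, lat, lon, max_km=10.0):
    """Return the single known cell matching mcc/mnc/psc near lat/lon.

    Returns None when there is no candidate, or when several candidates
    exist and the incomplete record cannot be resolved unambiguously.
    """
    candidates = [
        cell for cell in known_cells
        if (cell.mcc, cell.mnc, cell.psc) == (mcc, mnc, psc)
        and distance_km(lat, lon, cell.lat, cell.lon) <= max_km
    ]
    if len(candidates) == 1:
        return candidates[0]
    return None
```

The radius is an assumption; a real implementation would have to pick it based on typical cell sizes and on how densely PSC values are reused.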

hannosch commented 10 years ago

Re-opening this one, as the code isn't actually activated yet. The code is merged but the task is deactivated in the worker.py schedule. Missing bits:

illarionov commented 10 years ago

My analysis shows that two neighboring cells may have the same PSC. For example, my phone is currently connected to CID=14515773 with PSC=31 and sees another cell with PSC=31 among its neighbours. Sometimes, at the same place, the phone reconnects to CID=14515769 with PSC=31.

Often the same PSC is assigned to all the cells of a single UMTS base station.

hannosch commented 10 years ago

That's indeed something we need to take into account. Both of these cell-ids are for the same RNC (14515773 >> 16 == 221, 14515769 >> 16 == 221) but have different node-ids (14515769 % 65536 == 32313, 14515773 % 65536 == 32317). See #143 for the background on that.
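The arithmetic above can be reproduced with a few lines of Python. This is only an illustration of the bit split, assuming a 16-bit lower part as used in the comment; `split_utran_cid` is a hypothetical helper name.

```python
def split_utran_cid(cid):
    """Split a UMTS cell id into (rnc_id, node_id).

    The upper bits hold the RNC id; the lower 16 bits hold the
    per-RNC id (cid % 65536 is the same as cid & 0xFFFF).
    """
    rnc_id = cid >> 16
    node_id = cid & 0xFFFF
    return rnc_id, node_id


for cid in (14515773, 14515769):
    print(cid, split_utran_cid(cid))
# 14515773 (221, 32317)
# 14515769 (221, 32313)
```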

hannosch commented 10 years ago

One other thing that was mentioned to me: PSC/PCI values might not be quite as stable as lac/cid assignments. We need a process to update the PSC/PCI when observations start reporting a new value.

I think we currently only assign the PSC/PCI during the initial insert into the cell table. It only changes if the cell gets blacklisted/removed from the table and later makes it back onto it.
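One possible shape for such an update process, as a sketch only: `updated_psc` and the `min_reports` threshold are assumptions for illustration and are not part of ichnaea. The idea is to switch the stored value only when a different PSC/PCI is reported often enough in a batch of observations and more often than the stored one.

```python
from collections import Counter


def updated_psc(stored_psc, observed_pscs, min_reports=3):
    """Return the PSC/PCI to store after a batch of observations.

    The stored value is replaced only if another value is the most
    frequently reported one, was seen at least `min_reports` times,
    and was seen more often than the currently stored value.
    """
    counts = Counter(psc for psc in observed_pscs if psc is not None)
    if not counts:
        return stored_psc
    psc, seen = counts.most_common(1)[0]
    if (psc != stored_psc
            and seen >= min_reports
            and seen > counts.get(stored_psc, 0)):
        return psc
    return stored_psc
```

Requiring several agreeing reports is a guard against a single bad observation flipping the stored value; the right threshold would depend on how noisy the incoming data is.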

E3V3A commented 10 years ago

This "guessing" seems like a very bad idea. The probable reason why you don't always get LAC/CID is that the device is connected with a different cell technology (RAT). The most common are GSM, CDMA, WCDMA, UMTS and LTE. All of these have different requirements regarding what to present as relevant cell info. It would be a much better idea to add a few DB tables with the RAT info, so as not to mix them up. In addition, there is no telling what will happen when people and companies start hooking up their pico-cells all over the place. Not to forget IMSI-catchers...

hannosch commented 10 years ago

@E3V3A I'll get a bit technical here, since I think you are interested in the technical underpinnings. Please excuse me if this is too detailed and I'll write a shorter answer :)

The most common cause of "incomplete" cell ids is missing software or hardware support. In particular most Android phones only support the TelephonyManager.getNeighboringCellInfo API (http://developer.android.com/reference/android/telephony/TelephonyManager.html#getNeighboringCellInfo%28%29). This is backed by the RIL_REQUEST_NEIGHBORING_CELL_IDS code, using a RIL_NeighboringCell struct from https://github.com/android/platform_hardware_ril/blob/c6bb97274337f55fec2e7f33aec7acc9de117ddf/include/telephony/ril.h#L373. In GSM networks it contains the lac/cid combination, in UMTS networks it only contains the PSC value. This API supports neither CDMA nor LTE. So for UMTS networks you can only get the PSC value for neighboring cells via this API.

In later versions of Android, a new set of APIs based on TelephonyManager.getAllCellInfo was introduced (http://developer.android.com/reference/android/telephony/TelephonyManager.html#getAllCellInfo%28%29), which is backed by RIL_REQUEST_GET_CELL_INFO_LIST. This uses a whole new set of structs for signal strength, cell identity and cell info (https://github.com/android/platform_hardware_ril/blob/c6bb97274337f55fec2e7f33aec7acc9de117ddf/include/telephony/ril.h#L678). These APIs are in principle detailed enough to forward any information the underlying cell modem exposes.

Firefox OS is based on the same underlying Android platform hardware layer. The equivalent of the getNeighboringCellInfo API was implemented in https://bugzilla.mozilla.org/show_bug.cgi?id=1010356 and should be available in FxOS 2.0. The equivalent of getAllCellInfo is still being worked on, with the work tracked in https://bugzilla.mozilla.org/show_bug.cgi?id=1032858 and its dependencies. The FxOS-based stumblers don't yet use either of these APIs, but they'd also start with the getNeighboringCellInfo-like API, as it is available in an earlier FxOS release.

The final problem is that the way in which the OS gets the neighboring cell info isn't actually standardized, as far as I know. What is standardized is the way to get the signal strength readings. This is part of 3GPP TS 27.007 (published by ETSI as TS 127 007). For the longest time there was only section 8.5 with the AT command +CSQ. In a later revision of the standard, section 8.69 with the AT command +CESQ was added to surface the more detailed and correct signal strength values for various network types.
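To make the +CSQ reading concrete, here is a small sketch of decoding a modem response line of the form `+CSQ: <rssi>,<ber>`. The function name `parse_csq` is hypothetical. The mapping assumed here is the usual one for +CSQ: rssi values 0 to 31 correspond to -113 dBm up to -51 dBm in 2 dB steps (with 0 meaning "-113 dBm or less" and 31 meaning "-51 dBm or greater"), and 99 means "not known" for both fields.

```python
import re

_CSQ_RE = re.compile(r"\s*\+CSQ:\s*(\d+)\s*,\s*(\d+)")


def parse_csq(line):
    """Decode a '+CSQ: <rssi>,<ber>' line into (dbm, ber).

    Either element is None when the modem reports 99 ("not known").
    """
    match = _CSQ_RE.match(line)
    if match is None:
        raise ValueError("not a +CSQ response: %r" % (line,))
    rssi, ber = int(match.group(1)), int(match.group(2))
    if rssi != 99 and not 0 <= rssi <= 31:
        raise ValueError("rssi out of range: %d" % rssi)
    dbm = None if rssi == 99 else -113 + 2 * rssi
    return dbm, (None if ber == 99 else ber)


print(parse_csq("+CSQ: 17,99"))
# (-79, None)
```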

But when it comes to how to actually get the neighboring cell info, there is to my knowledge only a set of non-standardized AT commands in the form of +MONI, +MONP, +SMONC and +SMOND. Whether any of these commands are available and actually used depends on the particular chipset and the RIL support for that chipset. So even if the Android OS version supports the new getAllCellInfo API, it's still a question of the chipset and RIL combination to what extent the API actually returns any values. The Android reference-ril.c conveniently implements requestGetCellInfoList by simply returning the serving GSM cell, with the actual values coming from the +CREG/+CGREG commands (https://github.com/android/platform_hardware_ril/blob/c6bb97274337f55fec2e7f33aec7acc9de117ddf/reference-ril/reference-ril.c#L1901).

All of these issues explain why on most phones out there only partial information is available at the OS/app level, while the radio modem might have more complete data. The motivation for the PSC-based guesswork is to fill in this gap and make use of more than the serving cell. With the serving cell info alone, the possible accuracy for location lookups is rather limited, as only a single point of information exists. And on the stumbling data side, most of the cell records we have are incomplete in the same manner, so extracting some value out of them would greatly increase the data set used for position and size estimation of the aggregated cells.

E3V3A commented 10 years ago

No problem, I love technical underpinnings. For the non-expert, here is another explanation:

The basic problem is that all Google services use the MCC/MNC/LAC/CID to approximate your location via the cell tower (for non-CDMA systems). This is a GSM/UMTS based concept which is different in LTE. LTE only has MCC and MNC, then it has PCI/TAC/CI which are somewhat equivalent to LAC/CID.

PCI is basically something like a subset of CID, with CI being something that is more like CID. However, CI is 28 bit but CID is 16 bit, so they can't be equivalent (can't fit). TAC and LAC are the same size, so they could be equivalent. However, Google would need to use this convention when mapping out cell network for geolocation as well.

Google "fixed" this in Android 4.2 by implementing CellIdentityLte, CellIdentityGsm, and CellIdentityCdma, so that manufacturers are no longer forced to convert LTE cell info to GSM cell info. However, the Nexus 4 doesn't implement the 4.2 cellid API (it returns null instead of one of the above). Instead, it relies on the old mechanism.

My theory on what happens on the Nexus 4 is that the proprietary Qualcomm RIL tries to return the LAC/CID for the circuit-switched fallback network (GSM or UMTS), which should be fine for geolocation, but it's not very well implemented. So on some systems it does work (Bell/Telus appear to have a higher success rate than Rogers), but on others it doesn't. It could be due to the LTE base station software/manufacturer/etc. or perhaps something internal to the Nexus 4 driver.

hannosch commented 10 years ago

Regarding the CID, for GSM it is indeed a 16bit value. For UMTS it's supposed to be the UTRAN cell id, a combination of RNC (12 or 16 bit) and cell id (16 bit), so typically 28 bit or with extended RNC 32 bit. The CI for LTE is again a 28 bit value.

PSC/PCI are not part of the unique number that identifies a cell, but are related to the scrambling/encoding of the actual signal on the physical layer. UMTS and LTE both use code-division (CDMA), so these serve a similar role to the frequency/channel for WiFi networks, which use frequency-division (FDMA). In both cases these divisions should be different for neighboring networks to avoid interference. With the limited number of available channels in WiFi networks, that isn't possible. For the much more fixed cell networks and over 500 available codes, neighboring cells from the same operator should indeed never have overlapping codes.