Closed st3fan closed 3 years ago
Looking at the device identifier caching code:
internal val deviceId by lazy {
var deviceId = preferences.getString(DEVICE_ID_KEY, null)
if (deviceId == null) {
deviceId = deviceIdGenerator.invoke()
preferences.edit().putString(DEVICE_ID_KEY, deviceId).apply()
}
deviceId
}
I think there is very little that can go wrong here. Before this change, the DEVICE_ID_KEY
preference did not exist. So the first thing that happens when a user installs this update is that we generate a new device id that then gets stored in the preferences.
It is looked up in start()
where a simple call to Leanplum.setDeviceId(deviceId)
is made, which I think is guaranteed to return the properly generated/cached device identifier.
I saw a few uplift requests for the caching patch, but I confirmed that it is present on the releases/81.1.1
branch.
Regarding migration from Fennec to Fenix - I don't think it matters what happens there because the deviceId
computed property will always generate a new ID the very first time it runs. And that ID will always be passed to Leanplum.setDeviceId()
- unless LP does some kind of internal caching.
I have a log from the failing device now which shows:
DeepLinkIntentProcessor: Invalid deep link: fenix://open?url=https%3A%2F%2Fsupport.mozilla.org%2Fen- \
US%2Fkb%2Fwhats-new-firefox-android&uid=872c42ec-0000-0000-0000-32b58f36153d
Looking at the code, this only happens if the User ID
that Leanplum puts in the URL does not match the deviceID
property, which takes the value from the preferences.
Why do those not match?
@jonalmeida @boek I see that this code changed from:
private val preferences = application.getSharedPreferences(PREFERENCE_NAME, MODE_PRIVATE)
to
private val preferences = StrictMode.allowThreadDiskReads().resetPoliciesAfter {
application.getSharedPreferences(PREFERENCE_NAME, MODE_PRIVATE)
}
Could that have any effect on how these preferences work?
Why is one called User ID
and the other deviceID
- are those the same .. or?
Reading the Leanplum SDK code - I think the deviceID we generate is not always used by leanplum as the ID that we think it is. I don't really understand this code or how Leanplum works well enough, but it is clear that Device ID and User ID are two different concepts.
We've made assumption that setting the Device ID results in a User ID with the same value. But looking at this code, I can't help to think that it is more complicated than that. There are multiple paths for the logic that determines that. On top of that, in the deeplinks we ask the Leanplum Backend to substitute {{User ID}}
- so that is definitely confusing: why do we set the Device ID but then target the User ID.
@jonalmeida @boek I see that this code changed from:
private val preferences = application.getSharedPreferences(PREFERENCE_NAME, MODE_PRIVATE)
to
private val preferences = StrictMode.allowThreadDiskReads().resetPoliciesAfter { application.getSharedPreferences(PREFERENCE_NAME, MODE_PRIVATE) }
Could that have any effect on how these preferences work?
No, this wrapping block only tells StrictMode to not report this as a disk read violation on the main UI thread.
Reading the Leanplum SDK code - I think the deviceID we generate is not always used by leanplum as the ID that we think it is. I don't really understand this code or how Leanplum works well enough, but it is clear that Device ID and User ID are two different concepts.
We've made assumption that setting the Device ID results in a User ID with the same value. But looking at this code, I can't help to think that it is more complicated than that. There are multiple paths for the logic that determines that. On top of that, in the deeplinks we ask the Leanplum Backend to substitute
{{User ID}}
- so that is definitely confusing: why do we set the Device ID but then target the User ID.
I just sent myself a deeplink that looks like this:
fenix://settings?uid={{Device ID}}
And that worked. I've asked @cbonacuse to try to same to see if that makes a difference.
And that worked. I've asked @cbonacuse to try to same to see if that makes a difference.
No change. Same results: notifications appear but links do not work.
so that is definitely confusing: why do we set the Device ID but then target the User ID.
I thought this was confusing too, when adding the deep link check..., but this was a 1:1 port from Fennec:
Regardless of the outcome of this issue, we may want to collect telemetry on deeplink rejections?
One thing we are pretty certain of: this happens on migrated installs. I reproduced this state by installing Fennec 68.11, then upgrading to Fenix 81.1.1 from the Play Store.
I uplifted #15584 to Fenix 81.1.2 and I rooted my Pixel 2XL to get easier access to the preferences of Firefox.
The following was tested on a device where I first installed Fennec 68.11 and then upgraded/migrated to Fenix 81.1.2.
I registered my device in the Leanplum console as a test device, using the Push Token that is printed during app startup. According to Leanplum my device is 4318eae1-829d-4066-8b1d-0a20372d3823
. Let's consider this the source of truth because this is the deviceId
that the SDK uses to identify me and what it uses to successfully communicate.
Here we go, at app startup, the following is now printed:
Starting Leanplum with device id: dba5cb2c-96e7-4fcf-87cf-0c1dbc5c1a2d
Started Leanplum with deviceId 4318eae1-829d-4066-8b1d-0a20372d3823 and userId 4318eae1-829d-4066-8b1d-0a20372d3823
And there is the problem. We initialize Leanplum with dba5cb2c-96e7-4fcf-87cf-0c1dbc5c1a2d
- we pass that into Leanplum.setDeviceId()
but when Leanplum has started, it has ignored the device Id that we asked it to use and used a new Id. (Spoiler: At the end you can read that this id not actually new but instead migrated from Fennec).
Opening a test notification prints:
Rejecting Leanplum deep link because uid 4318eae1-829d-4066-8b1d-0a20372d3823 does not
match dba5cb2c-96e7-4fcf-87cf-0c1dbc5c1a2d
Now, this code works as expected on new installs. So we must hit some weird corner case where Leanplum is picking up the old config files from Fennec and ignoring the deviceId
that we tell it to use.
Now that I can root shell into my device, it is easy to pick up /data/data/org.mozilla.firefox/shared_prefs/org.mozilla.gecko.BrowserApp.xml
which has the following:
<?xml version='1.0' encoding='utf-8' standalone='yes' ?>
<map>
<boolean name="android.not_a_preference.klar.package.installed" value="false" />
<string name="android.not_a_preference.leanplum.device_id">4318eae1-829d-4066-8b1d-0a20372d3823</string>
<boolean name="android.not_a_preference.focus.package.installed" value="false" />
</map>
We inherited the deviceId
that was previously used in Fennec. And it did not pick up the new identifier that Fenix generated.
So what went wrong here?
I thouight we hit some kind of bug in the Leanplum SDK where it does not properly pick up a custom deviceId
.. because:
Their documentation at https://docs.leanplum.com/reference#device-ids clearly says:
You can choose how the device ID will be set the first time start is called on that device by calling one of these before
start
:Leanplum.setDeviceId("customAndUniqueId")
Sets the device ID to a custom ID. Make sure that your custom ID is unique per device.
But then it also says this:
The
deviceId
is set whenLeanplum.start
runs for the first time on that device. After this, it cannot be changed unless the user completely uninstalls and reinstalls your app.
And I think that is exactly what we are running into now. You cannot change the deviceId
once it has been set, and on migrated builds the Leanplum SDK picks up the deviceId
that was previously configured in Fennec, and will not allow us to override that.
A fix should be simple: If org.mozilla.gecko.BrowserApp.xml
exists then pick up the android.not_a_preference.leanplum.device_id
value and use that.
And I think that is exactly what we are running into now. You cannot change the deviceId once it has been set, and on migrated builds the Leanplum SDK picks up the deviceId that was previously configured in Fennec, and will not allow us to override that.
A fix should be simple: If org.mozilla.gecko.BrowserApp.xml exists then pick up the android.not_a_preference.leanplum.device_id value and use that.
Ouch. Some time ago we tried to fix an issue in Fenix where we were generating a new ID on every startup. The good news is that Leanplum may have never used the new generated IDs. The bad news may be that our fix now generates and saves one that may not actually be used.
Two questions:
Why do we generate an ID in the first place? Why do we not use whatever Leanplum generates?
Because we don't like the Leanplum way of generating persistent identifiers. They do two things: their preferred method is to use the device MAC address. If that fails they fall back to Secure.ANDROID_ID
. Both are persistent - your MAC doesn't change (unless modern Android randomizes it, but I don't think it does) and Secure.ANDROID_ID
is based on app signature + google user id + device properties.
So both are persistent, which makes sense for a tool like Leanplum, except that we prefer to use something that randomizes after each install.
@st3fan is this something that we can QA now?
See: https://github.com/mozilla-mobile/fenix/issues/17373 This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I am debugging a case where Leanplum is:
But what fails is opening deeplinks, specifically the following two:
fenix://open?url=https%3A%2F%2Fsupport.mozilla.org%2Fen-US%2Fkb%2Fwhats-new-firefox-android%2526utm_source%3Dleanplum-push-notification%2526utm_campaign%3D09.28.20-EN-fenix81-qa&uid={{User
fenix://settings?uid={{User ID}}
Note that these work properly for me (@st3fan) but not for @cbonacuse
In the Leanplum console I see that both our test devices are up to date (Firefox 81.1.1) and both have a push token assigned.
Since this works well for me, end-to-end, that gives me confidence that the
LeanplumMetricsService
andfenix://
intent handling works as expected. However, we did make two changes relatively recently related to theuid={{User ID}}
part of those URLs that I am trying to rule out:uid
parameter in the deeplink URL, which was fixed with https://github.com/mozilla-mobile/fenix/commit/77263aad704844782a461b49a92b51a8bed31c16It is important to rule these out because if push notification messages appear but the links are rejected that will most likely mean that the check for the device identifier fails.
This issue is to re-read / re-review those two patches and think about edge cases that could cause the caching or the check to fail. Keeping in mind the following scenarios:
cc to @boek and @pocmo
┆Issue is synchronized with this Jira Task