Open shankari opened 1 month ago
Here's what I see in the logs:
Previous trip ended at 8:21
797,1727018467.76015,2024-09-22T08:21:07.760150-07:00,"In TripDiaryStateMachine, received transition T_DATA_PUSHED in state STATE_WAITING_FOR_TRIP_START"
New trip started at 13:50, including a VISIT_ENDED
833,1727038244.95052,2024-09-22T13:50:44.950520-07:00,"In TripDiaryStateMachine, received transition T_EXITED_GEOFENCE in state STATE_WAITING_FOR_TRIP_START"
836,1727038244.95445,2024-09-22T13:50:44.954450-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_WAITING_FOR_TRIP_START"
844,1727038244.96454,2024-09-22T13:50:44.964540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_ONGOING_TRIP"
847,1727038244.96749,2024-09-22T13:50:44.967490-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
856,1727038245.60322,2024-09-22T13:50:45.603220-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1132,1727038252.92264,2024-09-22T13:50:52.922640-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1162,1727038267.53891,2024-09-22T13:51:07.538910-07:00,"In TripDiaryStateMachine, received transition T_VISIT_ENDED in state STATE_ONGOING_TRIP"
Received some more (null) transitions and restarts
1272,1727038591.74358,2024-09-22T13:56:31.743580-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1431,1727038599.1122,2024-09-22T13:56:39.112200-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1620,1727038892.42846,2024-09-22T14:01:32.428460-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1898,1727038899.855,2024-09-22T14:01:39.855000-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1973,1727039193.41537,2024-09-22T14:06:33.415370-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
2251,1727039200.84957,2024-09-22T14:06:40.849570-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
Until we received a VISIT_STARTED and ended the trip
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
2323,1727039501.05468,2024-09-22T14:11:41.054680-07:00,"In TripDiaryStateMachine, received transition T_TRIP_END_DETECTED in state STATE_ONGOING_TRIP"
2346,1727039502.19317,2024-09-22T14:11:42.193170-07:00,"In TripDiaryStateMachine, received transition T_END_TRIP_TRACKING in state STATE_ONGOING_TRIP"
2348,1727039502.27354,2024-09-22T14:11:42.273540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_ENDED in state STATE_ONGOING_TRIP"
The next exit was a substantial time away so is not part of this fragmented trip.
8401,1727049760.00312,2024-09-22T17:02:40.003120-07:00,"In TripDiaryStateMachine, received transition T_EXITED_GEOFENCE in state STATE_WAITING_FOR_TRIP_START"
There are 5 clusters of points in the map, and 5 trip segments in the logs. So the first order of business is to look at why we are ending the trips so many times.
@JGreenlee for visibility
We are ending trips because of the VISIT transitions
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
2323,1727039501.05468,2024-09-22T14:11:41.054680-07:00,"In TripDiaryStateMachine, received transition T_TRIP_END_DETECTED in state STATE_ONGOING_TRIP"
2346,1727039502.19317,2024-09-22T14:11:42.193170-07:00,"In TripDiaryStateMachine, received transition T_END_TRIP_TRACKING in state STATE_ONGOING_TRIP"
2348,1727039502.27354,2024-09-22T14:11:42.273540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_ENDED in state STATE_ONGOING_TRIP"
Looking at the logs around that
2316,1727039501.04798,2024-09-22T14:11:41.047980-07:00,"Received visit notification = <+45.51667792,-122.65693038> +/- 59.58m (2024-09-22 21:06:47 +0000 to -) (null)"
2319,1727039501.05051,2024-09-22T14:11:41.050510-07:00,"departure date is 4001-01-01 00:00:00 +0000, isDistantDate? 1, after distantDate? 0 "
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
2321,1727039501.05311,2024-09-22T14:11:41.053110-07:00,Checking invalid visit started transition...
2322,1727039501.05415,2024-09-22T14:11:41.054150-07:00,"visit started transition is not invalid, returning false"
We do check for invalid visit started transitions here: https://github.com/e-mission/e-mission-data-collection/blob/06a90be0db66402b13bd6c01e5676936857c2f04/src/ios/Location/TripDiaryStateMachine.m#L600
However, in this case, neither check was successful. Because of the restarts, there were multiple transitions. And the last two transitions were
1973,1727039193.41537,2024-09-22T14:06:33.415370-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
2251,1727039200.84957,2024-09-22T14:06:40.849570-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
So we didn't even check for the deltaTs.
And even if we had found the correct transitions and checked for the deltaTs, it would have been more than a few minutes. This really does not seem like a spurious visit transition - it was sufficiently after the trip start that it seems meaningful.
1162,1727038267.53891,2024-09-22T13:51:07.538910-07:00,"In TripDiaryStateMachine, received transition T_VISIT_ENDED in state STATE_ONGOING_TRIP"
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
The visit API is a black box, so we are not able to look into it or understand why it is generating these spurious visit transitions. All we can try to do at this point is to add additional checks that detect that the visit transition was invalid and ignore it.
Since the transition here was several minutes after the trip start, the only possible check would be that the user was still moving - i.e. that we were receiving location updates that represented movement consistently. If so, we can ignore the VISIT_STARTED notifications.
Looking at the location logs, it looks like this won't really work. The last log statement before the VISIT_STARTED
transition was from 14:07
2300,1727039233.09842,2024-09-22T14:07:13.098420-07:00,"Recieved location <+45.51682256,-122.65781566> +/- 2.53m (speed 4.99 mps / course 180.06) @ 9/22/24, 2:07:13 PM Pacific Daylight Time, ongoing trip = true"
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
Verified using uploaded data
Let's see what is happening between 14:07 and 14:11 in the logs. Nothing.
We get the last location input that we see, and then we don't see any other logs until we initialize the plugins as part of restarting the app, just before we receive the VISIT transition.
2300,1727039233.09842,2024-09-22T14:07:13.098420-07:00,"Recieved location <+45.51682256,-122.65781566> +/- 2.53m (speed 4.99 mps / course 180.06) @ 9/22/24, 2:07:13 PM Pacific Daylight Time, ongoing trip = true"
2301,1727039500.42191,2024-09-22T14:11:40.421910-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler
2302,1727039500.42901,2024-09-22T14:11:40.429010-07:00,"intro_done result = {
""intro_done"" = ""2024-08-28T09:57:43.879-07:00"";
}"
2303,1727039500.43044,2024-09-22T14:11:40.430440-07:00,"Auth found in local storage, now it should be stable"
2304,1727039500.43115,2024-09-22T14:11:40.431150-07:00,finished init of iOS native code
2311,1727039500.99902,2024-09-22T14:11:40.999020-07:00,started fine location tracking with accuracy = -2 and distanceFilter = 1
With these logs, and the iOS distance filter, it is not 100% clear whether:
Given that we see the pluginInitialize
messages, (2) seems more likely.
But it would be good to get some indication from the user whether they actually did stop moving at that time. Because, we do try to detect when the app has been terminated, and I don't see that log either. https://github.com/e-mission/e-mission-data-collection/blob/06a90be0db66402b13bd6c01e5676936857c2f04/src/ios/BEMAppDelegate.m#L103
Poking around a bit more at the termination, note (from https://developer.apple.com/documentation/corelocation/cllocationmanager/startupdatinglocation()) that
If you start this service and your app is suspended, the system stops the delivery of events until your app starts running again (either in the foreground or background). If your app is terminated, the delivery of new location events stops altogether. Therefore, if your app needs to receive location events while in the background, it must include the UIBackgroundModes key (with the location value) in its Info.plist file.
We do have the UIBackgroundModes
key set https://github.com/e-mission/e-mission-data-collection/blob/06a90be0db66402b13bd6c01e5676936857c2f04/plugin.xml#L206 so the app should be running in the background.
But while the background modes protect the app from being terminated because it went off the foreground, it clearly does not prevent it from being terminated due to resource constraints.
Looking at documentation around termination: https://developer.apple.com/documentation/xcode/reduce-terminations-in-your-app it looks like we can view termination reasons in Xcode, or use the new MetricKit (available in iOS 13+).
We now require a minimum of iOS 13+ so that is definitely an option.
Checked the organizer - it does not have enough metrics. Which seems to argue against the reason being termination if this is consistent behavior for Nathan and Joe.
Note also that, although we noticed that there was a restart around the time we had no locations, we also had multiple restarts that were not correlated with lack of locations.
Focusing on the transitions:
844,1727038244.96454,2024-09-22T13:50:44.964540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_ONGOING_TRIP"
847,1727038244.96749,2024-09-22T13:50:44.967490-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
856,1727038245.60322,2024-09-22T13:50:45.603220-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1132,1727038252.92264,2024-09-22T13:50:52.922640-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1162,1727038267.53891,2024-09-22T13:51:07.538910-07:00,"In TripDiaryStateMachine, received transition T_VISIT_ENDED in state STATE_ONGOING_TRIP"
1272,1727038591.74358,2024-09-22T13:56:31.743580-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1431,1727038599.1122,2024-09-22T13:56:39.112200-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1620,1727038892.42846,2024-09-22T14:01:32.428460-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1898,1727038899.855,2024-09-22T14:01:39.855000-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
Checking the location logs around those, we see that there seem to be pauses for ~ 5 minutes, followed by reinitializing, and then seeing locations again
1244,1727038321.09963,2024-09-22T13:52:01.099630-07:00,"Recieved location <+45.49754485,-122.60493378> +/- 2.14m (speed 14.43 mps / course 270.27) @ 9/22/24, 1:52:00 PM Pacific Daylight Time, ongoing trip = true"
1245,1727038590.47854,2024-09-22T13:56:30.478540-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler
1246,1727038590.48451,2024-09-22T13:56:30.484510-07:00,"intro_done result = {
""intro_done"" = ""2024-08-28T09:57:43.879-07:00"";
}"
1247,1727038590.48536,2024-09-22T13:56:30.485360-07:00,"Auth found in local storage, now it should be stable"
1265,1727038591.13447,2024-09-22T13:56:31.134470-07:00,"Recieved location <+45.49730524,-122.62361088> +/- 2.01m (speed 14.08 mps / course 263.37) @ 9/22/24, 1:56:30 PM Pacific Daylight Time, ongoing trip = true"
1267,1727038591.64683,2024-09-22T13:56:31.646830-07:00,"Recieved location <+45.49729159,-122.62379686> +/- 2.25m (speed 14.58 mps / course 263.05) @ 9/22/24, 1:56:31 PM Pacific Daylight Time, ongoing trip = true"
--------
1594,1727038631.09737,2024-09-22T13:57:11.097370-07:00,"Recieved location <+45.49683290,-122.63151200> +/- 2.00m (speed 14.95 mps / course 271.40) @ 9/22/24, 1:57:11 PM Pacific Daylight Time, ongoing trip = true"
1595,1727038891.24254,2024-09-22T14:01:31.242540-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler
1605,1727038891.79792,2024-09-22T14:01:31.797920-07:00,started fine location tracking with accuracy = -2 and distanceFilter = 1
1615,1727038891.83103,2024-09-22T14:01:31.831030-07:00,"Recieved location <+45.50487549,-122.64892057> +/- 2.01m (speed 10.75 mps / course 270.39) @ 9/22/24, 2:01:31 PM Pacific Daylight Time, ongoing trip = true"
1617,1727038892.34075,2024-09-22T14:01:32.340750-07:00,"Recieved location <+45.50487412,-122.64905817> +/- 2.00m (speed 10.73 mps / course 270.39) @ 9/22/24, 2:01:32 PM Pacific Daylight Time, ongoing trip = true"
Next steps:
kCLDistanceFilterNone
so that we can experiment with no distance filter to collect data even when the user is not moving, although it may result in higher battery drain
https://developer.apple.com/documentation/corelocation/kcldistancefilternone?language=objcOne other observation from the logs before I forget.
I noticed that there were several UI logs that were overlapping with the start of this trip
e.g.
833,1727038244.95052,2024-09-22T13:50:44.950520-07:00,"In TripDiaryStateMachine, received transition T_EXITED_GEOFENCE in state STATE_WAITING_FOR_TRIP_START"
836,1727038244.95445,2024-09-22T13:50:44.954450-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_WAITING_FOR_TRIP_START"
844,1727038244.96454,2024-09-22T13:50:44.964540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_ONGOING_TRIP"
847,1727038244.96749,2024-09-22T13:50:44.967490-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
but then
1372,1727038592.89993,2024-09-22T13:56:32.899930-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
1385,1727038594.04323,2024-09-22T13:56:34.043230-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
...
The UI load seems to have started at around
929,1727038246.42178,2024-09-22T13:50:46.421780-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//pipeline/get_range_ts
937,1727038246.43513,2024-09-22T13:50:46.435130-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//profile/update
948,1727038246.4863,2024-09-22T13:50:46.486300-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//profile/update
955,1727038246.96463,2024-09-22T13:50:46.964630-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
...
1007,1727038248.78387,2024-09-22T13:50:48.783870-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
1330,1727038592.50981,2024-09-22T13:56:32.509810-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//pipeline/get_range_ts
1354,1727038592.54016,2024-09-22T13:56:32.540160-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//profile/update
1367,1727038592.88539,2024-09-22T13:56:32.885390-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
...
1419,1727038594.70805,2024-09-22T13:56:34.708050-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
1679,1727038892.86086,2024-09-22T14:01:32.860860-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//pipeline/get_range_ts
and if we look at the periods where the app appeared to restart, we were still loading data when the app appeared to be terminated.
1240,1727038318.74052,2024-09-22T13:51:58.740520-07:00,"fetchNominatimLocName: found cached response for -122.60380015564107,45.503289315482654"
1241,1727038319.10062,2024-09-22T13:51:59.100620-07:00,"Recieved location <+45.49754129,-122.60456163> +/- 2.01m (speed 14.13 mps / course 270.00) @ 9/22/24, 1:51:58 PM Pacific Daylight Time, ongoing trip = true"
1242,1727038320.0978,2024-09-22T13:52:00.097800-07:00,"Recieved location <+45.49753932,-122.60475056> +/- 2.00m (speed 14.58 mps / course 270.27) @ 9/22/24, 1:51:59 PM Pacific Daylight Time, ongoing trip = true"
1243,1727038320.74276,2024-09-22T13:52:00.742760-07:00,"fetchNominatimLocName: found cached response for -122.68062146450143,45.510185801655595"
1244,1727038321.09963,2024-09-22T13:52:01.099630-07:00,"Recieved location <+45.49754485,-122.60493378> +/- 2.14m (speed 14.43 mps / course 270.27) @ 9/22/24, 1:52:00 PM Pacific Daylight Time, ongoing trip = true"
1245,1727038590.47854,2024-09-22T13:56:30.478540-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler
@JGreenlee I wonder if this is due to the fact that we load data on every resume now.
If we are getting terminated due to resource constraints, the problem may be:
I would guess that this doesn't trigger for the test phone because it is not running anything else (so the resources are not constrained) and there aren't a lot of recent trips anyway since I don't carry the test phone everywhere.
It seems like a general optimization would be to not load the diary if the app has been launched in the background since nobody is watching anyway. Need to see if the suspend/resume/launch notifications let us determine whether the launch was in the foreground or the background.
I am surprised that the webview is even able to run while the app is not in the foreground. (I am guessing this iOS-specific because I don't think it would be possible on Android.)
It seems like a general optimization would be to not load the diary if the app has been launched in the background since nobody is watching anyway.
We certainly do not want to be loading the diary when in the background – especially because when the app is resumed from the background, it will all be reloaded again anyway.
Need to see if the suspend/resume/launch notifications let us determine whether the launch was in the foreground or the background.
Yes it is possible via React Native's AppState API https://reactnative.dev/docs/0.71/appstate
Currently, we use this to determine when the app is resumed from the background (state transitions to active
).
However, we don't check AppState
on a cold launch.
I am surprised that the webview is even able to run while the app is not in the foreground. (I am guessing this iOS-specific because I don't think it would be possible on Android.)
Yes, on iOS, the webview keeps running even when the app is in the background. For the trip-end notifications, this was actually helpful, here it is annoying 😄
wrt (1) from https://github.com/e-mission/e-mission-docs/issues/1095#issuecomment-2435841733
PSU also reported this trip (to a concert at around 5pm, returning at around 9:15pm). This was a freeway car trip, but still has clusters of location points.
This seems to argue for the app termination as the cause of the issue. Let's see if this is borne out by the logs.
Two main takeaways from that deep dive:
17:37:19
and 18:20:00
without a VISIT transitionTesting out the change, I see
23258,1729999919.70029,2024-10-26T20:31:59.700290-07:00,"In TripDiaryStateMachine, received transition T_EXITED_GEOFENCE in state STATE_WAITING_FOR_TRIP_START"
23269,1729999919.9377,2024-10-26T20:31:59.937700-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_ONGOING_TRIP"
23270,1729999920.02693,2024-10-26T20:32:00.026930-07:00,Current state of region STATIONARY_GEOFENCE_LOCATION is 2 (outside)
23271,1729999920.0396,2024-10-26T20:32:00.039600-07:00,Received OUTSIDE geofence state when currState = STATE_ONGOING_TRIP
23272,1729999920.05051,2024-10-26T20:32:00.050510-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
23798,1730000557.73419,2024-10-26T20:42:37.734190-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
23801,1730000558.04736,2024-10-26T20:42:38.047360-07:00,"In TripDiaryStateMachine, received transition T_TRIP_END_DETECTED in state STATE_ONGOING_TRIP"
23814,1730000559.35952,2024-10-26T20:42:39.359520-07:00,"In TripDiaryStateMachine, received transition T_END_TRIP_TRACKING in state STATE_ONGOING_TRIP"
23817,1730000559.45782,2024-10-26T20:42:39.457820-07:00,Moved from STATE_ONGOING_TRIP to STATE_WAITING_FOR_TRIP_START
23859,1730000868.93481,2024-10-26T20:47:48.934810-07:00,Application will enter the foreground
23860,1730000869.03664,2024-10-26T20:47:49.036640-07:00,"addStatReading: adding CLIENT_TIME event: {""name"":""app_state_change"",""ts"":1730000868.919,""reading"":""active"",""client_app_version"":""1.9.4"",""client_os_version"":""15.7.5""}"
23861,1730000869.05134,2024-10-26T20:47:49.051340-07:00,"addStatReading: adding CLIENT_TIME event: {""name"":""app_state_change"",""ts"":1730000868.925,""reading"":""active"",""client_app_version"":""1.9.4"",""client_os_version"":""15.7.5""}"
egrep "2024-10-26.*(app_state_change|pluginInitialize|T_TRIP_STARTED|T_TRIP_END_DETECTED)" /tmp/loggerDB.txt.withdate.log
) across the day shows a very similar pattern Looks promising; let's push it out to production and see if our partners at PSU are able to reproduce the original issue.
Pushed it out to production after some delays. However, PSU reported that they were still seeing gaps in the reported locations. Checking the logs, we see that the app is no longer crashing, but there are arbitrary 5 minute gaps in the data collection.
In some cases, the speed is low
29578,1730822372.89939,2024-11-05T07:59:32.899390-08:00,"Recieved location <...> +/- 13.93m (speed 1.36 mps / course -1.00) @ 11/5/24, 7:59:32 AM Pacific Standard Time, ongoing trip = true"
29579,1730823787.73012,2024-11-05T08:23:07.730120-08:00,"Recieved location <...> +/- 12.41m (speed 1.36 mps / course -1.00) @ 11/5/24, 7:59:33 AM Pacific Standard Time, ongoing trip = true"
But in a lot of cases it is not. The app is clearly not restarting (no pluginInitialize
); iOS just decides to not stop giving us location points for a while.
29621,1730823829.09617,2024-11-05T08:23:49.096170-08:00,"Recieved location <...> +/- 3.67m (speed 4.46 mps / course 270.71) @ 11/5/24, 8:23:49 AM Pacific Standard Time, ongoing trip = true"
29622,1730823961.2628,2024-11-05T08:26:01.262800-08:00,"Recieved location <...> +/- 29.74m (speed -1.00 mps / course -1.00) @ 11/5/24, 8:26:01 AM Pacific Standard Time, ongoing trip = true"
29661,1730824002.09654,2024-11-05T08:26:42.096540-08:00,"Recieved location <...> +/- 6.48m (speed 4.52 mps / course 269.90) @ 11/5/24, 8:26:42 AM Pacific Standard Time, ongoing trip = true"
29665,1730824262.14861,2024-11-05T08:31:02.148610-08:00,"In TripDiaryStateMachine, received transition T_DATA_PUSHED in state STATE_ONGOING_TRIP"
29676,1730824272.09609,2024-11-05T08:31:12.096090-08:00,"Recieved location <...> +/- 5.74m (speed 4.89 mps / course 265.95) @ 11/5/24, 8:31:12 AM Pacific Standard Time, ongoing trip = true"
29677,1730824563.13458,2024-11-05T08:36:03.134580-08:00,"Recieved location <...> +/- 5.58m (speed 6.02 mps / course 268.36) @ 11/5/24, 8:36:03 AM Pacific Standard Time, ongoing trip = true"
Let us see if we see similar behavior on the test phone. It would help a lot if we are able to reproduce; we may need to look at the phone logs instead of only the app logs.
Alas, this is not reproducible on my test iphone. I plotted the trajectories from one of my trips, and we can even see the difference between two separate trips along the same major street.
Given that the OS throttling the points that we get, the most likely culprit is a different version of the OS
Checking the logs, it looks like my test phone is on 15.7.5
and the PSU phone is on 18.0.1
. I am not able to find any highlighted changes to location services between iOS 15 and iOS 18.
I think we have two potential next steps:
Just to make doubly sure, we do set everything that is needed for the location code.
We do have allowsBackgroundLocationUpdates
set
And we listen for locationManagerDidPauseLocationUpdates
and log when that happens, and I didn't see that log.
While waiting to figure out the versions of the test phones that we have, I pulled some data from a program that had predominantly iOS users. I pulled data for two days 2024-05-29
and 2024-05-31
and plotted the trajectories for a random sample of users.
The first two had perfect trajectories[1] so I was feeling fairly optimistic, and then the third one had spotty trajectories!!! Same dates, same general physical location, same app. Let's dig into the differences between these user phones.
[1] I also noticed that even for the good trajectories, the points were clustered together more closely off the freeway than on (aka slower speeds = closer points). Since iOS has a distance filter, this should not be the case, but I guess the phone was going fast enough that it overwhelmed the distance filter.
The three trajectories, plotted with style='.-'
. Note that the lines on the first two are very thick because the points are so close together. The lines on the last are thin, because they are between isolated groups of points
user_0 | user_14 | user_3 |
---|---|---|
I thought this was an almost exclusively iOS dataset, but apparently not (at least for the dates that we were testing). When plotting the boxplot of the distances (with outliers trimmed), we see that most of the phones are actually android.
Let me clean up the analysis and work with a larger range of dates so that we can get a better sense of the iOS data
Cleaning up the analysis and working with the full range of dates, we get a lot more iOS users
It seems pretty clear that the vast majority of users have good trajectories but there are a couple (at least two) that do significantly worse. I don't see any correlation with OS version at least at this point. Some of this may be due to the geofence exit delay, so let's retry after dropping the first point of every trip.
Note also that user_14
and user_3
are both android phones, so that explains this:
I also noticed that even for the good trajectories, the points were clustered together more closely off the freeway than on (aka slower speeds = closer points). Since iOS has a distance filter, this should not be the case, but I guess the phone was going fast enough that it overwhelmed the distance filter.
It took me essentially all of yesterday to strip out the first point because I (potentially foolishly) decided to not rely on the existing segmentation output and go directly to the transitions. But I finally have it working, and the results are fairly similar.
There is one phone who seems to have non-fine-grained trips, but the majority of other phones are doing fairly well. There is more variability in the time between points, as might be expected from an OS with a distance filter - the time delta will depend on the speed.
Digging a little deeper into that phone...
Digging into phone 6e
, we see that there is significant variability across trips. There are some trips for which the trajectory data is really bad, and some for which it is better. Note that the boxplots have removed outliers because otherwise we can't see anything
Description of the median inter-point distance across trips
count 59.000000
mean 56.877464
std 99.629760
min 0.000000
25% 3.946395
50% 15.598720
75% 70.115053
max 542.875925
Again, it seems like there are some trips where the median distance between points is close to what we expect, but some where it is much larger. Let's plot a few and see if any of them show the distinctive stutter of points that PSU is seeing. And then I will have to move on to other work.
Plotting some of the trajectories as basic lines, we see that some of them do have gaps but I don't see the stutter (lots of points and then a gap) that PSU is seeing. The 90th percentile median one in particular is clearly with evenly spaced points. I do think that the points are coming in at more than 1 meter, but they seem to be fairly consistent at that distance.
Trajectories of trips whose medians are at various quantiles for the overall trip medians (all distances in meters)
Median of medians (median: 14.99 m) | 75% (69.40m) | 90% (136.94m) |
---|---|---|
For the record, here's are the same metrics for the PSU phone
Trajectory | boxplot of distance without outliers |
---|---|
To summarize, working from a larger dataset of iPhones, we see that:
We can see this in the time boxplots - the median on the PSU phone is low, pulled down by the "bunches of points", while the median on the sample phone is high, representing a constant stream of widely spaced points.
PSU phone | sample phone on the 75% trip, which shows widely spaced points |
---|---|
Summary of the problem:
Given this, I suggest the following next steps:
We consider sending over a known good phone to PSU so we can verify that the issue is with the phone and not the GPS signals in Portland
For a moment we were considering sending my iPhoneSE (2nd gen) so I moved to a new opcode to verify tracking, thinking the fact I hadn't seen trips in a while was due to the pipeline being stuck, but when I got a notification yesterday that "trip tracking may not work".
I'll try uninstalling and reinstalling the app to see if that gets tracking working - it's worth noting that I tracked a hike with Strava this weekend and had no issues, though I have no idea how OpenPATH tracking compares.
I'll try uninstalling and reinstalling the app to see if that gets tracking working
For the first time in a while I have a trip record, but it is only the last 1/2 mile of my 3 mile commute home. Happy to provide logs/dump/screenshots on internal repo.
Report from PSU, who has been testing data collection
They took this trip (from roughly 205pm to 330pm)
and it was recorded as
Public issue created with permission (email dated Oct 24, with title "OpenPath questions about gaps in trips")