Open shankari opened 5 days ago
Here's what I see in the logs:
Previous trip ended at 8:21
797,1727018467.76015,2024-09-22T08:21:07.760150-07:00,"In TripDiaryStateMachine, received transition T_DATA_PUSHED in state STATE_WAITING_FOR_TRIP_START"
New trip started at 13:50, including a VISIT_ENDED
833,1727038244.95052,2024-09-22T13:50:44.950520-07:00,"In TripDiaryStateMachine, received transition T_EXITED_GEOFENCE in state STATE_WAITING_FOR_TRIP_START"
836,1727038244.95445,2024-09-22T13:50:44.954450-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_WAITING_FOR_TRIP_START"
844,1727038244.96454,2024-09-22T13:50:44.964540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_ONGOING_TRIP"
847,1727038244.96749,2024-09-22T13:50:44.967490-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
856,1727038245.60322,2024-09-22T13:50:45.603220-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1132,1727038252.92264,2024-09-22T13:50:52.922640-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1162,1727038267.53891,2024-09-22T13:51:07.538910-07:00,"In TripDiaryStateMachine, received transition T_VISIT_ENDED in state STATE_ONGOING_TRIP"
Received some more (null) transitions and restarts
1272,1727038591.74358,2024-09-22T13:56:31.743580-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1431,1727038599.1122,2024-09-22T13:56:39.112200-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1620,1727038892.42846,2024-09-22T14:01:32.428460-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1898,1727038899.855,2024-09-22T14:01:39.855000-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1973,1727039193.41537,2024-09-22T14:06:33.415370-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
2251,1727039200.84957,2024-09-22T14:06:40.849570-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
Until we received a VISIT_STARTED and ended the trip
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
2323,1727039501.05468,2024-09-22T14:11:41.054680-07:00,"In TripDiaryStateMachine, received transition T_TRIP_END_DETECTED in state STATE_ONGOING_TRIP"
2346,1727039502.19317,2024-09-22T14:11:42.193170-07:00,"In TripDiaryStateMachine, received transition T_END_TRIP_TRACKING in state STATE_ONGOING_TRIP"
2348,1727039502.27354,2024-09-22T14:11:42.273540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_ENDED in state STATE_ONGOING_TRIP"
The next exit was a substantial time away so is not part of this fragmented trip.
8401,1727049760.00312,2024-09-22T17:02:40.003120-07:00,"In TripDiaryStateMachine, received transition T_EXITED_GEOFENCE in state STATE_WAITING_FOR_TRIP_START"
There are 5 clusters of points in the map, and 5 trip segments in the logs. So the first order of business is to look at why we are ending the trips so many times.
@JGreenlee for visibility
We are ending trips because of the VISIT transitions
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
2323,1727039501.05468,2024-09-22T14:11:41.054680-07:00,"In TripDiaryStateMachine, received transition T_TRIP_END_DETECTED in state STATE_ONGOING_TRIP"
2346,1727039502.19317,2024-09-22T14:11:42.193170-07:00,"In TripDiaryStateMachine, received transition T_END_TRIP_TRACKING in state STATE_ONGOING_TRIP"
2348,1727039502.27354,2024-09-22T14:11:42.273540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_ENDED in state STATE_ONGOING_TRIP"
Looking at the logs around that
2316,1727039501.04798,2024-09-22T14:11:41.047980-07:00,"Received visit notification = <+45.51667792,-122.65693038> +/- 59.58m (2024-09-22 21:06:47 +0000 to -) (null)"
2319,1727039501.05051,2024-09-22T14:11:41.050510-07:00,"departure date is 4001-01-01 00:00:00 +0000, isDistantDate? 1, after distantDate? 0 "
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
2321,1727039501.05311,2024-09-22T14:11:41.053110-07:00,Checking invalid visit started transition...
2322,1727039501.05415,2024-09-22T14:11:41.054150-07:00,"visit started transition is not invalid, returning false"
We do check for invalid visit started transitions here: https://github.com/e-mission/e-mission-data-collection/blob/06a90be0db66402b13bd6c01e5676936857c2f04/src/ios/Location/TripDiaryStateMachine.m#L600
However, in this case, neither check was successful. Because of the restarts, there were multiple transitions. And the last two transitions were
1973,1727039193.41537,2024-09-22T14:06:33.415370-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
2251,1727039200.84957,2024-09-22T14:06:40.849570-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
So we didn't even check for the deltaTs.
And even if we had found the correct transitions and checked for the deltaTs, it would have been more than a few minutes. This really does not seem like a spurious visit transition - it was sufficiently after the trip start that it seems meaningful.
1162,1727038267.53891,2024-09-22T13:51:07.538910-07:00,"In TripDiaryStateMachine, received transition T_VISIT_ENDED in state STATE_ONGOING_TRIP"
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
The visit API is a black box, so we are not able to look into it or understand why it is generating these spurious visit transitions. All we can try to do at this point is to add additional checks that detect that the visit transition was invalid and ignore it.
Since the transition here was several minutes after the trip start, the only possible check would be that the user was still moving - i.e. that we were receiving location updates that represented movement consistently. If so, we can ignore the VISIT_STARTED notifications.
Looking at the location logs, it looks like this won't really work. The last log statement before the VISIT_STARTED
transition was from 14:07
2300,1727039233.09842,2024-09-22T14:07:13.098420-07:00,"Recieved location <+45.51682256,-122.65781566> +/- 2.53m (speed 4.99 mps / course 180.06) @ 9/22/24, 2:07:13 PM Pacific Daylight Time, ongoing trip = true"
2320,1727039501.05106,2024-09-22T14:11:41.051060-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
Verified using uploaded data
Let's see what is happening between 14:07 and 14:11 in the logs. Nothing.
We get the last location input that we see, and then we don't see any other logs until we initialize the plugins as part of restarting the app, just before we receive the VISIT transition.
2300,1727039233.09842,2024-09-22T14:07:13.098420-07:00,"Recieved location <+45.51682256,-122.65781566> +/- 2.53m (speed 4.99 mps / course 180.06) @ 9/22/24, 2:07:13 PM Pacific Daylight Time, ongoing trip = true"
2301,1727039500.42191,2024-09-22T14:11:40.421910-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler
2302,1727039500.42901,2024-09-22T14:11:40.429010-07:00,"intro_done result = {
""intro_done"" = ""2024-08-28T09:57:43.879-07:00"";
}"
2303,1727039500.43044,2024-09-22T14:11:40.430440-07:00,"Auth found in local storage, now it should be stable"
2304,1727039500.43115,2024-09-22T14:11:40.431150-07:00,finished init of iOS native code
2311,1727039500.99902,2024-09-22T14:11:40.999020-07:00,started fine location tracking with accuracy = -2 and distanceFilter = 1
With these logs, and the iOS distance filter, it is not 100% clear whether:
Given that we see the pluginInitialize
messages, (2) seems more likely.
But it would be good to get some indication from the user whether they actually did stop moving at that time. Because, we do try to detect when the app has been terminated, and I don't see that log either. https://github.com/e-mission/e-mission-data-collection/blob/06a90be0db66402b13bd6c01e5676936857c2f04/src/ios/BEMAppDelegate.m#L103
Poking around a bit more at the termination, note (from https://developer.apple.com/documentation/corelocation/cllocationmanager/startupdatinglocation()) that
If you start this service and your app is suspended, the system stops the delivery of events until your app starts running again (either in the foreground or background). If your app is terminated, the delivery of new location events stops altogether. Therefore, if your app needs to receive location events while in the background, it must include the UIBackgroundModes key (with the location value) in its Info.plist file.
We do have the UIBackgroundModes
key set https://github.com/e-mission/e-mission-data-collection/blob/06a90be0db66402b13bd6c01e5676936857c2f04/plugin.xml#L206 so the app should be running in the background.
But while the background modes protect the app from being terminated because it went off the foreground, it clearly does not prevent it from being terminated due to resource constraints.
Looking at documentation around termination: https://developer.apple.com/documentation/xcode/reduce-terminations-in-your-app it looks like we can view termination reasons in Xcode, or use the new MetricKit (available in iOS 13+).
We now require a minimum of iOS 13+ so that is definitely an option.
Checked the organizer - it does not have enough metrics. Which seems to argue against the reason being termination if this is consistent behavior for Nathan and Joe.
Note also that, although we noticed that there was a restart around the time we had no locations, we also had multiple restarts that were not correlated with lack of locations.
Focusing on the transitions:
844,1727038244.96454,2024-09-22T13:50:44.964540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_ONGOING_TRIP"
847,1727038244.96749,2024-09-22T13:50:44.967490-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
856,1727038245.60322,2024-09-22T13:50:45.603220-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1132,1727038252.92264,2024-09-22T13:50:52.922640-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1162,1727038267.53891,2024-09-22T13:51:07.538910-07:00,"In TripDiaryStateMachine, received transition T_VISIT_ENDED in state STATE_ONGOING_TRIP"
1272,1727038591.74358,2024-09-22T13:56:31.743580-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1431,1727038599.1122,2024-09-22T13:56:39.112200-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
1620,1727038892.42846,2024-09-22T14:01:32.428460-07:00,"In TripDiaryStateMachine, received transition (null) in state STATE_ONGOING_TRIP"
1898,1727038899.855,2024-09-22T14:01:39.855000-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
Checking the location logs around those, we see that there seem to be pauses for ~ 5 minutes, followed by reinitializing, and then seeing locations again
1244,1727038321.09963,2024-09-22T13:52:01.099630-07:00,"Recieved location <+45.49754485,-122.60493378> +/- 2.14m (speed 14.43 mps / course 270.27) @ 9/22/24, 1:52:00 PM Pacific Daylight Time, ongoing trip = true"
1245,1727038590.47854,2024-09-22T13:56:30.478540-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler
1246,1727038590.48451,2024-09-22T13:56:30.484510-07:00,"intro_done result = {
""intro_done"" = ""2024-08-28T09:57:43.879-07:00"";
}"
1247,1727038590.48536,2024-09-22T13:56:30.485360-07:00,"Auth found in local storage, now it should be stable"
1265,1727038591.13447,2024-09-22T13:56:31.134470-07:00,"Recieved location <+45.49730524,-122.62361088> +/- 2.01m (speed 14.08 mps / course 263.37) @ 9/22/24, 1:56:30 PM Pacific Daylight Time, ongoing trip = true"
1267,1727038591.64683,2024-09-22T13:56:31.646830-07:00,"Recieved location <+45.49729159,-122.62379686> +/- 2.25m (speed 14.58 mps / course 263.05) @ 9/22/24, 1:56:31 PM Pacific Daylight Time, ongoing trip = true"
--------
1594,1727038631.09737,2024-09-22T13:57:11.097370-07:00,"Recieved location <+45.49683290,-122.63151200> +/- 2.00m (speed 14.95 mps / course 271.40) @ 9/22/24, 1:57:11 PM Pacific Daylight Time, ongoing trip = true"
1595,1727038891.24254,2024-09-22T14:01:31.242540-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler
1605,1727038891.79792,2024-09-22T14:01:31.797920-07:00,started fine location tracking with accuracy = -2 and distanceFilter = 1
1615,1727038891.83103,2024-09-22T14:01:31.831030-07:00,"Recieved location <+45.50487549,-122.64892057> +/- 2.01m (speed 10.75 mps / course 270.39) @ 9/22/24, 2:01:31 PM Pacific Daylight Time, ongoing trip = true"
1617,1727038892.34075,2024-09-22T14:01:32.340750-07:00,"Recieved location <+45.50487412,-122.64905817> +/- 2.00m (speed 10.73 mps / course 270.39) @ 9/22/24, 2:01:32 PM Pacific Daylight Time, ongoing trip = true"
Next steps:
kCLDistanceFilterNone
so that we can experiment with no distance filter to collect data even when the user is not moving, although it may result in higher battery drain
https://developer.apple.com/documentation/corelocation/kcldistancefilternone?language=objcOne other observation from the logs before I forget.
I noticed that there were several UI logs that were overlapping with the start of this trip
e.g.
833,1727038244.95052,2024-09-22T13:50:44.950520-07:00,"In TripDiaryStateMachine, received transition T_EXITED_GEOFENCE in state STATE_WAITING_FOR_TRIP_START"
836,1727038244.95445,2024-09-22T13:50:44.954450-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_WAITING_FOR_TRIP_START"
844,1727038244.96454,2024-09-22T13:50:44.964540-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_ONGOING_TRIP"
847,1727038244.96749,2024-09-22T13:50:44.967490-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
but then
1372,1727038592.89993,2024-09-22T13:56:32.899930-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
1385,1727038594.04323,2024-09-22T13:56:34.043230-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
...
The UI load seems to have started at around
929,1727038246.42178,2024-09-22T13:50:46.421780-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//pipeline/get_range_ts
937,1727038246.43513,2024-09-22T13:50:46.435130-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//profile/update
948,1727038246.4863,2024-09-22T13:50:46.486300-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//profile/update
955,1727038246.96463,2024-09-22T13:50:46.964630-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
...
1007,1727038248.78387,2024-09-22T13:50:48.783870-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
1330,1727038592.50981,2024-09-22T13:56:32.509810-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//pipeline/get_range_ts
1354,1727038592.54016,2024-09-22T13:56:32.540160-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//profile/update
1367,1727038592.88539,2024-09-22T13:56:32.885390-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
...
1419,1727038594.70805,2024-09-22T13:56:34.708050-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//datastreams/find_entries/timestamp
1679,1727038892.86086,2024-09-22T14:01:32.860860-07:00,postToHost called with url = https://open-access-openpath.nrel.gov/api//pipeline/get_range_ts
and if we look at the periods where the app appeared to restart, we were still loading data when the app appeared to be terminated.
1240,1727038318.74052,2024-09-22T13:51:58.740520-07:00,"fetchNominatimLocName: found cached response for -122.60380015564107,45.503289315482654"
1241,1727038319.10062,2024-09-22T13:51:59.100620-07:00,"Recieved location <+45.49754129,-122.60456163> +/- 2.01m (speed 14.13 mps / course 270.00) @ 9/22/24, 1:51:58 PM Pacific Daylight Time, ongoing trip = true"
1242,1727038320.0978,2024-09-22T13:52:00.097800-07:00,"Recieved location <+45.49753932,-122.60475056> +/- 2.00m (speed 14.58 mps / course 270.27) @ 9/22/24, 1:51:59 PM Pacific Daylight Time, ongoing trip = true"
1243,1727038320.74276,2024-09-22T13:52:00.742760-07:00,"fetchNominatimLocName: found cached response for -122.68062146450143,45.510185801655595"
1244,1727038321.09963,2024-09-22T13:52:01.099630-07:00,"Recieved location <+45.49754485,-122.60493378> +/- 2.14m (speed 14.43 mps / course 270.27) @ 9/22/24, 1:52:00 PM Pacific Daylight Time, ongoing trip = true"
1245,1727038590.47854,2024-09-22T13:56:30.478540-07:00,BEMJWTAuth:pluginInitialize singleton -> initialize completion handler
@JGreenlee I wonder if this is due to the fact that we load data on every resume now.
If we are getting terminated due to resource constraints, the problem may be:
I would guess that this doesn't trigger for the test phone because it is not running anything else (so the resources are not constrained) and there aren't a lot of recent trips anyway since I don't carry the test phone everywhere.
It seems like a general optimization would be to not load the diary if the app has been launched in the background since nobody is watching anyway. Need to see if the suspend/resume/launch notifications let us determine whether the launch was in the foreground or the background.
I am surprised that the webview is even able to run while the app is not in the foreground. (I am guessing this iOS-specific because I don't think it would be possible on Android.)
It seems like a general optimization would be to not load the diary if the app has been launched in the background since nobody is watching anyway.
We certainly do not want to be loading the diary when in the background – especially because when the app is resumed from the background, it will all be reloaded again anyway.
Need to see if the suspend/resume/launch notifications let us determine whether the launch was in the foreground or the background.
Yes it is possible via React Native's AppState API https://reactnative.dev/docs/0.71/appstate
Currently, we use this to determine when the app is resumed from the background (state transitions to active
).
However, we don't check AppState
on a cold launch.
I am surprised that the webview is even able to run while the app is not in the foreground. (I am guessing this iOS-specific because I don't think it would be possible on Android.)
Yes, on iOS, the webview keeps running even when the app is in the background. For the trip-end notifications, this was actually helpful, here it is annoying 😄
wrt (1) from https://github.com/e-mission/e-mission-docs/issues/1095#issuecomment-2435841733
PSU also reported this trip (to a concert at around 5pm, returning at around 9:15pm). This was a freeway car trip, but still has clusters of location points.
This seems to argue for the app termination as the cause of the issue. Let's see if this is borne out by the logs.
Two main takeaways from that deep dive:
17:37:19
and 18:20:00
without a VISIT transitionTesting out the change, I see
23258,1729999919.70029,2024-10-26T20:31:59.700290-07:00,"In TripDiaryStateMachine, received transition T_EXITED_GEOFENCE in state STATE_WAITING_FOR_TRIP_START"
23269,1729999919.9377,2024-10-26T20:31:59.937700-07:00,"In TripDiaryStateMachine, received transition T_TRIP_STARTED in state STATE_ONGOING_TRIP"
23270,1729999920.02693,2024-10-26T20:32:00.026930-07:00,Current state of region STATIONARY_GEOFENCE_LOCATION is 2 (outside)
23271,1729999920.0396,2024-10-26T20:32:00.039600-07:00,Received OUTSIDE geofence state when currState = STATE_ONGOING_TRIP
23272,1729999920.05051,2024-10-26T20:32:00.050510-07:00,"In TripDiaryStateMachine, received transition T_TRIP_RESTARTED in state STATE_ONGOING_TRIP"
23798,1730000557.73419,2024-10-26T20:42:37.734190-07:00,"In TripDiaryStateMachine, received transition T_VISIT_STARTED in state STATE_ONGOING_TRIP"
23801,1730000558.04736,2024-10-26T20:42:38.047360-07:00,"In TripDiaryStateMachine, received transition T_TRIP_END_DETECTED in state STATE_ONGOING_TRIP"
23814,1730000559.35952,2024-10-26T20:42:39.359520-07:00,"In TripDiaryStateMachine, received transition T_END_TRIP_TRACKING in state STATE_ONGOING_TRIP"
23817,1730000559.45782,2024-10-26T20:42:39.457820-07:00,Moved from STATE_ONGOING_TRIP to STATE_WAITING_FOR_TRIP_START
23859,1730000868.93481,2024-10-26T20:47:48.934810-07:00,Application will enter the foreground
23860,1730000869.03664,2024-10-26T20:47:49.036640-07:00,"addStatReading: adding CLIENT_TIME event: {""name"":""app_state_change"",""ts"":1730000868.919,""reading"":""active"",""client_app_version"":""1.9.4"",""client_os_version"":""15.7.5""}"
23861,1730000869.05134,2024-10-26T20:47:49.051340-07:00,"addStatReading: adding CLIENT_TIME event: {""name"":""app_state_change"",""ts"":1730000868.925,""reading"":""active"",""client_app_version"":""1.9.4"",""client_os_version"":""15.7.5""}"
egrep "2024-10-26.*(app_state_change|pluginInitialize|T_TRIP_STARTED|T_TRIP_END_DETECTED)" /tmp/loggerDB.txt.withdate.log
) across the day shows a very similar pattern Looks promising; let's push it out to production and see if our partners at PSU are able to reproduce the original issue.
Report from PSU, who has been testing data collection
They took this trip (from roughly 205pm to 330pm)
and it was recorded as
Public issue created with permission (email dated Oct 24, with title "OpenPath questions about gaps in trips")