medic / cht-core

The CHT Core Framework makes it faster to build responsive, offline-first digital health apps that equip health workers to provide better care in their communities. It is a central resource of the Community Health Toolkit.
https://communityhealthtoolkit.org
GNU Affero General Public License v3.0

Add offline clock error detection to telemetry data #6300

Closed ecsalomon closed 3 years ago

ecsalomon commented 4 years ago

Is your feature request related to a problem? Please describe.

An analysis comparing phone clock errors collected via telemetry data with those that can be detected from differences between reported and replication dates (see https://github.com/medic/medic-impact/issues/185) suggests that the telemetry data underestimates the prevalence of phone clock errors. This could happen if a phone comes online and its clock updates before the CHT is opened.

Describe the solution you'd like

Similar to the issue about detecting time-traveling phones offline and prompting the user to correct the time settings (https://github.com/medic/cht-core/issues/6284), it would be useful to collect telemetry data about potential time drift using methods that do not rely on a server timestamp. For example, we could collect the average time between each reported date and the most recent sync before it.

Describe alternatives you've considered

None

Additional context

None
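
To make the "average time since most recent sync" idea concrete, here is a minimal sketch of how such a proxy could be computed on the device. The `ReportTiming` shape, the field names, and `summarizeTimeSinceSync` are all hypothetical and are not part of the CHT data model; both timestamps are assumed to come from the device clock, in milliseconds since the epoch.

```typescript
// Hypothetical input shape, for illustration only (not a CHT document schema).
interface ReportTiming {
  reportedDate: number; // device clock when the report was created (ms since epoch)
  lastSyncDate: number; // device clock at the most recent successful sync before it (ms)
}

// Hypothetical summary that could be sent as telemetry.
interface DriftProxy {
  count: number;
  meanMsSinceSync: number; // average of (reportedDate - lastSyncDate)
  negativeGaps: number;    // reports that appear to predate the sync that came before them
}

function summarizeTimeSinceSync(reports: ReportTiming[]): DriftProxy {
  let sum = 0;
  let negativeGaps = 0;
  for (const report of reports) {
    const gap = report.reportedDate - report.lastSyncDate;
    sum += gap;
    if (gap < 0) {
      // A report "created before" the previous sync suggests the clock moved backwards.
      negativeGaps++;
    }
  }
  const count = reports.length;
  return { count, meanMsSinceSync: count > 0 ? sum / count : 0, negativeGaps };
}
```

An unusually large mean, or any negative gap, would only hint at a clock problem; it would not prove one, since long offline periods also produce large gaps.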

MaxDiz commented 4 years ago

@ecsalomon this issue is part of an epic that has been scheduled in 3.12. Can you let me know what priority it is for R&L work, in the event we need to bump issues at the end of the year?

mrjones-plip commented 3 years ago

@marialma or @helizabetholsen - is this still a priority for R&L such that we should try and ensure it stays in 3.12?

FYI @michaelkohn

michaelkohn commented 3 years ago

Please allocate any time spent on this to Project | 214 Research in Clicktime.

helizabetholsen commented 3 years ago

Based on research done by the R&L team, captured in #185 and #6284, it sounds like a preferred approach would be to check for a phone's time at the point of installation before any documents have been written to ensure the time is correct.

This is still a priority, particularly as it relates to our data science and predictive analytics portfolio of work.

helizabetholsen commented 3 years ago

Could we add red flags for an incorrect time? @mrjones-plip had some excellent thoughts about this on our call, i.e. outbounds for a timestamp on docs based on the initial sync.

mrjones-plip commented 3 years ago

@helizabetholsen - I was chatting with @garethbowen on this earlier today and he reminded me this ticket is about taking existing heuristics we're already tracking and ensuring as much of that as possible is being sent back in telemetry.

Novel additions to time drift (like I mentioned on our call last week) are already being done (track a hard coded, known good, last valid time from the server) or should be added in another ticket.
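
For readers unfamiliar with the "known good, last valid time" idea, a rough sketch follows. This is not the CHT implementation; `clockIsBehindKnownGood`, `updateKnownGood`, and the tolerance parameter are hypothetical names used only to illustrate the check.

```typescript
// Assumption: knownGoodMs is a timestamp (ms since epoch) that is certainly in the past,
// e.g. hard coded at build time or saved from the last server response.
function clockIsBehindKnownGood(
  deviceNowMs: number,
  knownGoodMs: number,
  toleranceMs: number = 0,
): boolean {
  // If the device claims a time earlier than a moment we know has already passed,
  // the clock must be wrong. This works offline, but only catches clocks set too early.
  return deviceNowMs + toleranceMs < knownGoodMs;
}

// Keep the most recent trusted timestamp; never move it backwards.
function updateKnownGood(currentKnownGoodMs: number, serverTimeMs: number): number {
  return Math.max(currentKnownGoodMs, serverTimeMs);
}
```

A clock set too far in the future is not detectable this way, which is one reason a server comparison is still useful when a connection is available.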

@latin-panda - please push back on PO team if this isn't actionable enough based off a more technical take on anything else we could add to telemetry re. time drift per Erika's comments in the top body.

dianabarsan commented 3 years ago

We are adding daily telemetry reporting: https://github.com/medic/cht-core/issues/6915#issuecomment-831470418 . Would the existing telemetry record, 'client-date-offset', when recorded daily, be sufficient for this ticket?

The disadvantage of continuing to use this method is that it requires an active internet connection at device startup. But I think the "solution" for this problem is already covered by https://github.com/medic/cht-core/issues/6284 .

ecsalomon commented 3 years ago

IIRC, this issue was opened because client-date-offset catches almost none of the clock setting errors (I think there were 3 examples of a non-0 client-date-offset across 2 projects); the goal of this ticket is to identify other potential proxies for time drift and collect those as part of telemetry.

dianabarsan commented 3 years ago

How about we add a "retry" mechanism for getting the client-date-offset? One big problem with it is that we only try once, soon after the app boots. We could change this so that, if the connection on the 1st try is unsuccessful, we keep retrying, maybe with every "sync", until we get a successful request through and can register the client-date-offset value.

ecsalomon commented 3 years ago

I don't know much about how phones work, so I am not sure if this is a reasonable question, but if the phone comes online and is set to automatically sync the clock, will the retry increase the probability of catching a client-date-offset?

garethbowen commented 3 years ago

maybe with every "sync"

+1 to this. Good call.

dianabarsan commented 3 years ago

This is ready for AT on 6300-client-date-offset-on-sync.

As described above, the webapp will retry checking the local date against the server until it gets a response. 1st try will be at startup, just like before - this is also enabled to show the popup - while the next retries happen when the device syncs - these won't show the popup even if the clock is off.
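
The behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code on the branch: `createOffsetRecorder`, its callbacks, and the popup threshold are hypothetical, and the offset is assumed to be device time minus server time in milliseconds.

```typescript
// Callbacks are injected so the sketch stays independent of any real telemetry or UI service.
function createOffsetRecorder(
  record: (offsetMs: number) => void,    // hypothetical: stores client-date-offset
  showPopup: (offsetMs: number) => void, // hypothetical: warns the user about the clock
  popupThresholdMs: number,              // assumed threshold, not taken from the CHT
) {
  let recorded = false;

  // serverTimeMs is null when the request failed (e.g. the device is offline).
  // Returns true once an offset has been recorded and no further retries are needed.
  return function handleAttempt(
    serverTimeMs: number | null,
    deviceTimeMs: number,
    atStartup: boolean,
  ): boolean {
    if (recorded) {
      return true; // already have a value: nothing more to do
    }
    if (serverTimeMs === null) {
      return false; // no response: try again on the next sync
    }
    const offsetMs = deviceTimeMs - serverTimeMs;
    record(offsetMs);
    recorded = true;
    // Only the startup attempt may show the popup; retries on sync stay silent.
    if (atStartup && Math.abs(offsetMs) > popupThresholdMs) {
      showPopup(offsetMs);
    }
    return true;
  };
}
```

In this sketch the caller would invoke `handleAttempt` once at startup with `atStartup = true`, then once per sync with `atStartup = false` until it returns `true`.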

ngaruko commented 3 years ago

Tested locally, adjusting client and server times. No pop-ups on subsequent time changes or syncs. client-date-offset is recorded in telemetry.

    "client-date-offset": {
      "sum": 4147201115,
      "min": 2073600536,
      "max": 2073600579,
      "count": 2,
      "sumsqr": 8599638544128623000
    },
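
For anyone reading an aggregate like this, here is a small sketch of how it could be summarized. It assumes the values are milliseconds and that the fields follow the usual sum/min/max/count/sumsqr statistics shape (the same shape CouchDB's `_stats` reduce produces); `summarizeOffset` and `TelemetryAggregate` are hypothetical names.

```typescript
interface TelemetryAggregate {
  sum: number;
  min: number;
  max: number;
  count: number;
  sumsqr: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function summarizeOffset(aggregate: TelemetryAggregate) {
  const meanMs = aggregate.sum / aggregate.count;
  // Population variance from the sum of squares: E[x^2] - (E[x])^2.
  // Clamped at 0 because floating-point rounding can make it slightly negative.
  const variance = Math.max(0, aggregate.sumsqr / aggregate.count - meanMs * meanMs);
  return {
    meanMs,
    meanDays: meanMs / MS_PER_DAY,
    stdDevMs: Math.sqrt(variance),
  };
}

const summary = summarizeOffset({
  sum: 4147201115,
  min: 2073600536,
  max: 2073600579,
  count: 2,
  sumsqr: 8599638544128623000,
});
console.log(summary.meanDays.toFixed(2)); // about 24 days, if the unit is milliseconds
```

Note that for offsets this large, `sumsqr` exceeds what a JavaScript number can represent exactly, so the standard deviation derived from it is not reliable; `min` and `max` give a better sense of the spread here.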

dianabarsan commented 3 years ago

Merged to master.