ideafast / middleware-services

Python API containing endpoints for smartphone hub applications and transfer to data portal

Enhance logging by catching expected `ERROR`'s #70

Open davidverweij opened 3 years ago

davidverweij commented 3 years ago

When running the BTF and DRM pipelines, a few `patient_id`s recur on every run because they were used for testing. It would be beneficial to have the pipeline skip these instances, both to save time and resources and to keep them out of any aggregations, statistics and process overviews.

I'd suggest adding another .csv file to the local folder with a single column containing all known test accounts. Then, before the patient_id is resolved, we can add a check that skips the record if it belongs to one of them.

This allows us to manually add values to the .csv when we detect (primarily through logs) new test accounts.
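A minimal sketch of what that check could look like, assuming a headerless one-column .csv. The function names `load_test_accounts` and `skip_test_accounts` are hypothetical and not existing pipeline code:

```python
import csv
from pathlib import Path
from typing import Iterable, Iterator, Set


def load_test_accounts(csv_path: Path) -> Set[str]:
    """Read a one-column .csv of known test patient IDs into a set.

    Assumes no header row; blank lines and surrounding whitespace are ignored.
    """
    with csv_path.open(newline="") as handle:
        return {
            row[0].strip()
            for row in csv.reader(handle)
            if row and row[0].strip()
        }


def skip_test_accounts(
    patient_ids: Iterable[str], test_accounts: Set[str]
) -> Iterator[str]:
    """Yield only the patient IDs that are not known test accounts."""
    for patient_id in patient_ids:
        if patient_id.strip() in test_accounts:
            continue  # known test account: skip before resolving
        yield patient_id
```

Loading the file once per run into a set keeps the per-record check cheap, and the .csv itself stays editable by hand.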

NOTE: this is a low priority enhancement.

davidverweij commented 3 years ago

In addition to known test accounts, devices occasionally generate data outside of known wear periods. For example, a device might be tested for 5 minutes a few days before it is handed to a patient.

We could think about adjusting the inventory history endpoint to return a certain status when no history could be found and the time period is well before the device was ever deployed. However, this does mean we make assumptions based on inventory records. Ideally, these checks are not needed if the UCAM API is accurate and up-to-date.

I suggest we do not take action on this particular case. Instead, we can instruct future clinicians in the COS to always check out a device, even for testing, and to use test accounts (with "TEST" in the name) to do so. Then, as @jawrainey suggested on Teams, we can catch any test usage from our inventory and exclude it from our pipeline stats (or report it as a separate 'test' count).

davidverweij commented 3 years ago

After running the pipeline on DEV and LIVE and resolving the unknowns / errors, a few expected errors remain; catching these explicitly will improve logging and statistics.

This includes the above-mentioned accounts that include TEST, but also device-specific records. In particular, there can be a number of ByteFlies recordings that are less than a minute long, often in between deployments. Recordings of this length are arguably not useful anyway. I'd therefore suggest we treat recordings shorter than ~5 minutes that have a known deviceID but an unknown user as expected unknowns.
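As a rough sketch of that rule, the check below flags a record as an expected unknown. The `Recording` type, its field names and `is_expected_unknown` are assumptions made for illustration and do not reflect the pipeline's actual data model; the 5-minute threshold is the approximate value suggested above.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

# Approximate cut-off suggested in this issue; adjust as needed.
MIN_USEFUL_DURATION = timedelta(minutes=5)


@dataclass
class Recording:
    """Hypothetical record shape, for illustration only."""
    device_id: Optional[str]   # None when the device is not in the inventory
    patient_id: Optional[str]  # None when no user could be resolved
    duration: timedelta


def is_expected_unknown(rec: Recording) -> bool:
    """Return True for records we expect to fail and need not log as errors."""
    # Test accounts are assumed to carry "TEST" in their identifier.
    if rec.patient_id is not None and "TEST" in rec.patient_id.upper():
        return True
    # Known device, unknown user, and too short to be useful.
    return (
        rec.device_id is not None
        and rec.patient_id is None
        and rec.duration < MIN_USEFUL_DURATION
    )
```

Records that match could then be logged at a lower level and counted separately, so that the remaining `ERROR`s point at genuinely unexpected cases.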