Migrating trip segmentation feature to native Android project

xubowenhaoren commented 5 years ago

Hi, this is a request/suggestion for the modularization of the trip segmentation feature. E-mission is an excellent trip data collection solution by itself. However, researchers already working with other general data collection frameworks (AWARE, for instance) cannot easily integrate the e-mission functionalities. Therefore, we're requesting that the trip segmentation feature be modularized. Below we list some goals, where each goal is a "phase" that provides increasingly more of the modularization we'd like to see.

1. Make the trip plugin available as a Maven repo. Core functionalities like recording movement, generating survey notifications, and local database storage should ideally work "out of the box".
   - Relevant modules: e-mission-data-collection, e-mission-transition-notify
2. Provide a guide for creating a simple UI that lists previous trips and provides the "take survey" button for the participant to take the surveys.
   - Relevant modules: e-mission/e-mission-phone: www/templates/diary, js/diary/
3. Handle database syncing:
   - Expose the local database so e-mission data can be synced by the parent data collection framework
     - More work, but a more elegant solution. We understand that currently some final pipelines must be run on the remote server for the trip segmentation to work. How to address this is work TBD.
   - Or create a lightweight e-mission-server that supports database syncing with a simple identifier like device_id. Remove other components like connectionConfig, auth, etc.
     - Final analysis pipelines can still be run on this lightweight e-mission-server. Minimal modification required for e-mission-data-collection.
4. Expose emission's location and activity recognition providers to the parent data collection framework to save battery life.
   - Currently both emission and AWARE are making location requests at the same time but separately, and this has been shown to impact some users' battery life.
   - Relevant modules: e-mission-data-collection

shankari commented 4 years ago

@xubowenhaoren waiting for your update here! 😃

xubowenhaoren commented 4 years ago

Hello,

Sorry for the late reply. I was trying to confirm that emission-Aware is working side-by-side with emission-original.

I believe that the French team's PR is useful within a Cordova project of emission, so their work won't be applicable in our native Android project.

With that said, the manual work required to refactor the BroadcastReceivers isn't too much trouble, either. Below you can find an example of our current AndroidManifest.xml, where we renamed all emission BroadcastReceivers and services by appending a "2":

    <receiver android:enabled="true" android:name="com.aware.plugin.cmu.sup.DataCollection.BootReceiver2">
        <intent-filter>
            <action android:name="android.intent.action.BOOT_COMPLETED" />
        </intent-filter>
    </receiver>
    <receiver android:enabled="true" android:name="com.aware.plugin.cmu.sup.DataCollection.location.TripDiaryStateMachineReceiver2">
        <intent-filter>
            <action android:name="local.transition.initialize" />
            <action android:name="local.transition.exited_geofence" />
            <action android:name="local.transition.stopped_moving" />
            <action android:name="local.transition.stop_tracking" />
            <action android:name="local.transition.start_tracking" />
            <action android:name="local.transition.tracking_error" />
        </intent-filter>
    </receiver>
    <service android:enabled="true" android:exported="false" android:name="com.aware.plugin.cmu.sup.DataCollection.location.ActivityRecognitionChangeIntentService2" />
    <service android:enabled="true" android:exported="false" android:name="com.aware.plugin.cmu.sup.DataCollection.location.TripDiaryStateMachineService2" />
    <service android:enabled="true" android:exported="false" android:name="com.aware.plugin.cmu.sup.DataCollection.location.TripDiaryStateMachineServiceOngoing2" />
    <service android:enabled="true" android:exported="false" android:name="com.aware.plugin.cmu.sup.DataCollection.location.TripDiaryStateMachineForegroundService2" />
    <service android:enabled="true" android:exported="false" android:name="com.aware.plugin.cmu.sup.DataCollection.location.GeofenceExitIntentService2" />
    <service android:enabled="true" android:exported="false" android:name="com.aware.plugin.cmu.sup.DataCollection.location.LocationChangeIntentService2" />
    <service android:enabled="true" android:exported="false" android:name="com.aware.plugin.cmu.sup.DataCollection.location.actions.GeofenceLocationIntentService2" />
    <receiver android:enabled="true" android:name="com.aware.plugin.cmu.sup.TransitionNotify.TransitionNotificationReceiver2">
        <intent-filter>
            <action android:name="local.transition.initialize" />
            <action android:name="local.transition.exited_geofence" />
            <action android:name="local.transition.stopped_moving" />
            <action android:name="local.transition.stop_tracking" />
            <action android:name="local.transition.start_tracking" />
        </intent-filter>
    </receiver>

In our testing, I found that I had to completely uninstall emission-original first. So I install emission-Aware, make sure it's up and running, and then install emission-original back. You can see both apps run in parallel when I'm in the middle of a trip. The test device is a OnePlus 6 with Android 10.

xubowenhaoren commented 4 years ago

Hello,

In our latest communication, we discovered that different Google Play versions in the app.gradle can impact the internal data stored in the userCacheDB.

In Aware-emission, the implementations are

    implementation 'com.google.firebase:firebase-core:16.0.7'
    implementation 'com.google.firebase:firebase-crash:16.2.1'
    implementation 'com.google.firebase:firebase-messaging:17.3.4'
    implementation 'androidx.constraintlayout:constraintlayout:1.1.2'
    implementation "androidx.appcompat:appcompat:1.0.0"
    implementation 'androidx.legacy:legacy-support-v4:1.0.0'
    implementation "androidx.media:media:1.0.0"
    implementation "androidx.fragment:fragment:1.0.0"
    api 'com.google.android.gms:play-services-location:17.0.0'
    implementation 'com.google.code.gson:gson:2.8.5'

and one example background/motion_activity record is {"zzbhB":4,"zzbhC":40}.

In the latest opentoall branch of emission-original, the implementations are

    compile "com.android.support:support-v13:26.+"
    compile "me.leolin:ShortcutBadger:1.1.17@aar"
    compile "com.google.firebase:firebase-messaging:11.0.1"
    compile "com.android.support:support-v4:24.1.1+"
    compile "com.android.support:support-v4:+"
    compile "com.google.android.gms:play-services-auth:11.0.1"
    compile "net.openid:appauth:0.7.0"
    compile "com.auth0.android:jwtdecode:1.1.1"
    compile "com.google.code.gson:gson:+"
    compile "com.google.android.gms:play-services-location:11.0.1"

and one example background/motion_activity record is {"zzi":2,"zzs":95}.

Our questions:

1. How would emission-original parse background/motion_activity records and reconstruct trip details, given no human-readable background/motion_activity headers?
2. Would the change in Google Play versions and the resulting change in background/motion_activity headers affect the data parsing? If so, how?

shankari commented 4 years ago

> How would emission-original parse background/motion_activity records and reconstruct trip details, given no human-readable background/motion_activity headers?

Multiple points about background/motion_activity records:

- they are human readable - they are text, not binary.
- they are not used for trip reconstruction, only for section reconstruction
- we do not perform section reconstruction for draft trips, which is why they always show as UNKNOWN
- section reconstruction happens on the server as part of the pipeline. Please see the pipeline chapter of the thesis for details
- the change in Google Play versions is handled as part of the incoming formatting layer (https://github.com/e-mission/e-mission-server/blob/master/emission/net/usercache/formatters/android/motion_activity.py)

There is an open issue to convert the android motion activity entries to a standard format (https://github.com/e-mission/e-mission-data-collection/issues/80), which you are welcome to submit a fix for if you like.

However, the workaround of reformatting on the server works fine for now so it is not a high priority fix for the overall e-mission platform.
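
To make the idea of a server-side formatting layer concrete, here is a minimal Python sketch of how such a normalizer might look. It is not the code in motion_activity.py. The two key pairs are taken only from the example records quoted in this thread, and reading the first value as the activity type and the second as the confidence is an assumption. The integer-to-name table follows the constants of Google's DetectedActivity class; the function and field names are hypothetical.

```python
# Hypothetical sketch, not the actual e-mission-server formatter.
# Obfuscated key pairs seen in this thread, as (type_key, confidence_key).
# ASSUMPTION: the first key holds the activity type, the second the confidence.
KNOWN_KEY_PAIRS = [
    ("zzbhB", "zzbhC"),  # example record from play-services-location 17.0.0
    ("zzi", "zzs"),      # example record from play-services-location 11.0.1
]

# Integer constants defined by Google's DetectedActivity class.
ACTIVITY_NAMES = {
    0: "IN_VEHICLE",
    1: "ON_BICYCLE",
    2: "ON_FOOT",
    3: "STILL",
    4: "UNKNOWN",
    5: "TILTING",
    7: "WALKING",
    8: "RUNNING",
}


def normalize_motion_activity(record):
    """Map a raw record with obfuscated keys to stable field names."""
    for type_key, conf_key in KNOWN_KEY_PAIRS:
        if type_key in record and conf_key in record:
            activity_type = record[type_key]
            return {
                "type": activity_type,
                "type_name": ACTIVITY_NAMES.get(activity_type, "UNKNOWN"),
                "confidence": record[conf_key],
            }
    # Fail loudly so a new library version is noticed instead of silently dropped.
    raise ValueError(f"unrecognized motion_activity keys: {sorted(record)}")
```

Under these assumptions, both example records from this thread end up with the same `type` and `confidence` fields, and a record from an unseen library version raises an error rather than being misread.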

xubowenhaoren commented 4 years ago

Hello,

In order to simplify the data syncing with Aware servers and the UI work, we have come up with a preliminary plan to parse the userCacheDB into trip diaries. Although much of the work here is design-specific, we still want to share it and welcome comments.

Build a secondary database called trips that fits the Aware database design. Each row documents one whole trip, including:

- device_id as user id,
- timestamp as the start time,
- end_time as the end time,
- start_location as the lat & long of the origin,
- end_location as the lat & long of the destination,
- sections as the section(s) during the trip. Each section should contain individual lat & long lists as in the emission design.

This trips DB will be automatically synced and cleaned (meaning that data synced and stored from a week ago will be removed) by the Aware framework.

After the completion of each trip, the existing emission BroadcastReceivers will send a broadcast. We can use this to start a database daemon service that tails the emission userCacheDB, parses the single trip inside, and stores the parsed result as one row in the trips database. All parsed data in userCacheDB is removed immediately. This way, we avoid tailing long, continuous records in userCacheDB and the parsing mistakes that could come with that.
Now, the UI only needs to read each row in the trips DB and display the trip information accordingly. There is no need to parse userCacheDB every time.
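
The row layout proposed above could be sketched as follows. This is only an illustration of the listed fields in Python; the class names, the `to_row` helper, and the choice to serialize locations and sections as JSON text are assumptions, not part of AWARE or e-mission.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude)


@dataclass
class Section:
    # Ordered lat & long points recorded during this section.
    points: List[LatLon] = field(default_factory=list)


@dataclass
class TripRow:
    device_id: str            # used as the user id
    timestamp: int            # trip start, milliseconds since epoch
    end_time: int             # trip end, milliseconds since epoch
    start_location: LatLon    # origin
    end_location: LatLon      # destination
    sections: List[Section] = field(default_factory=list)

    def to_row(self):
        """Flatten to one database row; nested values become JSON text (assumed layout)."""
        return {
            "device_id": self.device_id,
            "timestamp": self.timestamp,
            "end_time": self.end_time,
            "start_location": json.dumps(self.start_location),
            "end_location": json.dumps(self.end_location),
            "sections": json.dumps([asdict(s) for s in self.sections]),
        }
```

A UI reading this table would only need to decode the three JSON columns of each row to draw a trip.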

shankari commented 4 years ago

So effectively, you want to re-write the trip and section segmentation algorithms in java to run on the phone? Because right now, for easy extensibility and to compare implementations to pick the best one, the algorithms are written in python and run on the server.

In particular, if you store processed data (e.g. trip outputs), then if somebody (e.g. @jf87) contributes an improved section + mode detection algorithm later, you cannot just reset the pipeline and re-run it to get the improvements, because the raw data is gone. This is why the reproducible pipeline is such a powerful concept.

xubowenhaoren commented 4 years ago

During the communication, we found that the best practice for the parsing of trips for cross-platform support (iOS, Android) is to still leave the processing on the server. So we will hold back the idea of the trips DB for the moment.

However, syncing with Aware servers requires a device_id and timestamps in milliseconds (which differs from emission). How could we make these two changes to the emission userCacheDB on the fly, so that it is compatible with Aware syncing, with minimal changes to the codebase?

shankari commented 4 years ago

You can do this in many ways:

- you can change the user cache plugin to add a new field for the device id and then push it as usual
- since aware does not read from the e-mission usercache DB, you can keep the e-mission usercacheDB unchanged, and add in the device ID to each row when reading it for the aware sync (which you have to write anyway).
- you can add an adapter on the server side which will take e-mission data, convert secs -> ms, add a device id and store into the aware database. In this case, you would send the device id as one common field across all the data points.
- you can run an e-mission server instance in parallel with aware, push the e-mission data to the e-mission server database, run the analysis scripts and then import/export the analysed data to aware. Note that in this case, the aware data is also a read-only copy of the emission outputs, so if you purge the analysis outputs in e-mission and re-run the pipeline (e.g. to pick up a better implementation), you can also delete the entries in the aware database and re-import from e-mission.
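
The adapter option (convert secs -> ms and add a device id) can be sketched in a few lines of Python. This is a hedged illustration only: the entry layout with `metadata.write_ts`, `metadata.key`, and `data` is assumed here, and the output column names and the `to_aware_row` helper are hypothetical.

```python
import json


def to_aware_row(entry, device_id):
    """Convert one e-mission style entry into an AWARE style row.

    ASSUMPTIONS: the entry stores its timestamp in seconds under
    entry["metadata"]["write_ts"], and the target row wants "device_id"
    plus "timestamp" in milliseconds.
    """
    ts_sec = entry["metadata"]["write_ts"]
    return {
        "device_id": device_id,
        "timestamp": int(round(ts_sec * 1000)),  # seconds -> milliseconds
        "key": entry["metadata"]["key"],
        "data": json.dumps(entry["data"]),       # payload kept unchanged as JSON text
    }


def convert_batch(entries, device_id):
    # The device id is one common field across all data points in the batch.
    return [to_aware_row(e, device_id) for e in entries]
```

Because the conversion is a pure function over raw entries, it can be re-run after the pipeline is reset without touching the original e-mission data.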

shankari commented 4 years ago

If you go to any server (e.g. https://e-mission.eecs.berkeley.edu/#/home) and click on Metrics at the top, you can search by time range and it should display the distance, duration, etc. in aggregate.

Note that this does not work on the main server because the pipeline is not running. Note also that this may be somewhat bitrotted because nobody has actively used it since I built it. But it is effectively the same code as the phone metrics, and I know that works because the European teams are using it. At worst, you may need to port over the bug fixes from the phone metrics screen to the server metrics screen.

xubowenhaoren commented 4 years ago

Every time I visit https://e-mission.eecs.berkeley.edu/#/home on my laptop, the page is empty. However, the web page works on my phone.

EDIT: disabling the adblocker works.

shankari commented 4 years ago

can you check for popup blockers? @asiripanich reported something similar

xubowenhaoren commented 4 years ago

Hello,

We have decided to keep the current server architecture for emission during the migration. In terms of syncing, we plan to take this approach:

> since aware does not read from the e-mission usercache DB, you can keep the e-mission usercacheDB unchanged, and add in the device ID to each row when reading it for the aware sync (which you have to write anyway).

In this case, can we use the emission-server unchanged? Which of the versions is more debugging-friendly, the docker distro or the manual install?

shankari commented 4 years ago

Just to be clear, with this syncing approach, you will not be able to pull data from the e-mission server, or will you? Is the plan that you will send the data one way to the e-mission server? If you want to pull the data, how will you authenticate to avoid hacking access?

xubowenhaoren commented 4 years ago

It appears that I have some misunderstanding about the design mod here.

We wish to make minimal changes to the emission server architecture without affecting existing functionalities. This includes two-way upload and download of the trip data.
In terms of authentication, AWARE uses the device_id for simple verification. When a phone first joins a study, it uploads its randomly generated device_id to the AWARE database. Then, any data uploaded from the phone has the device_id attached, so the AWARE database can check it against its own devices table and decide whether to keep the uploaded data. Is there a similar design in emission?

So the ultimate question is, at the very end of the study, we want the emission-Aware data to follow Aware standards in that each row of the data has a device_id and timestamp in ms. However, where this change must occur is still unclear.

shankari commented 4 years ago

so the closest authentication mechanisms to the ones you have listed are "skip" and "token_list". "skip" is probably closer than "token_list". In "skip", you specify an arbitrary string while creating the profile and that string is your auth token going forward.

Note that this is not secure if the string is not randomly generated by the user or the app using a one-time generation key.

As a concrete example, any other app running on the user's phone also has access to their device ID. So if a hacker can convince a user to install their app, they can retrieve the device id and then use it to download the user's entire travel data without asking for location permissions from the hacker's app.

If you want to use a skip-like mechanism, you can generate a one-time UUID in the app, not shared with anything else, and use that as the token. You can also, of course, map the UUID to the device id in a table on the server. But you cannot allow retrieval of data based only on the device id due to the potential security implications.
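
As a rough illustration of the skip-like scheme described here, the Python sketch below generates a one-time random UUID as the token and keeps a token-to-device-id table. In a real deployment the token would be generated once inside the app and the table would live on the server; the `TokenRegistry` class and its methods are hypothetical names, not e-mission APIs.

```python
import uuid


def generate_token():
    """One-time random token; uuid4 draws from the OS random source."""
    return str(uuid.uuid4())


class TokenRegistry:
    """Hypothetical server-side table mapping secret tokens to device ids."""

    def __init__(self):
        self._device_by_token = {}

    def register(self, token, device_id):
        self._device_by_token[token] = device_id

    def device_for(self, token):
        # Returns None for unknown tokens. A device id presented as a token
        # is unknown too, so data cannot be retrieved with the device id alone.
        return self._device_by_token.get(token)


registry = TokenRegistry()
token = generate_token()              # generated once, never shared with other apps
registry.register(token, "device-123")
```

The point of the design is that the lookup key is the secret token, while the device id is only a value stored against it for the later import into AWARE.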

xubowenhaoren commented 4 years ago

device_id is generated by the AWARE client for every fresh install.

- This means that on the same phone, uninstalling & reinstalling will result in a different device_id.
- Updates to the app won't change the device_id.

We do identification via participant id, or PID for short.

- PID is a 3-digit code we assign to each participant before they get the app.
- We associate the PID with a device_id at every fresh install.

shankari commented 4 years ago

ok so with these clarifications, it looks like using the device_id with skip authentication will work. Note, of course, that if the user switches phones or reinstalls the app, the device ID will change, but presumably you have a mechanism in place already to handle that.

In this case, it may be easiest to just push the e-mission data to the e-mission server directly using the existing sync mechanism. You can then pull the data also using the device id for authentication. When you import from the e-mission server to the aware server, you can add the device id, convert the timestamp, and make whatever other formatting changes you would like.

You still need to make the changes to display the processed data in the native UI.

Based on my discussion with Anat, you do not want to make UI changes unless necessary. If you are pulling processed data from the server, you are not making native calls from javascript, so you could theoretically use a webview to pull and display the trips. But you would have to transfer the deviceid from the native code to the webview.

shankari commented 4 years ago

> Which of the versions is more debugging-friendly, the docker distro or the manual install?

both are equally debugging friendly, but the manual install is more development friendly if you want to edit the code to add export functionality.

shankari commented 4 years ago

also, if you want to use the docker image, file an issue asking me to upload a new docker-compose with a cronjob example. I have it locally but I now only do work if requested because I am busy otherwise

xubowenhaoren commented 4 years ago

Hello,

After discussion with Anat, I've come up with 2 plans for the UI implementation:

1. Modify and embed the existing "Trip diary" UI. We keep only this and remove other UIs such as login, settings, etc.
   - Pro: This should make it easier to get development going, given that there is an official guide here and that I've had experience embedding an entire Cordova app before.
   - Con: A part of the Cordova framework is required for the embed.
2. As in the earlier communication, Shankari recommended hosting the Trip UI on a remote server and accessing it via a native WebView.

   > If you are pulling processed data from the server, you are not making native calls from javascript, so you could theoretically use a webview to pull and display the trips. But you would have to transfer the deviceid from the native code to the webview.

   - Pro: We obviously don't need to embed any Cordova framework; the dev work on the native app side will also be minimal.
   - Pro 2: We can change the look & feel on the server remotely.
   - Con: I am uncertain how many Cordova native plugins, if any, the trip UI requires. Also, there are only unofficial guides online about hosting Cordova apps as web apps.
   - Con 2: Does not read the local DB, so does not handle draft trips.

xubowenhaoren commented 4 years ago

I also have questions about handling the emission data sync.

> In this case, it may be easiest to just push the e-mission data to the e-mission server directly using the existing sync mechanism.

Since we are collecting AWARE and e-mission data separately, there's no need to write a new sync adapter under AWARE. Then, by "existing", do you mean "e-mission existing"? To be more specific, the adapters in this repo?

shankari commented 4 years ago

@xubowenhaoren before we dive too deeply into these answers, did you consider just using the cordova app with the "skip" authentication, without using AWARE, given that you are no longer partnering with the other group? Note that using the cordova app will allow your solution to work on both android and iOS.

shankari commented 4 years ago

> After discussion with Anat, I've come up with 2 plans for the UI implementation:

Both of these should work. Are you going to try them out, or do you have specific questions for me?

shankari commented 4 years ago

> Then, by "existing", do you mean "e-mission existing"?

Yes.

> Then, by "existing", do you mean "e-mission existing"? To be more specific, the adapters in this repo?

Sort of. The sync adapter is just a periodically scheduled call. It then calls CommunicationHelper for the actual REST API calls - e.g. https://github.com/e-mission/cordova-server-communication/blob/master/src/android/CommunicationHelper.java#L76

e-mission / e-mission-docs

Migrating trip segmentation feature to native Android project #410