mlcommons / mobile_app_open

Mobile App Open
https://mlcommons.org/en/groups/inference-mobile/
Apache License 2.0

Initial investigation for user results data collection #235

Closed jwookiehong closed 2 years ago

jwookiehong commented 2 years ago

The purpose of this investigation is to develop a high-level understanding of the system components and technologies (Firebase). The deliverable will be a design doc. @relja128 Please feel free to add anything.

anhappdev commented 2 years ago

Since there are not many requirements listed, I would suggest a simple approach:

  1. Store data in a text file (CSV or JSON)
  2. Upload text file to a cloud object storage service (like AWS S3 or Google Cloud Storage)
  3. Write a simple script to parse / import those text files for ad hoc analytics.

This simple approach has some advantages:

After some time, when the use case and requirements are clearer and the data structures are more stable, we can switch to a DB to store the data and integrate with other systems.
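As a rough illustration of step 3, here is a minimal Python sketch of an ad hoc analysis script. It assumes the uploaded files have already been downloaded into a local directory and that each file holds one JSON object. The field names `device` and `score` are hypothetical placeholders, not the actual result.json layout.

```python
import json
from pathlib import Path
from statistics import mean


def load_results(directory):
    """Read every *.json file in `directory` and return the parsed objects."""
    results = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            results.append(json.load(f))
    return results


def mean_score_by_device(results):
    """Group results by the (hypothetical) `device` field and average `score`."""
    scores = {}
    for r in results:
        scores.setdefault(r["device"], []).append(r["score"])
    return {device: mean(values) for device, values in scores.items()}
```

Calling `mean_score_by_device(load_results("downloads/"))` would then give one average per device, where `downloads/` stands for wherever the files were synced to.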

d-uzlov commented 2 years ago

Possible services

Firebase provides two database services: Firebase Realtime Database and Cloud Firestore. They are very similar for our use case. Both should be free for us if we only upload benchmark results as JSON.

There is also Firebase Cloud Storage in case we want to upload a massive amount of data at some point. This is basically Google Cloud Storage but with Firebase authentication and security rules. If we were to use Google Cloud Storage directly, we would need to implement our own authentication.

Flutter integration

There is a set of official Flutter plugins for Firebase which implement the whole Firebase SDK, but they don't support Windows: firebase_core, cloud_firestore, firebase_database

There are also several unofficial plugins that can run anywhere but they have limited features:

We could probably use different plugins for mobile platforms and for Windows but it would require more effort to implement, and then we would still be missing features in the Windows version of the app. I think we should stick to one of the unofficial cross-platform plugins.

Firestore was introduced relatively recently. It is better structured and it scales better (and is cheaper to scale), so I think we should use it unless we want authentication methods that are not supported by the plugin that works with Firestore.

Security

Firebase requires users to be authenticated. It's possible to use anonymous authentication linked to the current device. Support for other authentication methods varies depending on the plugin used for Flutter-Firebase integration.

I'm not sure if we need any special authentication. I think anonymous access should be fine for our app.

Server-side security rules are available on all of the listed Firebase services. We can use them to enforce the data format and restrict the amount of data uploaded.

UI changes

Currently we have a Share button that is used to transfer the current result.json file to a messenger, email or some other available text transfer method. We can add a similar Upload button. We can also add a switch in settings to upload results automatically.

We don't strictly need any other UI changes to upload results. However, it would be nice to add the ability to list the results history from the current device, obtain global statistics (or statistics for some specific cases), and search global results.

Data format

Firebase Realtime Database and Firestore are NoSQL databases, so we don't need a strict data format beforehand.

The Flutter app already creates a result.json file, which should theoretically match the format of result.json from the old Android app. We should upload this JSON along with device info and app info. Maybe we should modify result.json to include more info for statistics.

We could use Firebase Cloud Storage if we need to upload large files related to submissions.

The exact data schema will depend on the statistics that we decide to generate.

Statistics

I'm not sure exactly which statistics would be considered useful. Here are a few ideas I find interesting:

  1. List all results for a certain app_version+backend+tasks_list combination.
  2. List all results for a certain device+app_version+backend+tasks_list combination.
  3. Show average results for a certain device+app_version+backend+tasks_list combination.
  4. List top N results.

Generating statistics

We can use Firebase Functions to automatically update statistics.

Conclusion

We need to decide if we need some special authentication, or if anonymous or basic email+password auth is enough for us. We also need to decide which statistics we want to gather.

freedomtan commented 2 years ago

Please check whether we can use the IMEI or another unique ID on Android for authentication. The problem is that the IMEI only exists on cell phones :-(

freedomtan commented 2 years ago

@anhappdev or @d-uzlov please help check whether the Firebase C++ SDK provides authentication support. Firebase is not a hard requirement.

freedomtan commented 2 years ago

AWS could also be an option. @anhappdev please provide some more information.

jwookiehong commented 2 years ago

relja128 commented 2 years ago

1) We should use Firebase as a backend. We already use it elsewhere at MLC, and it works fine.

2) We also need to have a frontend service that will actually write into Firebase vs. having the app do it directly. The path should be app -> frontend -> Firebase. The frontend is where our auth & security is going to live.

3) Lastly, consider writing the actual web site in Flutter. If that works fine, we can get away with having a single code base for our web site and our app, which would be a clear win.

On Tue, Feb 1, 2022 at 8:06 AM jwookiehong wrote:

  • HW: device model, model ID, SoC version, maybe RAM
  • SW: OS, MLPerf version, commit ID, backend (SNPE, ENN, OpenVINO), backend version
  • Results info: the current data as is, plus which backend was used and, once issue #203 (https://github.com/mlcommons/mobile_app_open/issues/203) is resolved, which accelerators were used. No need for logs for now.

anhappdev commented 2 years ago

I would add more things to consider:

  1. The result to be uploaded should not be manually modifiable.
  2. If the app writes directly to DB / storage, it should append only and not update or delete. If we use a middle layer like point 2 in Relja's comment, then it would be easy to handle this.
  3. NoSQL may make it easy to add or update a data field, but doing that often makes it difficult to analyze the data later (missing values, inconsistent field names, etc.). So we should spend time defining a stable DB schema at the beginning.

d-uzlov commented 2 years ago

2) We also need to have a frontend service that will actually write into Firebase

3) Lastly, consider writing the actual web site in Flutter

@relja128 Could you explain what you expect from the website/web service?

Which statistics do we want to gather, maintain and show? Do we want to have some complex authentication there, or should it be automatic?

Do you want to use some other benchmark website(s) as a reference for general functionality?

relja128 commented 2 years ago

1) The web service will ingest data from the client (app), do some sanity checks (data validity check, spam check), and then write the data into Firestore. So think of it as a REST front end with some basic logic.
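As an illustration of what such sanity checks could look like, here is a minimal Python sketch of the validation step only, without any HTTP or Firestore code. The size limit and the field names (`device_model`, `app_version`, `tests`, `score`) are assumptions made for this example, not an agreed format.

```python
import json

MAX_PAYLOAD_BYTES = 64 * 1024  # assumed limit, a crude guard against spam
REQUIRED_FIELDS = ("device_model", "app_version", "tests")  # hypothetical names


def check_upload(raw_body):
    """Return (document, errors) for a raw request body given as bytes."""
    if len(raw_body) > MAX_PAYLOAD_BYTES:
        return None, ["payload too large"]
    try:
        doc = json.loads(raw_body)
    except ValueError:
        return None, ["body is not valid JSON"]
    if not isinstance(doc, dict):
        return None, ["top-level value must be an object"]
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in doc]
    tests = doc.get("tests", [])
    if not isinstance(tests, list):
        errors.append("tests must be a list")
    else:
        for i, test in enumerate(tests):
            if not isinstance(test, dict) or not isinstance(test.get("score"), (int, float)):
                errors.append(f"tests[{i}] needs a numeric score")
    return (doc if not errors else None), errors
```

The front end would write the document to the database only when the returned error list is empty, and reject the request otherwise.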

2) For now we should be storing basic data: device model, manufacturer, timestamp, app version, OS version. Then, for all tests that were run: name of the test, score, accuracy (if available).

You do not need to normalize the data. Just create a single Firestore record ('document') and stick all of this in there. We can later slice & dice the data as needed.

Keep in mind that this is all going to a nosql database, so we don't need to declare a table structure or enumerate all the various data types right now. We can add them in later.
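A single denormalized record along these lines could be sketched in Python as a plain dictionary. The key names are hypothetical and only mirror the fields listed in point 2. Nothing here talks to Firestore, it only shows the shape of one document.

```python
from datetime import datetime, timezone


def build_document(device_model, manufacturer, app_version, os_version, tests, timestamp=None):
    """Return one flat, denormalized record describing a whole benchmark run."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    return {
        "device_model": device_model,
        "manufacturer": manufacturer,
        "timestamp": timestamp.isoformat(),
        "app_version": app_version,
        "os_version": os_version,
        # One entry per test; accuracy stays None when it is not available.
        "tests": [
            {"name": t["name"], "score": t["score"], "accuracy": t.get("accuracy")}
            for t in tests
        ],
    }
```

Because everything lives in one record, new keys can be added later without migrating existing documents, at the cost of older documents simply not having them.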

3) No auth needed just yet, since this is all internal. We will eventually have to figure out how to do auth, but I wouldn't try to solve it yet.

4) For the result browser side, for now build a minimal implementation (query & result display). We will over time hire a UX person to help us with this, but this side of the project for now should be minimal functionality just to show that it works.

d-uzlov commented 2 years ago

Implementation details: The database is Firebase Cloud Firestore. The REST API will be deployed in Firebase Cloud Functions. Cloud Functions have official integration with Firestore and with Firebase Authentication. We can use the firebase_dart plugin to add Firebase auth to our app.

We can use Flutter to create a website, but Flutter can only produce a web application (it doesn't fully support native web features like browsing history, text selection, etc.). I think we should still use Flutter for a crude prototype, to create it faster, but we will likely need to switch to something else if we decide to invest in website functionality.

freedomtan commented 2 years ago

Order: SoC vendors, tech press, then general audience.

d-uzlov commented 2 years ago

I think further investigation and improvements should be tracked in other issues. The final decisions from this issue were replicated in the related issues #251, #252 and #253.