corona-warn-app / cwa-wishlist

Central repository to collect community feature requests and improvements. The CWA development ends on May 31, 2023. You still can warn other users until April 30, 2023. More information:
https://coronawarn.app/en/faq/#ramp_down
Apache License 2.0

Feature Request and Implementation Concept: report/map risky contacts - show where and when risky contacts happened #288

Open HEAK20 opened 3 years ago

HEAK20 commented 3 years ago

Preface(s)

  1. I would like to mention that I was not able to find anything similar here so far.
  2. This is not a pure feature request but rather a feature request combined with a technical implementation proposal.
  3. I am a newbie on GitHub, so be gracious, please :-) It would be great if you could kindly evaluate my proposal.

User Story

As a CITIZEN (USER, SCIENTIST, OFFICIAL, ...) I would like to know WHERE and WHEN users of the Corona-Warn-App had risk contacts (in their various RISK STATES). I would like to SEE the LOCATION ON A MAP. The mechanism should be VOLUNTARY and ANONYMOUS, and might be implemented on a CENTRAL SERVER.

My Concept

I propose a concept for reporting contacts (including their risk indication) anonymously to a central server, which can then provide maps showing "where" and "when" risky contacts happened.

Implementing this would make it possible to answer the question "when and where do the risky contacts happen?". The proposed concept is attached (unfortunately written in German).

Konzept zu Ort- und Zeit.pdf

Main Idea (behind the concept)

If both partners in a contact (i.e. the two devices exchanging their IDs) can determine which part of the full information (location and time of the interaction) each of them should store, the full information can be divided into parts that carry no real information on their own. Each partner stores its incomplete part of the location and time, together with a unique HASH generated from the IDs of both partners.

When a risk evaluation produces a match, each partner sends the HASH and its incomplete information (on location and time) to a central server. Using the HASH, the two incomplete pieces of information can be combined into a full record of location and time.
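
A minimal sketch of this idea in Python. The thread does not say how the data would be divided, so XOR secret splitting is used here as a stand-in; the function names, the record format and the example IDs are all assumptions made for illustration. The sketch also computes both shares in one place, whereas the proposal would need the two devices to arrive at their parts without either of them holding the full record.

```python
import hashlib
import secrets
from typing import Tuple


def pair_hash(id_a: bytes, id_b: bytes) -> str:
    """Hash both IDs in sorted order so A and B derive the same value."""
    first, second = sorted((id_a, id_b))
    return hashlib.sha256(first + b"|" + second).hexdigest()


def split_record(record: bytes) -> Tuple[bytes, bytes]:
    """Split a record into two shares; each share alone looks random."""
    pad = secrets.token_bytes(len(record))
    masked = bytes(r ^ p for r, p in zip(record, pad))
    return pad, masked


def combine(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine two shares, as the server would for a matching hash."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))


# Hypothetical record format: "lat,lon|timestamp"
record = b"52.5200,13.4050|2020-11-22T20:00"
share_a, share_b = split_record(record)
key = pair_hash(b"id-of-device-A", b"id-of-device-B")
```

Both devices can compute `key` independently because the IDs are sorted before hashing, and `combine(share_a, share_b)` returns the original record only when both shares are available.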


Internal Tracking ID: EXPOSUREAPP-4656

jucktnich commented 3 years ago

The problem is, if the app uses the ENF (Exposure Notification Framework), it can't access the location. There are two options:

Ein-Tim commented 3 years ago

See also @thomasaugsten's comment here:

There are no plans on Google's and Apple's side to provide the location or the exact time. In the future we will integrate a list of days and the number of exposures on these days.


Related:

MikeMcC399 commented 3 years ago

The RKI dashboard

https://corona.rki.de/

gives a map showing the incidence which you can equate to risk.

As @jucktnich said, Exposure Notification does not record location.


The privacy notice of the Corona-Warn-App says:

"5. What data is processed? The app’s entire system has been programmed to process as little personal data as possible. This means that the system does not collect any data that would allow the RKI or other users to infer your identity, your name, your location or other personal details."


Google COVID-19 Exposure Notifications Service Additional Terms says:

c. Permissions. i. Your App may not request the Location, Bluetooth_Admin, Special Access, Privileged, or Signature permissions, or collect any device information to identify or track the precise location of end users.


So the contract between end user and RKI, and between RKI and Google would be violated if any location information were to be collected.

HEAK20 commented 3 years ago

Well, valid point: a central assumption is that the app knows the location and time information ... and if this assumption is not true, the concept does not work. This is something I was not aware of (stupid me).

The two related feature requests #206 and #266 do not give any thought to anonymity and privacy. The proposal presented here would offer an option to provide the location and time of risky contacts for a special dashboard (for the sake of the community, not for the sake of individual users).

Voluntary use

I do see the option of using the location information on a voluntary basis to allow analysis of how risk contacts are distributed. If users are asked to allow access to location data for this specific purpose, it should work.

A (somewhat sophisticated) comment on the privacy notice of the Corona-Warn-App

According to the proposed concept, the app would use the location only to extract and store partial information, which will never be sufficient to reveal the location (the information stored locally is not useful on its own). The full information (on location and time) is available solely on the server, and only once a contact has occurred and both devices have sent their information to the server.

A comment on Google COVID-19 Exposure Notifications Service Additional Terms

Well, according to the proposal, the app will request the location. However, not to identify or track the precise location of end users, but rather to determine the location and time of an (anonymous) contact (in case this contact happened).

Thank you!

Thank you for your comments.

It would be really great to see a feature providing this service some time in the future.

Ein-Tim commented 3 years ago

@HEAK20

Well, according to the proposal, the app will request the location. However, not to identify or track the precise location of end users, but rather to determine the location and time of an (anonymous) contact (in case this contact happened).

Yeah, but to determine the location of an encounter the app would have to track the location of the user constantly, or am I getting something wrong here?

HEAK20 commented 3 years ago

@Ein-Tim: Thank you for your comment.

Maybe I was not precise enough. Not constantly: whenever IDs are exchanged, the location and time information is stored, distributed across both parties. Yes, location has to be tracked, but not constantly - only when "seeing" another app user.

If the functionality has to be allowed by the user, the mechanism could be blocked if one party does not agree.

When the contact (the ID exchange) is later identified as a "risky contact", the distributed information is sent to a server and combined into a full dataset, so that the location and time of the risk contact can be reported on a central map (for all reported contacts in total, not on a per-user basis).

To avoid personal tracking, the data collected by the server would not be displayed until a specific number of events has accumulated. No single risk contact is reported on its own; instead, only those areas on the map are shown where at least 5 (or 10) contacts have been reported.
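
The display threshold described above can be sketched as follows. This is only an illustration: the grid cell size of 0.01 degrees, the function names and the input format (a list of latitude/longitude pairs) are assumptions, not part of the proposal.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]


def to_cell(lat: float, lon: float, cell_deg: float = 0.01) -> Cell:
    """Map a coordinate to a coarse grid cell (integer cell indices)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def publishable_cells(
    contacts: Iterable[Tuple[float, float]],
    min_count: int = 5,
    cell_deg: float = 0.01,
) -> Dict[Cell, int]:
    """Count contacts per cell and keep only cells at or above the threshold."""
    counts = Counter(to_cell(lat, lon, cell_deg) for lat, lon in contacts)
    return {cell: n for cell, n in counts.items() if n >= min_count}


# Five reported contacts fall into one cell, two into another.
reports = [(52.5251 + i * 0.0001, 13.4051 + i * 0.0001) for i in range(5)]
reports += [(48.1375, 11.5755), (48.1376, 11.5756)]
visible = publishable_cells(reports, min_count=5)
```

With `min_count=5`, only the cell containing the five contacts ends up in `visible`; the cell with two contacts is withheld until more reports arrive.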

jucktnich commented 3 years ago

If location should only be tracked when IDs are exchanged, this must be integrated at OS level.

HEAK20 commented 3 years ago

@jucktnich: Thank you again for the clarification. This is the final argument against (my) idea to collect location information only when IDs are exchanged.

@Ein-Tim: You are right. If the OS does not provide the location, the functionality has to be implemented in the app. And I am afraid the app will not be notified when IDs are exchanged at OS level (am I right?).

In this case I understand that my idea cannot be implemented without modifying the OS function. (sniff)

Ein-Tim commented 3 years ago

@HEAK20

And I am afraid the app will not be notified when IDs are exchanged at OS level (am I right?).

Yes, this is right. But please leave this issue open, since this is a good idea for an implementation which could maybe follow later... (although this does not seem very likely).

HEAK20 commented 3 years ago

@Ein-Tim: When comparing the request #266 with my request, the proposal presented here seems to be of higher value (I am afraid).

I guess a request to modify an OS function is not easy to push through. If you go this way (i.e. for #266), I would recommend checking whether my proposal should be included as well, either as a replacement or as an alternative option.

However, I will leave the issue open. Thank you all for your contribution - it was a pleasure!

Cheers, Helmut

Ein-Tim commented 3 years ago

Thank you for your contributions!

Just one last thing: the ENF would have to be updated not only for the proposal in #266 but also for yours, right? CWA never gets in touch with RPIs, so even if Apple/Google allowed CWA to request the location, this would not help with your proposal...

HEAK20 commented 3 years ago

@Ein-Tim: sorry, I am lost - too many abbreviations. I am happy to take the next step but do not know what is required, now. What does ENF, CWA or RPI mean?

To be open, I felt it was worth sharing my idea, but I am not familiar with the next step. Please advise.

Thank you!

Ein-Tim commented 3 years ago

@HEAK20

No problem, just to clarify the abbreviations:

ENF - Exposure Notification Framework, this is the API used by the Corona-Warn App. The API is programmed and maintained by Google/Apple. (Under Google it's named: ENS - Exposure Notification System)

CWA - Corona-Warn App

RPI - Rolling Proximity Identifier, this is what your phone sends and records. (for more information see https://www.coronawarn.app/en/ and scroll down to the section "How the App works").

To be open, I felt it was worth sharing my idea, but I am not familiar with the next step. Please advise.

It's definitely worth it! The next step will be that this is mirrored to the internal (not open to public) JIRA, but this will be done by one of the nice community managers here. You don't have to do anything more, the community managers will let you know what the status of your request is.

Thanks again for your contribution!

Stay safe and healthy!

heinezen commented 3 years ago

Hey @HEAK20 ,

Thank you for your suggestions. Since your proposal is currently not compatible with the architecture of the Google/Apple ENF and would require fundamental changes to the app, I think it would be good to add more beneficial use cases to your concept motivation. This will help us and the devs to better evaluate the benefits of this approach, especially in comparison to other proposals that work with the current design of the CWA.

Additionally, regarding your concept PDF there are some arguments that could be explained further:

Thank you, CH


Corona-Warn-App Open Source Team

HEAK20 commented 3 years ago

Hey @heinezen, thank you for evaluating my proposal. Here are my answers to the three questions raised.

How does halving the location/time data guarantee anonymity?

Well, anonymity is required in two totally different aspects.

  1. The first aspect is (obviously) that anyone with access to the data on the central server infrastructure shall not be able to identify anyone sending data to the central server.
  2. The second aspect (of more interest for the use case in focus) is that a user shall not learn who the infected person he was in contact with is.

Although a user has access to the data stored locally on his mobile device, he shall not be able to identify the person he was in contact with. Thus, the user shall not know the location and time of a contact; otherwise he would be able to work out who that person was. A user could easily identify a person he was in contact with at a specific location (e.g. "Sport Bar") at a specific time (8 pm on 22 Nov), as he might have met someone (e.g. a friend) there.

Hiding the full set of information on location and time prevents the user from identifying the specific persons he met. If that person (e.g. the friend I met at 8 pm in the "Sport Bar") gives me a call after testing Covid-positive, this is fine, as he does so voluntarily.

How do you ensure that both A and B are authorized to send valid data to the central server (and no one else can send arbitrary location data)?

Valid point, I have not given this any thought so far. If the (partial) data stored on both devices is sent to the central server together with a hash generated from the IDs of both parties, a single user cannot send arbitrary data: a single partial dataset cannot be combined with a second partial dataset unless that second dataset is sent with the same hash. Of course, someone who wants to generate false data could generate a hash and send two partial datasets to produce a false record. This could be prevented by cryptographic measures: for example, if a secret is part of the hash, any data generated without a valid secret can be identified as invalid.

Another option would be to send an identifier to authenticate the communication partner. However, as I do not know the transmission protocol, I do not know whether this is possible (authentication might be required to transmit the IDs securely).
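
One way to read "a secret is part of the hash" is a keyed hash (HMAC) over each upload. The sketch below assumes a secret shared between legitimate app installs and the server; how such a secret would be distributed and protected is not addressed in the thread, and a secret shipped inside every app could in practice be extracted. The function names and the upload layout are hypothetical.

```python
import hashlib
import hmac


def sign_upload(secret: bytes, contact_hash: str, share: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the contact hash and the partial data."""
    message = contact_hash.encode() + b"|" + share
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_upload(secret: bytes, contact_hash: str, share: bytes, tag: str) -> bool:
    """Server side: accept the upload only if the tag matches."""
    expected = sign_upload(secret, contact_hash, share)
    return hmac.compare_digest(expected, tag)


secret = b"hypothetical-shared-secret"
tag = sign_upload(secret, "abc123", b"lat=52.52;date=2020-11-22")
accepted = verify_upload(secret, "abc123", b"lat=52.52;date=2020-11-22", tag)
forged = verify_upload(b"wrong-secret", "abc123", b"lat=52.52;date=2020-11-22", tag)
```

An upload made without the valid secret, or with altered data, produces a different tag and is rejected.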

What exactly is used as the ID (A-ID and B-ID)? The Temporary Exposure Keys?

Yes, I would guess the Exposure Keys can be used, as they are exchanged. The only thing required is to distinguish them as an A-ID and a B-ID, so that each device knows which part of the data to store locally.

Thank you again for the kind review. I hope my answers are fine - otherwise please come back again.

Cheers, Helmut

HEAK20 commented 3 years ago

Sorry, I closed this item by accident ...

heinezen commented 3 years ago

Hey @HEAK20 ,

Sorry for not coming back to this sooner. I think a big problem with your approach is that introducing identifiers and shared secrets into the workflow of the app would make encounters between two devices identifiable on the server side. Keep in mind that part of the reason why people use the app is that even the CWA dev team and the RKI cannot know (by design) if two people had an encounter. Identifiers would not be anonymous even if they are only stored for a short time.

Maybe you can think of a better exchange protocol that ensures both users are authorized while keeping the encounter anonymous. I have to be a bit pedantic here because this topic touches the core principles of the app.

Best Regards, CH


Corona-Warn-App Open Source Team

Ein-Tim commented 3 years ago

@heinezen The current process is that only infected people upload something to the server, every other user only downloads data.

If we want to keep this, it could work like this (and I think this is more or less the proposal made by @HEAK20):

Step 1: User A encounters user B. In addition to the data that is already exchanged at the moment, the location of the meeting is fetched and split so that one part is saved by the device of user A and the other part by the device of user B. The same is done with the time of the encounter.

Step 2: User B is infected and uploads his diagnosis keys (DKs). In addition to the data that is already uploaded now, his parts of the encounter time and encounter location are uploaded. The server only receives the half/incomplete location & half/incomplete time.

Step 3: User A downloads the data from the server. The app/ENF determines which of the locally stored location & time parts match the ones downloaded from the server and completes them. This only happens locally on the device, not on the server.
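
The three steps can be sketched as a purely local join. The sketch assumes that each stored part is keyed by some per-encounter identifier that both devices can derive (`enc-1`, `enc-2` below are placeholders), and that the data is split by simple halving (latitude and date on one device, longitude and time on the other). Neither assumption comes from the ENF, which does not expose such identifiers to the app.

```python
from typing import Dict, Tuple

Part = Tuple[str, str]  # (location part, time part)


def complete_locally(
    local_parts: Dict[str, Part],
    downloaded_parts: Dict[str, Part],
) -> Dict[str, Tuple[Part, Part]]:
    """Pair each local part with the downloaded part for the same encounter."""
    matches = local_parts.keys() & downloaded_parts.keys()
    return {eid: (local_parts[eid], downloaded_parts[eid]) for eid in matches}


# Step 1: device A stored its halves for two encounters.
device_a = {
    "enc-1": ("lat=52.52", "date=2020-11-22"),
    "enc-2": ("lat=48.14", "date=2020-11-23"),
}
# Step 2: infected user B uploaded his halves; the server only relays them.
server_feed = {"enc-1": ("lon=13.40", "time=20:00")}
# Step 3: device A completes the matching encounter on the device itself.
completed = complete_locally(device_a, server_feed)
```

Only `enc-1` is completed; the part for `enc-2` stays incomplete on device A because nothing matching was downloaded.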

HEAK20 commented 3 years ago

@Ein-Tim: Yes you are right. You summed up the basic idea.

@heinezen: Additional secrets and additional cryptography will only be required to prevent fraud by sending arbitrary location data. I have not given it too much thought. As I understand it, the proposed function would have to be implemented in the ENF. I have some hope that the ENF secures the communication between devices :-)

Central Server part

With respect to the three steps listed above by @Ein-Tim, I would like to recommend a fourth step: the information on contact location & time shall be reported to a central server (e.g. the one the CWA is connected to), as this would allow analysis of hot spot locations and events that could be shared with the public.

This can be done anonymously: both users transmit a hash (that is unique for their contact and can be generated out of the IDs and/or location and time) and the fetched location & time data to the central server. Using the hash, the server is able to reconstruct location & time.
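
A rough sketch of the server side of this fourth step, under the assumption that each device uploads its fields as a small dictionary together with the contact hash. The class name, the field names and the in-memory storage are illustrative only.

```python
from typing import Dict, Optional

REQUIRED_FIELDS = {"lat", "lon", "date", "time"}


class ContactServer:
    """Holds partial uploads until both halves for a hash have arrived."""

    def __init__(self) -> None:
        self._pending: Dict[str, Dict[str, str]] = {}

    def submit(self, contact_hash: str, part: Dict[str, str]) -> Optional[Dict[str, str]]:
        """Merge a partial upload; return the full record once it is complete."""
        merged = {**self._pending.get(contact_hash, {}), **part}
        if REQUIRED_FIELDS <= merged.keys():
            self._pending.pop(contact_hash, None)
            return merged
        self._pending[contact_hash] = merged
        return None


server = ContactServer()
first = server.submit("abc123", {"lat": "52.52", "date": "2020-11-22"})
second = server.submit("abc123", {"lon": "13.40", "time": "20:00"})
```

The first upload returns `None` because the record is still incomplete; the second upload with the same hash returns the combined location and time. Uploads under a different hash are never merged with it.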

Once again, thank you all for kindly evaluating my proposal.