orcfax / ITN-Phase-1

Issues repository for Phase One of the Orcfax ITN

Requirements: Personally identifying material in records #1

Open Christian-MK opened 3 weeks ago

Christian-MK commented 3 weeks ago

Describe the issue

Given that the information captured within a validation record is to be permanently stored on Arweave, it seems prudent that we review and decide how much (or how little) personally identifiable information (PII) is included.

As of now, I have identified the following as PII within our records:

Most importantly, within our Validation record

lines 10-19

  "contributor": {
    "@type": "Organization",
    "name": "AS31334 Vodafone Deutschland GmbH",
    "locationCreated": {
      "address": {
        "@type": "PostalAddress",
        "addressLocality": "Leipzig",
        "addressRegion": "Saxony, DE,",
        "geo": "x,y"
      },

Which is derived from the Orcfax Collector JSON Format

lines 383-394

        "identity": {
            "node_id": "e1551f35-3791-4849-b229-4381ef334230",
            "location": {
                "ip": "37.4.251.133",
                "city": "Leipzig",
                "region": "Saxony",
                "country": "DE",
                "loc": "x,y",
                "org": "AS3209 Vodafone GmbH",
                "postal": "04103",
                "timezone": "Europe/Berlin",
                "readme": "https://ipinfo.io/"

What component or features of Orcfax are affected?


Is there any further information that needs to be considered?

From Ross:

"the location is not so accurate as to be able to identify an individual's location: https://www.abstractapi.com/guides/ip-geolocation/how-accurate-is-ip-geolocation (for instance, my geo-cords are 2 miles away) -- you can try for yourself with gofer too with the Orcfax output if you want to see it there."

"As to why it's in the code -- we've done it since the beginning, but I think I was probably just answering, "what info might be in a collector record"? - and so that's why it ended up there.

The right question for now I guess is, do you want to use any of the information, or shall we just remove it entirely? or do you want to answer it in the ITN? (these nodes are only our VMs and my info is purely incidental)"

This will become increasingly relevant as we onboard validators who may at any point decide to cease participation, and who may seek to exercise their right to be forgotten under GDPR or similar legislation.


Christian-MK commented 2 weeks ago

Why not to have information of this kind in the record:

Why to have information of this kind in the record:

ross-spencer commented 2 weeks ago

If data points aren't truly identifying (e.g. "geo"), then why have them?

I definitely think there are better ways of demonstrating accountability. A stretch goal, I believe, was to have one human per validator, so some form of SSI? (Again, this is just my recollection of conversations.) If we drop the coordinates from the API response and go with country/city, then there is at least some proof of decentralization (and metrics) that can provide a neat visualization of our network. (I don't often go with "it's neat" for technical solutions, so I might get some heat for that :sweat_smile:)
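
As a rough illustration of the "country/city only" option, here is a minimal Python sketch. It assumes the collector `identity` block has the shape shown in the issue description; `ALLOWED_LOCATION_FIELDS` and `redact_identity` are hypothetical names, not part of the Orcfax codebase. It keeps `node_id` untouched, since whether that counts as identifying is a separate question.

```python
import copy

# Hypothetical allowlist: only coarse location fields survive redaction.
ALLOWED_LOCATION_FIELDS = ("city", "country")


def redact_identity(identity: dict) -> dict:
    """Return a copy of an identity block with only coarse location fields kept."""
    redacted = copy.deepcopy(identity)
    location = redacted.get("location", {})
    # Drop everything not explicitly allowed (ip, loc, postal, org, ...).
    redacted["location"] = {
        key: value
        for key, value in location.items()
        if key in ALLOWED_LOCATION_FIELDS
    }
    return redacted


# Example input copied from the collector record excerpt in this issue.
identity = {
    "node_id": "e1551f35-3791-4849-b229-4381ef334230",
    "location": {
        "ip": "37.4.251.133",
        "city": "Leipzig",
        "region": "Saxony",
        "country": "DE",
        "loc": "x,y",
        "org": "AS3209 Vodafone GmbH",
        "postal": "04103",
        "timezone": "Europe/Berlin",
        "readme": "https://ipinfo.io/",
    },
}

redacted_identity = redact_identity(identity)
print(redacted_identity)
```

An allowlist rather than a blocklist means any new field the geolocation lookup starts returning is dropped by default, which seems the safer behaviour for data that ends up in permanent storage.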

Beyond this, for aspects of GDPR (and perhaps our TOS too), can we consider what it would take to get dedicated legal advice on these questions? Otherwise we risk the expense of working around issues that aren't actually problems, because none of us is qualified to interpret the legislation correctly.

If for v1 we keep this information because it doesn't incur the cost of removing it, then it is only identifying some DigitalOcean server (but it does at least inform our federated network users that it's not all running out of a basement in Vancouver).

For the ITN it can all be removed and a task created to consult with the participants about what they're comfortable with. It can all be added in incrementally.

Christian-MK commented 2 weeks ago

It is quite neat!

If for v1 we keep this information because it doesn't incur the cost of removing it,

Yes, I believe it's understood, after your response in Discord, that this does not need to be a priority right now and can be readdressed prior to ITN launch. And your instinct re professional advice is probably right on.