vircadia / vircadia-native-core

Vircadia open source agent-based metaverse ecosystem.
https://vircadia.com/

GDPR / Data compliance? #1372

Open samuk opened 2 years ago

samuk commented 2 years ago

Is the design of Vircadia inherently incompatible with the GDPR?

"Vircadia is a fully decentralized metaverse solution, comprised of multiple components. It allows users to send information (such as audio, video, text, images, 3D models, and so on) to other users and the servers on which they reside and communicate with.

However, there are some services on Vircadia’s installs that do connect to servers hosted by the Vircadia open source collaborators and its core team. This is because defaults are necessary for a proper first-time experience."

https://vircadia.com/termsofservice/

Do you have anyone working on data protection stuff? It would be a shame if it couldn't be used by any organisation in the EU

digisomni commented 2 years ago

It's hard to fully understand the extent of GDPR requirements in this context. However there are multiple options available.

  1. It's completely open source (Apache 2.0) so rolling your own client, server, and metaverse solution is entirely possible.
  2. The architecture works in a way that does not require connection right off the bat, and for connections that do happen, they do not require accounts, so it's an anonymous connection purely for getting the application bootstrapped.
  3. Furthermore, it is not necessary to have connections to anything but the server you are trying to visit.

The reason for that disclaimer is that we try to take advantage of whatever systems are available. For example, UDP hole punching, metaverse accounts for security and authentication, etc.

samuk commented 2 years ago

Thanks, rolling my own metaverse is an option, but seems like a lot of work. I'd really like to start with a SAAS option if that's possible.

It seems that if Vircadia entered into legally binding agreements with the contributors providing those hosting resources then it might be possible to have GDPR compliance, so long as there were processes for users to request/delete their data?

digisomni commented 2 years ago

Running your own metaverse takes a little work but it's not the worst thing on earth if you're tech-savvy, afaik. :) SaaS is something we're working on making a reality. The first thing we're working on is easy deployments from AWS, Digital Ocean, and Linode for the Domain server.

daleglass commented 2 years ago

Well, what do you mean by "their data"? I don't think we can even identify such a thing for the most part.

Content providing servers have no idea who they're serving the data for. They know the IP address of course, but that's it. So what would a user request from the host? You couldn't request the data related to "samuk", because no such thing exists. The server might have logged that it served wall.jpg to address 1.2.3.4 at 10 AM, and that's about it. Given that the shortage of IPv4 address space puts lots of people on dynamic addresses, that likely doesn't even point to anybody in particular.

samuk commented 2 years ago

@daleglass thanks that's clearer.

I'm a complete newbie to this..

So the domain server is probably the only one that needs to ensure compliance?

Currently, the default domain servers don't AFAIK? Are you interested in working on this? I'd be up for helping out to the extent that I'm able. (I'm a data protection officer for a UK NGO with global reach).

Is Vircadia an organisation legally at this point? I couldn't identify a legal entity?

daleglass commented 2 years ago

Yes, the domain server is the most likely one to collect something identifiable.

By default you can connect anonymously, without registering an account anywhere or even naming yourself.

And sure, maybe we can meet some time and talk about it? We can meet in-world, or we have a Discord at https://discord.com/invite/Pvx2vke

It's not a proper organization, no. We're just a bunch of people from different countries.

samuk commented 2 years ago

Thanks, up for a chat in-world sometime. GDPR kind of assumes a bunch of legal entities that can enter into binding agreements. I'll have a think about it in the absence of a legal entity.

Have you thought about registering as an NGO somewhere?

daleglass commented 2 years ago

It's been discussed a bit, but so far there's been little progress on that front. I imagine it will have to happen eventually.

It's probably going to be a bit tricky with us not even being on the same continents. If you've got any advice to give on this subject, it'd be most appreciated.

We've got a community meeting tomorrow, it's a good time to meet a whole bunch of people, see https://vircadia.com/events/

Or I'll be doing some coding and testing and should be mostly hanging around the Hub today.

samuk commented 2 years ago

Thanks will try and get in later today.

Until Brexit I'd have suggested a UK Community Interest Company (CIC): it's low overhead, cheap (£35), can have overseas directors, and gives you NGO status for most grant funding, etc. It may still be worth investigating, but our data protection laws may diverge from the EU's over time, which might complicate GDPR compliance.

It's worth being in the EU to access EU funding, I reckon, e.g. the €250m fund: https://www.ngi.eu/about/ It looks like Ireland has CICs too, but from memory their residency conditions are tighter.

For example, you could apply to this to do work on data portability between domain servers, perhaps integrating something like Solid Pods into your stack.

digisomni commented 2 years ago

Mr. Blue had a suggestion: a good step for handling data in a way that makes sense for the end-user would be to have a function that gathers and wipes identifying data on a particular user if requested.
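A rough sketch of what such a "gather and wipe" function could look like. Everything here is an assumption for illustration: `UserDataStore`, its `records` layout, and the example user are hypothetical and do not reflect how Vircadia actually stores data.

```python
from dataclasses import dataclass, field


@dataclass
class UserDataStore:
    """Hypothetical in-memory store; not Vircadia's actual data model."""

    # Maps a user identifier to named categories of personal data.
    records: dict[str, dict[str, list[str]]] = field(default_factory=dict)

    def gather(self, user_id: str) -> dict[str, list[str]]:
        """Return a copy of everything held about user_id (a data export)."""
        held = self.records.get(user_id, {})
        return {category: list(items) for category, items in held.items()}

    def wipe(self, user_id: str) -> dict[str, list[str]]:
        """Export the user's data, then delete it; returns what was removed."""
        exported = self.gather(user_id)
        self.records.pop(user_id, None)
        return exported


store = UserDataStore()
store.records["alice"] = {"email": ["alice@example.org"], "chat": ["hello"]}
removed = store.wipe("alice")
```

Returning the exported data from `wipe` means one call can serve both an access request and an erasure request, though a real implementation would also need to cover logs and backups.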

Penguin-Guru commented 2 years ago

And I suggested that there might be implications to the publicly sourced content used by entities, apps, and such. For example, if User A sets up a video entity for User B to watch, the nearby User C could have unintended access to it unless the area is secured. If either user's traffic is not properly secured, MITM vulnerability could be a violation. I believe compliance in some industries, like medical, is generally considered to require full end-to-end encryption. While I don't believe Vircadia would be liable for breaches in those respects, my understanding is that it would not be considered GDPR compliant for those applications.

Strangely, it seems they may also require service providers to have a key to decrypt the data on request. I am not sure how that is being reconciled in those industries because these requests do not seem common so far. Not something I have really looked into.

daleglass commented 2 years ago

If you have a specialized environment with very specific security and reporting requirements my suggestions would be one of:

  1. Do everything in-house. Host on your own servers, both domains and content. Have your own controlled clients. We have no central grid so this is easy.
  2. A weaker version of the above perhaps. Host your own domains, and lock them down. Have their access go through proxy/firewall that would ensure that for instance scripts running on the domain don't interact with anything they shouldn't.

The first would be my bet, but the second might be viable for somebody who wants to run a public domain, but needs to enforce a specific policy within it.
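For the second option, the check a proxy might apply could look roughly like the sketch below. It assumes a simple host allowlist; `ALLOWED_HOSTS` and `is_allowed` are hypothetical names, and this is not an existing Vircadia feature.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: hosts the organisation itself controls.
ALLOWED_HOSTS = {"domain.example.org", "assets.example.org"}


def is_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # hostname is lowercased by urlsplit and is None when no host is present.
    return parts.hostname in allowed_hosts
```

A proxy using a check like this would pass `https://assets.example.org/wall.jpg` and reject anything hosted elsewhere.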

Penguin-Guru commented 2 years ago

Yes, those cases would probably be fine as long as they aren't using any sensitive content in entities or apps that use an insecure connection to the server. It's just important to be clear about whether the software is compliant, otherwise the project could get in trouble for misleading people in those industries. This is the sort of use case where organisations should probably hire a consultant from the community to ensure everything is set up properly, or I suppose we could have specific instructions on how to stay compliant if we want to claim that we are.

samuk commented 2 years ago

"a good step for handling data in a way that makes sense for the end-user would be to have a function that gathers and wipes identifying data on a particular user if requested."

Yes, that would go some way towards GDPR compliance. Full compliance would mean a policy and set of processes something like;

I think if you want Vircadia SAAS to be used by organisations in Europe then you'll probably have to register as an NGO somewhere. It's hard to see how you can act as a data controller or processor without being an organisation. Even if you could find a way to 'work around' it as individuals, you'd then be leaving yourselves open to personal liability for fines. In theory these can be up to 20m Euro, which is probably not desirable.

If you just want to offer downloadable software and no cloud services of any kind in Europe then it's probably possible to do that without registering an organisation, but I'd still pay attention to things like your Discourse instance and anywhere else you collect an email address.

A good first step would be a data audit to work out what data you currently hold and where alongside considering a legal structure of some kind.

This guide might help if you do consider a euro NGO https://www.a4id.org/wp-content/uploads/2017/02/EU-registration-options-for-UK-NGOs-post-Brexit-FINAL-PDF-1.pdf

A Dutch foundation looks quite promising at first glance.

As an aside have you considered setting up an https://opencollective.com/ page to collect donations? It might generate enough income to pay NGO registration costs for example.

daleglass commented 2 years ago

An organization probably needs to happen at some point, but it sounds tricky. Of the team, only I'm in the EU, and I'll need to do a lot of research on this sort of thing. Like what kind of organization we need, how it should be set up, whether it's possible or makes sense to register one in a country I don't live in, how it would relate to any US-based organization that might also come into existence...

digisomni commented 2 years ago

When serving customers and users alike for mass adoption, the intent is to build companies that serve those needs. That will be their responsibility to execute on. It's not within the scope of Vircadia currently to tackle this sort of thing. Vircadia is primarily focused on developing the core software and building a developer community.

samuk commented 2 years ago

Ok fair enough. It's probably still necessary to take design and coding decisions that enable GDPR compatibility when hosted on an entirely isolated server making no connection to any Vircadia or community resources.

If there are no mechanisms for user export or deletion then it's hard to see how the software can be compatible, even when completely self-hosted.

digisomni commented 2 years ago

Yeppers, we'll have to get those things implemented. Hopefully we'll get it done sooner rather than later; if not, more support on the enterprise front would fund the efforts needed to make these changes (and others) that make it more suitable for widespread adoption.

Edit: clarification

samuk commented 2 years ago

"As an aside have you considered setting up an https://opencollective.com/ page to collect donations?"

digisomni commented 2 years ago

Yes. We're not doing that yet but it's a potential option for the future.

samuk commented 2 years ago

https://docs.vircadia.com/host/add-content/export-content.html#id3

"By default, Vircadia creates regular content archives of all active domains on the metaverse."

How would I change that default? Or is it not possible to register at the metaverse and remain compliant?

SilverfishVR commented 2 years ago

I would agree that the wording of that does make it sound like Vircadia (as a legal entity) is retrieving and storing the content of your domains on Vircadia-owned servers. I am fairly confident that is not the case, though the phrase "all active domains on the metaverse" is confusing in that sense.

The domain server (local or cloud hosted) by default makes regular rolling backups that you, as domain server admin/owner, can configure in terms of frequency and number of backups stored, so you have full control over how long information is stored. The default server settings are like this: [Screenshot 2021-10-08 220617]

The domain server admin/owner can, of course, choose to download and permanently store a backup, violating the limited-time storage requirement, but even then it will only have GDPR implications if your domain allows permanent user edits. And since regular, time-limited backups are basically required to provide the service, you only need to properly inform the user of this to be compliant.
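To illustrate the rolling, time-limited idea, here is a minimal sketch of a retention rule that keeps only recent backups up to a fixed count. `prune_backups` and its parameters are hypothetical and are not the domain server's actual settings or code.

```python
from datetime import datetime, timedelta


def prune_backups(timestamps: list[datetime], max_count: int,
                  max_age: timedelta, now: datetime) -> list[datetime]:
    """Return the backups to keep: newest first, within age and count limits."""
    # Drop anything older than the retention window.
    recent = [t for t in timestamps if now - t <= max_age]
    # Keep only the newest max_count of what remains.
    recent.sort(reverse=True)
    return recent[:max_count]


now = datetime(2021, 10, 8, 22, 0)
backups = [now - timedelta(hours=h) for h in (1, 5, 30, 80)]
kept = prune_backups(backups, max_count=5, max_age=timedelta(days=2), now=now)
```

With these example values the 80-hour-old backup falls outside the two-day window, so nothing older than the configured limit is retained.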

On a personal note I think it is important to distinguish between what falls under the responsibility of the metaverse server owner and what applies to the domain server owner. If a government entity or similar wants to use the platform I would assume that they would want to have their own metaverse server and take full control of, and responsibility for, users and their data, and with Vircadia they can, unlike most other platforms. That said, ideally the domain server defaults should be GDPR compliant (I believe they are) and come with boilerplate TOS that are compliant. The current default metaverse server could probably get almost there with the addition of "erase me" and "download my data" buttons.

GDPR seems like a bureaucratic hassle, and it sort of is, but it was created in the same spirit that made Vircadia: personal freedom, agency, and ownership. And ideally, no lawyers are involved in that; it's just the default setting 😄

daleglass commented 2 years ago

I concur with what @SilverfishVR said.

@samuk Also to clarify: what is being backed up is the state of what you see in the world. Not chat, avatar presence data, or any kind of detailed recording of people's activity. If you log in, spend an hour talking to people, and place a box in-world, all that's saved is the box. If you don't edit anything, none of your activity will make it into the backup.

The domain does attach a "last edited by" identifier to objects, but I don't think it tells very much. This should be a temporary ID that varies in between different connections of the same person to a domain, and doesn't tell you what they changed about the object.
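As a sketch of why such an identifier reveals little, assume each connection gets a fresh random ID and an edit records only that ID. `Connection` and `tag_edit` are hypothetical names for illustration, not the domain server's real types.

```python
import uuid


class Connection:
    """Hypothetical connection object; each one gets a fresh random ID."""

    def __init__(self) -> None:
        # uuid4 is random, so it carries no link to an account or an earlier session.
        self.session_id = uuid.uuid4()


def tag_edit(entity: dict, connection: Connection) -> dict:
    """Record only who last touched the entity, not what was changed."""
    return {**entity, "last_edited_by": str(connection.session_id)}


first_visit, second_visit = Connection(), Connection()
box = tag_edit({"name": "box"}, first_visit)
```

Under this assumption the same person reconnecting shows up under a different ID, so a backup containing `box` cannot be tied to their later visits.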

stale[bot] commented 2 years ago

Hello! Is this still an issue?