w3c / vc-data-model

W3C Verifiable Credentials v2.0 Specification
https://w3c.github.io/vc-data-model/

Add Artificial Intelligence section to Security Considerations. #1508

Closed msporny closed 2 months ago

msporny commented 3 months ago

This PR is an attempt to address issue #1507 by adding a section about artificial intelligence to the Security Considerations section.


Preview | Diff

David-Chadwick commented 3 months ago

@selfissued. I take your point. But there is a difference in risk management between risks we have no ability to control, such as massive earthquakes and atomic bombs, and those we do have an ability to control. Now it might be that AI is so good at impersonating humans that we have no ability to control it. In that case it might be better to say something like "Detecting the difference between an AI system with VCs and a human with VCs can be achieved by the human obtaining an "I am a human" VC from a trusted issuer". This can impact the VC DM, because we might decide to standardise the "I am a human" property.
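For illustration, a minimal sketch of what such a credential might look like. (Everything beyond the base v2 context is hypothetical: the personhood context URL, the `PersonhoodCredential` type, and the `isHuman` property are invented for this example; nothing like them is standardized today.)

```json
{
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    "https://example.org/contexts/personhood/v1"
  ],
  "type": ["VerifiableCredential", "PersonhoodCredential"],
  "issuer": "https://trusted-issuer.example/",
  "validFrom": "2024-07-01T00:00:00Z",
  "credentialSubject": {
    "id": "did:example:holder123",
    "isHuman": true
  }
}
```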

iherman commented 3 months ago

@selfissued,

AI doesn't affect the contents of the VCDM data structures. […] not a consideration for the design of our data structures.

That may be true (although @David-Chadwick in https://github.com/w3c/vc-data-model/pull/1508#issuecomment-2185329214 might be right that additional properties will be needed in the future).

Therefore, it does not warrant security considerations in this specification.

I do not agree with your conclusion. Many things in the security (or privacy) considerations are primarily meant for implementers: what issues they may face in implementing our technologies, how they can prepare their implementations, etc. That is the nature of these sections; it is also why they are not normative but informative (in the true sense of the word).

pchampin commented 3 months ago

Given the angle of these sections, I wonder whether it does not call for a complementary recommendation (lowercase R): VC issuers might want to consider adding a claim in their VC about the subject being a human (or an AI), when they have this kind of knowledge, and if it matters for verifiers. This is to be balanced with privacy issues, of course...

jandrieu commented 3 months ago

VC issuers might want to consider adding a claim in their VC about the subject being a human (or an AI), when they have this kind of knowledge, and if it matters for verifiers.

Unfortunately, while the issuer can indicate that the subject of a claim is a human, there is literally no way to tell if that particular human is present in the process of verification and validation.

The solution we have for this is confidenceMethod, which gives issuers an open-ended way to provide multiple mechanisms for increased confidence that the subject of any given set of claims is, in fact, appropriately related to the presenter (with the easiest case of "same person").
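For a sense of the shape this could take, here is a minimal sketch. Because confidenceMethod is still only a reserved extension point, everything here beyond the base v2 context is hypothetical: its placement on the subject, the `ExampleBiometricMatch` type, and the example context URL are all invented for illustration.

```json
{
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    "https://example.org/contexts/confidence/v1"
  ],
  "type": ["VerifiableCredential", "ExamplePersonCredential"],
  "issuer": "https://trusted-issuer.example/",
  "validFrom": "2024-07-01T00:00:00Z",
  "credentialSubject": {
    "id": "did:example:subject456",
    "confidenceMethod": [{
      "type": "ExampleBiometricMatch",
      "description": "The verifier can raise confidence that the presenter is the subject via a liveness-checked match against the subject's enrolled biometric template."
    }]
  }
}
```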

As @selfissued points out, the failure of these modes, e.g., an AI faking identity assurance, is something outside the scope of this work. In fact, that is something that should be considered when evaluating the quality of any given confidenceMethod.

In particular, the ultimate answer to "proof of humanity" is, IMO, most likely going to be resolved with strict liability, where the proof isn't that the current interaction is "with a human" but rather a demonstration of which human is liable for the AI agent's actions. Rather than imagining we can resolve the unresolvable (frankly, AI deep fakes are going to obliterate all non-cryptographic forms of assurance), what we should be figuring out is how to ensure that we know the human on whose behalf an agent is acting, just as we don't concern ourselves with browsers, aka user-agents, acting beyond the remit of their users. Instead, we take the actions of the browser as a legitimate expression of the user.

I expect we'll get to this through cryptographic assurances and affirmative liability assertions.

FWIW, I think this section doesn't go far enough to explain the different notions of identity assurance and the innate challenges of determining liability when humans use tools, especially browsers and AI.

iherman commented 3 months ago

The issue was discussed in a meeting on 2024-06-26

View the transcript

#### 2.5. Add Security Considerations related to advances in Artificial Intelligence (issue vc-data-model#1507)

_See github issue [vc-data-model#1507](https://github.com/w3c/vc-data-model/issues/1507)._

**Manu Sporny:** I have been working with a number of AI companies on how VCs can be used to determine if an online entity is a real person or an AI bot. AI systems can now pass the Turing test. How AI affects IDM systems needs to be documented. A number of research papers going into greater detail will be published this summer.

_See github pull request [vc-data-model#1508](https://github.com/w3c/vc-data-model/pull/1508)._

**Gabe Cohen:** AI does not affect the VC DM data structures.

**Brent Zundel:** Are you saying this PR text is in the wrong section of the spec?

**Gabe Cohen:** Yes. Move it to the validation/verification section.

**Joe Andrieu:** confidenceMethod can be affected by AI. There is an AI arms race at the moment.

> *Joe Andrieu:* +1 to that, Manu.

**Manu Sporny:** The text will not provide exact solutions, but we should point to research papers when they become available.

> *Joe Andrieu:* "We have something AI doesn't have. That is cryptography." That's great framing.

> *Steve McCown:* Have we started actively discussing moves towards post-quantum cryptography?

**Ted Thibodeau Jr.:** AI is a moving target, so not something we can solve now. I have provided substantial text edits to the existing paragraphs. Leave the text as is and add more text in the Validation section.

> *Ivan Herman:* +1 to Ted.

> *Joe Andrieu:* +1 to Ted. That was a good argument for keeping it in Security Considerations.

**David Chadwick:** Joe said that humans have cryptography and AI doesn't; that's a good point, but I think AI can have that too.

> *Steve McCown:* AIs are currently being created for brute-force attacks on cryptography.

> *Manu Sporny:* Yes, +1 to what Joe said, that's what I meant too.

> *Will Abramson:* +1.

> *Steve McCown:* ECC isn't quantum secure...

> *Gabe Cohen:* Will continue in issue.

**David Chadwick:** The issue about cryptography is not that AI cannot use crypto and sign, but rather that AI cannot break crypto. Therefore AI cannot fake a signed document.

> *Steve McCown:* I would contend that AI can break crypto.
msporny commented 3 months ago

@decentralgabe I have moved the section, as discussed during the meeting, to the Validation section. Please let us know if you approve of this PR now. Requesting a re-review.

@selfissued Yours is the only objection to adding this section to the specification; do you intend to keep your objection? Requesting a re-review.

@TallTed requesting a re-review from you.

msporny commented 3 months ago

As I wrote earlier, AI doesn't affect the contents of the VCDM data structures. It is tangential at best, and not a consideration for the design of our data structures. Therefore, it does not warrant security considerations in this specification.

The section has been moved to the "Validation" section, per @decentralgabe's request. It does not say anything about AI's relationship to the VCDM data structures.

Are you going to formally object if this text makes its way into the specification?

David-Chadwick commented 3 months ago

@jandrieu

Unfortunately, while the issuer can indicate that the subject of a claim is a human, there is literally no way to tell if that particular human is present in the process of verification and validation.

Our model specifically allows for that human (i.e. the subject) NOT to be present. It's always the holder that is present. So if the issuer says that the subject is a human, does it matter whether a bot or a human passes on this information to the verifier?

Of course it's different if the subject is the holder. But then authentication of the holder should determine whether the holder possesses the key that the issuer purports is held by the subject.
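As a minimal sketch of that check (all DIDs, key identifiers, and the proof value below are placeholders, and the embedded credential's own proof is omitted for brevity): the credential names the subject by DID, and the verifier confirms that the presentation's authentication proof verifies against a key bound to that same DID.

```json
{
  "@context": ["https://www.w3.org/ns/credentials/v2"],
  "type": ["VerifiablePresentation"],
  "holder": "did:example:holder123",
  "verifiableCredential": [{
    "@context": ["https://www.w3.org/ns/credentials/v2"],
    "type": ["VerifiableCredential"],
    "issuer": "https://trusted-issuer.example/",
    "validFrom": "2024-07-01T00:00:00Z",
    "credentialSubject": { "id": "did:example:holder123" }
  }],
  "proof": {
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-rdfc-2022",
    "created": "2024-07-15T12:00:00Z",
    "verificationMethod": "did:example:holder123#key-1",
    "proofPurpose": "authentication",
    "proofValue": "zExamplePlaceholderValue"
  }
}
```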

selfissued commented 3 months ago

Adding text to the spec only makes it better when the text is actionable by implementers and deployers. Discussing the implications of AI doesn't pass this test.

As an analogy, while it's trendy to talk about the impact of AI these days, a decade or so ago it was equally trendy to talk about cloud computing. At the time, there might have been those urging us to add Cloud Computing Considerations to the specification, discussing what the security and privacy implications are of your data being hosted on someone else's computer. Sound silly? I believe that AI Considerations will sound just as silly and dated in a few years.

chaals commented 3 months ago

It's not obvious why it would be silly to talk about the privacy and security implications of data being hosted on someone else's computer. On the contrary, it seems pretty clear why that is a sensible topic for many specs to note as a relevant consideration. Those who know the topic deeply might assume this is shared knowledge, but I think the reality is that the consideration is rarely raised, and so various serious issues arise (in reality, not just hypothetical ones).

"Everyone does it, what's the problem?" seems like a pathway to trouble that the considerations sections are intended to help people avoid.

iherman commented 3 months ago

[…] there might have been those urging us to add Cloud Computing Considerations to the specification, discussing what the security and privacy implications are of your data being hosted on someone else's computer. Sound silly?

Actually... it does not. Repeating what I said in https://github.com/w3c/vc-data-model/pull/1508#issuecomment-2186278463:

Many things in the security (or privacy) considerations are primarily meant for implementers: what issues they may face in implementing our technologies, how they can prepare their implementations, etc. That is the nature of these sections; it is also why they are not normative but informative (in the true sense of the word).

Cloud computing does raise privacy issues, and some implementers may have benefited from such considerations.

msporny commented 2 months ago

At the time, there might have been those urging us to add Cloud Computing Considerations to the specification, discussing what the security and privacy implications are of your data being hosted on someone else's computer. Sound silly?

No, that sounds like a critical oversight; that sort of thinking is what, consciously or unconsciously, led to the surveillance capitalism that drives the Internet and the Web today. It wasn't silly back then and it isn't silly today.

Some of the centralization mess the Internet and Web finds itself in today, and that we're trying to fix, is precisely because people thought that consideration was unimportant. There were a number of us that tried to warn the industry that it was a fundamental change that was going to lead to centralization and all sorts of problematic power dynamics.

So, no, it's not silly, and you are making my point (and it seems like others in this thread agree). The advice IS actionable for an implementer: they can consider it and take action, rather than not even knowing of the danger until they are surprised by it.

As a related aside, you didn't answer the question: Are you going to formally object over this addition?

iherman commented 2 months ago

The issue was discussed in a meeting on 2024-07-03

View the transcript

#### 1.3. Add Security Considerations related to advances in Artificial Intelligence (issue vc-data-model#1507)

_See github issue [vc-data-model#1507](https://github.com/w3c/vc-data-model/issues/1507)._

**Brent Zundel:** Let's talk about AI! 1507: Add Security Considerations related to advances in Artificial Intelligence. There are vendors concerned about AI and interactions with VCs. We talked and said it could go in validation/verification, or in security considerations. Getting some pushback from Mike; let's see if we can find some consensus.

_See github pull request [vc-data-model#1508](https://github.com/w3c/vc-data-model/pull/1508)._

**Manu Sporny:** I moved it to the validation section as Gabe requested. I know Ted pushed back a bit. It's not out of place in either section. Pulled in all the WG's requests for changes.

**Michael Jones:** I have expressed my views in GitHub. As editors we need to make judgment calls on what is useful/actionable vs. what makes the spec longer. This doesn't improve implementations. I don't want stuff in it that I'm embarrassed to see. Should we also have security considerations around cloud computing? I'm puzzled.

**Ted Thibodeau Jr.:** Didn't get the joke. There's a difference between cloud computing and an 'active agent'; we know it's an independent actor that can be put to use now in new ways. I think it is a relevant caution. We should say 'be aware of this new thing, a moving target'. It could be decades until things settle down. Let's put in a brief warning and move on.

**Gabe Cohen:** What would make this more real to you, Mike? Is there language we could change? Concerns around AI and data legitimacy are real. If we could improve the text that would be good.

**Manu Sporny:** I appreciate your opinion, Mike. At this point just about everybody is disagreeing with your point. There are people at some of the largest AI companies in the world working on research around AI and Verifiable Credentials. It quotes the work we're doing here directly. It is possible for AI to pass tests today that were previously thought passable only by humans (GRE, high school diploma, etc.). If people are building systems whose security is built on VCs identifying certain capabilities and proof of personhood, we need to warn that that may not be good enough anymore. Security researchers need to take that into account. Captcha is broken now; AIs can solve it better than humans. It would be strange for us to not say something about this.

> *Dave Longley:* "VCs that seem like they might only be acquirable by human persons might also become acquirable by artificial intelligence systems; be aware of this when validating / making decisions".

**Manu Sporny:** See no reason to not put this into the spec.

**Michael Jones:** Gabe used the word that is key: is the guidance 'actionable'? Are there things we're recommending? Are there actions that can be taken? If there are actions, cool. If I get overruled I would rather this be a security consideration. If there is not a validation consideration then it doesn't belong there.

**Dave Longley:** Text should say something like 'VCs that seem like they may only be acquired by humans today may be acquired by AI systems'; don't assume only a human can do it.

**Manu Sporny:** The philosophy that a spec should only contain normative actionable statements that end up in implementations is a philosophy I do not believe we have ever employed, or should employ. We have plenty of statements like this today, e.g. describing the ecosystem so implementers can make better decisions. -1 to the notion that everything we write needs to be actionable. Implementers need to be able to take guidance and apply it to their specific use case.

**Joe Andrieu:** We do need to write something, since people are asking this question and using this technology. Two differences: 1. confidence method is part of how we're trying to solve this problem; not figured out yet (still a reserved property). 2. The text does have actionable advice, though we can improve it. We need to say something.

**Michael Jones:** I like what Dave Longley said, since it is actionable. Verifiers should not assume that tests heretofore only passable by human beings are not achievable by machines at this point. Don't assume that passing a Turing test means the party is a human being.

**Brent Zundel:** Thanks, Mike. Seems like we have a path forward. Language in chat.

**Manu Sporny:** The language is already in the PR. I would like to stop playing 'go fetch a rock' with this PR. I will integrate Dave's changes.

**Michael Jones:** I will re-review after that; please ping me.
PatStLouis commented 2 months ago

@selfissued equating Artificial Intelligence (computer software) with earthquakes (a natural geological hazard) is comparing apples and oranges. AI is a term for computer software that ingests and processes data, controlled by individuals with intentions, to output a model that is very efficient at a specific digital task. AI is also tightly coupled with privacy concerns, which is one of the core principles VCs are addressing in the personal identification sphere. Those are two strong correlations, and I'm sure more could be drawn.

Additionally, people might be tempted to introduce some form of Machine Learning algorithm to interact with verifiable credentials. The spec should outline considerations for doing so.

I support adding a section about AI to the Security Considerations.

For the Cloud Computing analogy, it's far from a silly concern. If a Canadian company hosts PII in a cloud-based service in the United States, that data becomes subject to United States privacy laws and the company is legally liable for what happens to it. Data privacy laws are in place for a reason. While some might not value privacy, they are not entitled to make the same judgement about other people's right to privacy.

iherman commented 2 months ago

The issue was discussed in a meeting on 2024-07-17

View the transcript

#### 3.1. Add Artificial Intelligence section to Security Considerations. (pr vc-data-model#1508)

_See github pull request [vc-data-model#1508](https://github.com/w3c/vc-data-model/pull/1508)._

**Brent Zundel:** Add AI section. Lots of approvals. One request for changes that seems editorial, and one request from Mike Jones to not include it at all.

**Manu Sporny:** I have applied Ted's suggested changes.

**Brent Zundel:** Anyone (other than Mike) on the call that would object to merging PR 1508? Mike, you can comment too.

**Michael Jones:** What was the update that was going to be made?

**Manu Sporny:** Dave's changes. Something like "VCs that seem like they might be only attained by humans... might be used by AI.".

**Michael Jones:** Let me re-review it during the call.

**Brent Zundel:** Pinging Mike may not have happened, so thanks Mike for looking at it.

> *Michael Jones:* I still think this is a waste of spec space, but I've withdrawn my objection [https://github.com/w3c/vc-data-model/pull/1508#pullrequestreview-2183294339](https://github.com/w3c/vc-data-model/pull/1508#pullrequestreview-2183294339).
msporny commented 2 months ago

Editorial, multiple reviews, changes requested and made, no remaining objections, merging.