@selfissued, I take your point. But there is a difference in risk management between those risks which we have no ability to control, such as massive earthquakes and atomic bombs, and those which we have an ability to control. Now it might be that AI is so good at impersonating humans that we have no ability to control it, in which case it might be better to say something like "Detecting the difference between an AI system with VCs and a human with VCs can be achieved by the human obtaining an 'I am a human' VC from a trusted issuer". This can impact the VC DM, because we might decide to standardise the "I am a human" property.
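For illustration, here is a minimal sketch of what such a credential could look like. Only the base `https://www.w3.org/ns/credentials/v2` context is a real term source; the extension context URL, the `PersonhoodCredential` type, and the `isHuman` claim are hypothetical placeholders, not anything the VC DM currently defines:

```typescript
// Hypothetical sketch of an "I am a human" credential.
// The extension context, credential type, and "isHuman" claim are illustrative.
const personhoodCredential = {
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    "https://example.org/contexts/personhood/v1", // hypothetical extension context
  ],
  type: ["VerifiableCredential", "PersonhoodCredential"], // hypothetical type
  issuer: "did:example:trusted-issuer",
  validFrom: "2024-07-01T00:00:00Z",
  credentialSubject: {
    id: "did:example:subject",
    isHuman: true, // the kind of property we might decide to standardise
  },
};
```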
@selfissued,
AI doesn't affect the contents of the VCDM data structures. […] not a consideration for the design of our data structures.
That may be true (although @David-Chadwick in https://github.com/w3c/vc-data-model/pull/1508#issuecomment-2185329214 might be right that additional properties will be needed in the future).
Therefore, it does not warrant security considerations in this specification.
I do not agree with your conclusion. Many things in the security (or privacy) considerations are primarily meant for implementers: what are the issues they may face in implementing our technologies, how can they prepare their implementations, etc. That is the nature of these sections; that is also why they are not normative but informative (in the true sense of the word).
Given the angle of these sections, I wonder whether this does not call for a complementary recommendation (lowercase R): VC issuers might want to consider adding a claim in their VC about the subject being a human (or an AI), when they have this kind of knowledge, and if it matters for verifiers. This is to be balanced with privacy issues, of course...
VC issuers might want to consider adding a claim in their VC about the subject being a human (or an AI), when they have this kind of knowledge, and if it matters for verifiers.
Unfortunately, while the issuer can indicate that the subject of a claim is a human, there is literally no way to tell if that particular human is present in the process of verification and validation.
The solution we have for this is confidenceMethod, which gives issuers an open-ended way to provide multiple mechanisms for increased confidence that the subject of any given set of claims is, in fact, appropriately related to the presenter (with the easiest case of "same person").
As @selfissued points out, the failure of these methods, e.g., an AI faking identity assurance, is something outside the scope of this work. In fact, that is something that should be considered when evaluating the quality of any given confidenceMethod.
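As a rough sketch of how this could surface in a credential: `confidenceMethod` is a reserved extension point in the VC DM, but the method type and placement below are illustrative only, not something the specification standardizes:

```typescript
// Illustrative only: the confidence method type and its placement inside
// credentialSubject are hypothetical, following the shape of community drafts.
const credentialWithConfidenceMethod = {
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    "https://example.org/contexts/confidence/v1", // hypothetical extension context
  ],
  type: ["VerifiableCredential", "ExampleDegreeCredential"],
  issuer: "did:example:university",
  credentialSubject: {
    id: "did:example:subject",
    degree: "Bachelor of Science",
    // One of possibly several mechanisms a verifier can use to gain
    // confidence that the presenter is appropriately related to the subject.
    confidenceMethod: [
      {
        type: "ExampleFaceMatchCheck", // hypothetical method type
      },
    ],
  },
};
```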
In particular, the ultimate answer to "proof of humanity" is, IMO, most likely going to be resolved with strict liability, where the proof isn't that the current interaction is "with a human" but a demonstration of which human is liable for the AI agent's actions. Rather than imagining we can resolve the unresolvable (frankly, AI deep fakes are going to obliterate all non-cryptographic forms of assurance), what we should be figuring out is how to ensure that we know the human on whose behalf an agent is acting, just as we don't concern ourselves with browsers, aka user-agents, acting beyond the remit of their users; instead, we take the actions of the browser as a legitimate expression of the user.
I expect we'll get to this through cryptographic assurances and affirmative liability assertions.
FWIW, I think this section doesn't go far enough to explain the different notions of identity assurance and the innate challenges of determining liability when humans use tools, especially browsers and AI.
The issue was discussed in a meeting on 2024-06-26
@decentralgabe I have moved the section, as discussed during the meeting, to the Validation section. Please let us know if you approve of this PR now. Requesting a re-review.
@selfissued Yours is the only objection to adding this section to the specification; do you intend to keep your objection? Requesting a re-review.
@TallTed requesting a re-review from you.
As I wrote earlier, AI doesn't affect the contents of the VCDM data structures. It is tangential at best, and not a consideration for the design of our data structures. Therefore, it does not warrant security considerations in this specification.
The section has been moved to the "Validation" section, per @decentralgabe's request. It does not say anything about AI's relationship to the VCDM data structures.
Are you going to formally object if this text makes its way into the specification?
@jandrieu
Unfortunately, while the issuer can indicate that the subject of a claim is a human, there is literally no way to tell if that particular human is present in the process of verification and validation.
Our model specifically allows for that human (i.e. the subject) NOT to be present. It's always the holder that is present. So if the issuer says that the subject is a human, does it matter whether a bot or a human passes on this information to the verifier?
Of course it's different if the subject is the holder. But then authentication of the holder should determine whether the holder possesses the key that the issuer asserts is held by the subject.
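A minimal sketch of that check, under assumed field names (the `authenticationKey` binding claim is hypothetical; the real binding mechanism would depend on the proof format in use):

```typescript
// Minimal sketch: holder authentication reduces to checking that the key
// that signed the presentation is the key the issuer bound to the subject.
// "verificationMethod" on the proof is conventional; "authenticationKey"
// inside credentialSubject is a hypothetical binding claim.
interface Proof { verificationMethod?: string; }
interface Presentation { proof?: Proof; }
interface Credential { credentialSubject?: { authenticationKey?: string }; }

function holderIsSubject(vp: Presentation, vc: Credential): boolean {
  const presenterKey = vp.proof?.verificationMethod;
  const subjectKey = vc.credentialSubject?.authenticationKey;
  // Assumes the presentation's proof has already been cryptographically
  // verified against presenterKey; here we only compare the bindings.
  return presenterKey !== undefined && presenterKey === subjectKey;
}
```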
Adding text to the spec only makes it better when the text is actionable by implementers and deployers. Discussing the implications of AI doesn't pass this test.
As an analogy, while it's trendy to talk about the impact of AI these days, a decade or so ago it was equally trendy to talk about cloud computing. At the time, there might have been those urging us to add Cloud Computing Considerations to the specification, discussing what the security and privacy implications are of your data being hosted on someone else's computer. Sound silly? I believe that AI Considerations will sound just as silly and dated in a few years.
It's not obvious why it would be silly to talk about the privacy and security implications of data being hosted on someone else's computer. On the contrary, it seems pretty clear why that is a sensible topic for many specs to note as a relevant consideration. It might be assumed by those who know the topic deeply that this is shared knowledge, but I think the reality is that the consideration is rarely raised, and so various serious issues arise (real ones, not just hypothetical ones).
"Everyone does it, what's the problem?" seems like a pathway to trouble that the considerations sections are intended to help people avoid.
[…] there might have been those urging us to add Cloud Computing Considerations to the specification, discussing what the security and privacy implications are of your data being hosted on someone else's computer. Sound silly?
Actually... it does not. Repeating what I said in https://github.com/w3c/vc-data-model/pull/1508#issuecomment-2186278463:
Many things in the security (or privacy) considerations are primarily meant for implementers: what are the issues they may face in implementing our technologies, how can they prepare their implementations, etc. That is the nature of these sections; that is also why they are not normative but informative (in the true sense of the word).
Cloud computing does raise privacy issues, and some implementers may have benefitted from such considerations.
At the time, there might have been those urging us to add Cloud Computing Considerations to the specification, discussing what the security and privacy implications are of your data being hosted on someone else's computer. Sound silly?
No, that sounds like a critical oversight; that sort of thinking is what, consciously or unconsciously, led to the surveillance capitalism that drives the Internet and the Web today. It wasn't silly back then and it isn't silly today.
Some of the centralization mess the Internet and Web finds itself in today, and that we're trying to fix, is precisely because people thought that consideration was unimportant. There were a number of us that tried to warn the industry that it was a fundamental change that was going to lead to centralization and all sorts of problematic power dynamics.
So, no, it's not silly, and you are making my point (and it seems like others in this thread agree). The advice IS actionable to an implementer: they can consider it and take action, rather than not even knowing of the danger until they are surprised by it.
As a related aside, you didn't answer the question: Are you going to formally object over this addition?
The issue was discussed in a meeting on 2024-07-03
@selfissued equating Artificial Intelligence (computer software) with earthquakes (a natural geological hazard) is comparing apples and oranges. AI is a term for computer software that ingests and processes data, controlled by individuals with intentions, to output a model that is very efficient at a specific digital task. AI is also tightly coupled with privacy concerns, which is one of the core principles VCs are addressing in the personal identification sphere. That's two strong correlations right there, and I'm sure more could be found.
Additionally, people might be tempted to introduce some form of Machine Learning algorithm to interact with verifiable credentials. The spec should outline the considerations of doing so.
I support adding a section about AI to the Security Considerations.
For the Cloud Computing analogy, it's far from a silly concern. If a Canadian company hosts PII in a cloud-based service in the United States, that data becomes subject to United States privacy laws, and the company is legally liable for what happens to it. Data privacy laws are in place for a reason. While some might not value privacy, they are not entitled to make the same judgement about other people's right to privacy.
The issue was discussed in a meeting on 2024-07-17
Editorial, multiple reviews, changes requested and made, no remaining objections, merging.
This PR is an attempt to address issue #1507 by adding a section about artificial intelligence to the Security Considerations section.