responsible-ai-collaborative / aiid

The AI Incident Database seeks to identify, define, and catalog artificial intelligence incidents.
https://incidentdatabase.ai

Issue, Hazard, Risk, Vulnerability, Audits, and Controversy Support #2047

Open smcgregor opened 1 year ago

smcgregor commented 1 year ago

This is a working document with some elements that are ready for development

While there is convergence on what constitutes an "AI incident," there are still considerable differences in how the concept of an "incident in waiting" is defined. We call these "issues," the OECD calls them "hazards," Robust Intelligence calls them "risks," ARVA calls them "vulnerabilities," AIAAIC calls them "controversies," and various algorithmic assessment organizations call them audits (or at least, an audit will always produce one or more incidents in waiting). All of these vary subtly in definition, application, and use between organizations.

The role of the Responsible AI Collaborative, going back to the original research publication, has always been to act as the union of multiple perspectives and to provide tools that support sharing across those perspectives. This is a challenging proposition. Pretty much every multi-stakeholder ontological project I am aware of has inevitably degenerated into never-ending discussions over the most difficult elements to define. For something that has no underlying, singular "right" answer, it is best to find ways of moving forward that don't require universal agreement. The purpose of this GitHub issue is to detail how to proceed technologically without needing to resolve the definitional question of "incident in waiting."

The AIID's current entrant into this space is the "AI Issue." We chose this term intentionally to cover multiple aspects of "incident in waiting." It is meant to be specific enough to capture elements of risk, while general enough to cover the field. The "issue" term also means we can index concepts covered by other communities and link out to those communities if/when they operate their own processes. While we would prefer such organizations join the Responsible AI Collaborative and integrate from the beginning, that will not be possible universally (e.g., when a database is operated by a sovereign state). Therefore, we need to maintain flexibility.

This also plugs into the drive for federating the AI Incident Database -- something we will soon have a test case for with an index of deepfakes. Incident databases for things like deepfakes require different editing processes and metadata. How federation works for incidents is fairly clear: incidents have a natural scope that will support federating responsibilities among multiple nodes. However, this does not work for incidents in waiting. Often there is no concrete definition of what specific system can produce the incident. Worse, every system will produce a great many incidents when placed into the wrong context. Behind every system is an infinity of risks. This is why the ForHumanity audit criteria center on these four elements:

Scope: The boundaries of a system, what is covered, what is not covered
Nature: The forces and processes that influence and control the variables and features
Purpose: The aim or goal of a system
Context: The circumstances in which an event occurs; including jurisdiction and/or location, behaviour and functional inputs to an AAA System that are appropriate

Without some variation of these elements, the risks producible by a system cannot be bounded or expressed in any meaningful or useful way. For example, an LLM can be applied to an infinity of applications (safely or unsafely), while a webserver logging vulnerability is inherently scoped to the webserver. LLMs are scope- and context-free, yet present incidents in waiting in a massive array of circumstances. There is no closed world within which to index their risks, so they defy enumeration.

More concretely,

Problem: The safety community currently lacks an enumerable definition of "system+context," and we are likely never to have one. The notion of a system constantly changes with version, deployment circumstance, organizational processes, etc. The world context for these systems similarly evolves through time. Absent a more universal grounding of system+context, it is not possible to enumerate risks in a useful way; there will be too much noise.

Solution: Organizing Issues in terms of a numeric identifier or hierarchical structure is a road to editorial ruin. Don't attempt to universally enumerate context-free risk. Instead of organizing issues according to a definite scope, tag the issues themselves with salient attributes; those tags can then be queried according to values of interest to populate a listing.

Let me introduce this by example.

Example Applied to an LLM

<< For illustrative purposes only >>

Press Release: "Dolittle LLM runs all LLMs produced to date with RLHF selecting among candidate outputs to produce an unbeatable hybrid LLM."

audit: "Dolittle can generate several classes of malware through prompt hacking, Dolittle may attempt to end people's marriages"
Audit Metadata {identifiers for hundreds of constituent LLMs, scope, nature, purpose, context, structured representation of findings, ...}

hazard, risk, and vulnerability Record Metadata: {identifiers for hundreds of constituent LLMs, additional reporting, various taxonomies...}

controversy 1: "This new superintelligent AI is coming for your marriage" Controversy Metadata {company, ...}

(subsequent incident) "Incident 27311: Dolittle LLM allegedly produced malware that subsequently destroyed the records of 17 hospital systems"
Metadata {Relevant Issue reports, Event Date, Alleged Developer, Alleged Deployer(s), Alleged Harmed Party(ies), Event Data, ...}
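
To make the "tag and query" idea concrete, here is a rough data sketch (illustrative only; every field name below is an assumption, not a schema proposal) of how reports like the ones above could be represented as tagged documents rather than entries in a fixed hierarchy:

```ts
// Illustrative only -- not the AIID schema. The point is that each report
// carries salient attribute tags, and listings are produced by querying
// those tags rather than by enumerating "system+context" up front.
interface IssueReport {
  report_number: number;                    // hypothetical identifier
  kind: 'audit' | 'hazard' | 'risk' | 'vulnerability' | 'controversy';
  title: string;
  systems: string[];                        // identifiers for implicated systems
  tags: Record<string, string | string[]>;  // scope, nature, purpose, context, ...
}

const dolittleAudit: IssueReport = {
  report_number: 1234,                      // hypothetical
  kind: 'audit',
  title: 'Dolittle can generate several classes of malware through prompt hacking',
  systems: ['Dolittle LLM' /* plus identifiers for constituent LLMs */],
  tags: {
    scope: 'text generation',
    purpose: 'general-purpose assistant',
    context: 'public chat deployment',
  },
};
```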

Now what can we do with this? Let's consider each of the report types as issue reports and present them all on a new page, but first we need to decide which reports are queried.

Populating an Issue Profile from a Query

Here I am introducing a new collection type of "Issue Profile," which is something that is programmatically generated from reports and never edited directly.

It is easy to present singular reports in isolation; that is what we are already doing here. What we are missing is some notion of issue profiles whereby elements of audit, risk, vulnerability, etc. can be jointly presented. Issue profiles can be queried from the collection of metadata expressed across all reports.

User Story 1: "I want to know whether a particular model I am considering using has been implicated in any risks so I can decide whether I integrate it into my product"
Query: {select the model and its target operating context and see what returns}

User Story 2: "I want to know whether a particular scope has been identified as at-risk in an audit for any systems so I can know what to worry about"
Query: {select the scope of interest and see which audited systems return}

User Story 3: "I want to know all the examples of LLM jailbreaks consistent with the Dolittle model so I can begin training safety systems"
Query: {select vulnerabilities for the Dolittle system and subset to input/output data}

User Story 4: "I want to monitor the space of emerging risks across all similarly disposed systems"
Query: {select a collection of similarly positioned systems}
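
As a sketch of how such queries could operate over tagged reports (again assuming the illustrative IssueReport shape above, not the real schema), User Stories 1 and 3 might reduce to simple predicates over the tags:

```ts
// Sketches only: 'reports' is assumed to be a collection of IssueReport
// documents like the Dolittle example above.

// User Story 1: has this model, in roughly this operating context,
// been implicated in any risks?
const implicatedRisks = (reports: IssueReport[], model: string, context: string) =>
  reports.filter((r) => r.systems.includes(model) && r.tags.context === context);

// User Story 3: all vulnerability reports recorded against Dolittle,
// which could then be subset to input/output data for safety training.
const dolittleVulnerabilities = (reports: IssueReport[]) =>
  reports.filter((r) => r.kind === 'vulnerability' && r.systems.includes('Dolittle LLM'));
```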

After generating the query, what gets displayed?

New Page Type for Joining Issue Reports Returned by the Query

Right now the /cite/### pages have the following sections,

We can define each of these as follows,

Much of this still requires discussion, but there are several elements on which we can proceed.

Required Functionality in Codebase

These are likely "Epics" in the agile world.

Today (ready for work)

Soon (needs more definition)

Eventually (whenever other efforts become ready)

Many flowers are blooming. We look to make a bouquet.

cesarvarela commented 1 year ago

To get us all on the same page, this is the current schema:

image
cesarvarela commented 1 year ago

Update option n1:

For example, we create a new taxonomy to store systems, such as the Dolittle LLM, and another taxonomy to store controversy metadata. Then, when a new controversy is added to the AIAAIC, a new report is created and linked to the appropriate AIAAIC and LLM classifications.

Something I don't like about this approach is that we might push the taxonomy concept too far. Technically we could make everything a taxonomy, and right now querying taxonomies doesn't offer the best experience because attribute values are serialized (this is fixable with a custom GraphQL endpoint).

image
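
To illustrate the serialization pain point (field names are approximate, just to show the shape of the problem):

```ts
// Approximate shape of a classification attribute as stored today:
// the value is a serialized JSON string, so filtering by it means
// parsing strings on the client rather than querying the database.
const serializedAttribute = {
  short_name: 'Sector of Deployment',
  value_json: '["transportation"]',
};

// What a custom GraphQL endpoint/resolver could expose instead, so that
// queries like "sector in [...]" can run server-side:
const structuredAttribute = {
  short_name: 'Sector of Deployment',
  value: ['transportation'],
};
```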

cesarvarela commented 1 year ago

Update option n2:

To index the Dolittle Audit, we do the following:

I understand the definition of "system" is a moving target, so we might end up forcing the system abstraction to accommodate very different things.

image

smcgregor commented 1 year ago

Something I don't like about this approach is that we might push the taxonomy concept too far. Technically we could make everything a taxonomy, and right now querying taxonomies doesn't offer the best experience because attribute values are serialized (this is fixable with a custom GraphQL endpoint).

It sounds like the custom GraphQL endpoint may take care of the rough points? Is there a blog post to read about it? I am thinking about how we could move more of the taxonomy definition into the UI in the future rather than having it be something that requires engineering support.

cesarvarela commented 1 year ago

Displaying classification data associated with reports

image

Clicking on the icon opens a modal with the classifications associated with the report:

image

Clicking on the external link icon opens the report page with all of its classifications, the same way we do with incidents:

image

ping @smcgregor @kepae

kepae commented 1 year ago

I like the look. What do you think about making the icon blue, like other links/interactable pieces in the report component?

I think designing the actual summary page experience will be more involved and have more opinions. :-)

smcgregor commented 1 year ago

+1 to @kepae.

Displaying the icon only when classifications exist?

cesarvarela commented 1 year ago

Mockup for the reports discover page:

image

Clicking on the Add taxonomy button shows a modal that lets you add taxonomies to the current query:

image

Clicking on the Add attribute button shows a modal that lets you add attributes to the current taxonomy:

image

ping @smcgregor @kepae

kepae commented 1 year ago

Awesome, this matches my intuition for a display, and it's nice to look at something real. I have a few questions that come to mind; they relate to giving more query power to the user and how "query-able" the representation of taxonomies is generally.

1) The user might wish to execute an OR query among fields/attributes within a particular taxonomy. This can help with categorical values (e.g., in CSET, sector of deployment: transportation OR law enforcement). Similarly, a user might be interested in an intersection (AND) between taxonomies, isolating events that are similarly defined in two different taxonomies (e.g., CSET && GMF). Can these elements of the query be customized, or why should we fix them? (There are research methodology questions around using two unrelated taxonomies, but it could be helpful when they address different perspectives...)

2) How can we express negations in the query UI? This is especially useful for categorical values (e.g., NOT transportation, hate speech detection).

3) How are ordinal values to be stored in taxonomies and then queried here? For example, consider a toy "severity of harm" attribute that is an ordinal metric with the values none, minimal, and major. A user may wish to query for incidents or reports that represent at least a minimal severity of harm, which in this case would include the minimal && major severities. Radio buttons would not suffice. The most desirable option would be an interface that collects the ordinal values for a taxonomy attribute and displays a range-like selector. However, I have to look into the current taxonomies and see whether that is even reasonable to query currently. (It almost certainly will be a desirable query in the future, and we should support taxonomies that have ordinal metrics.)

If we don't have ordinal metrics in taxonomies, this isn't blocking but is a future pain point.

I'm going to think more about how each taxonomy attribute "type" can/should be queried and whether the taxonomy schemas currently support that.
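
As a strawman, here is the kind of query shape I have in mind (illustrative only, loosely following a combinator/rules structure; the TOY namespace, field names, and ordinal ordering are made up):

```ts
// Strawman query object -- not a proposal for the actual schema.
const exampleQuery = {
  combinator: 'and',
  rules: [
    // OR among categorical values within one taxonomy attribute
    {
      namespace: 'CSETv0',
      field: 'Sector of Deployment',
      operator: 'in',
      value: ['transportation', 'law enforcement'],
    },
    // Negation of a categorical value
    {
      namespace: 'CSETv0',
      field: 'Sector of Deployment',
      operator: 'notIn',
      value: ['hate speech detection'],
    },
    // Ordinal "at least" query, assuming the taxonomy declares an ordering
    // such as none < minimal < major
    {
      namespace: 'TOY',
      field: 'Severity of Harm',
      operator: '>=',
      value: 'minimal',
    },
  ],
};
```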

kepae commented 1 year ago

Other points from today:

This kind of filtering also supports the need for a custom resolver... Let's talk more about this and see what the minimal backend changes would have to be.

cesarvarela commented 12 months ago

Deploy preview:

https://deploy-preview-56--cesarvarela-staging.netlify.app/apps/systems/

This is an example with filters already set: https://deploy-preview-56--cesarvarela-staging.netlify.app/apps/systems/?filters[0][type]=taxonomy&filters[0][config][namespace]=CSETv0&filters[0][config][query][combinator]=and&filters[0][config][query][rules][0][field]=Annotator&filters[0][config][query][rules][0][operator]=%3D&filters[0][config][query][rules][0][value]=1&filters[0][initialized]=true
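
For readability, the filters parameter in that URL decodes to roughly:

```ts
// Decoded from the query string above
const filters = [
  {
    type: 'taxonomy',
    config: {
      namespace: 'CSETv0',
      query: {
        combinator: 'and',
        rules: [{ field: 'Annotator', operator: '=', value: '1' }],
      },
    },
    initialized: true,
  },
];
```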

I'll work on #2281 next, so there is better data to play around with.

cesarvarela commented 11 months ago

New update: now we can point fingers at ChatGPT:

https://deploy-preview-56--cesarvarela-staging.netlify.app/apps/systems/?filters=%5B%7B%22type%22%3A%22taxonomy%22%2C%22config%22%3A%7B%22namespace%22%3A%22AILD%22%2C%22query%22%3A%7B%22combinator%22%3A%22or%22%2C%22rules%22%3A%5B%7B%22field%22%3A%22Name%20of%20Algorithm%20List%22%2C%22operator%22%3A%22in%22%2C%22value%22%3A%5B%22Bard%22%2C%22ChatGPT%22%2C%22Copilot%22%5D%7D%5D%7D%7D%7D%5D
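
Decoded, the filters parameter in that URL is:

```ts
// Decoded from the URL-encoded filters parameter above
const filters = [
  {
    type: 'taxonomy',
    config: {
      namespace: 'AILD',
      query: {
        combinator: 'or',
        rules: [
          {
            field: 'Name of Algorithm List',
            operator: 'in',
            value: ['Bard', 'ChatGPT', 'Copilot'],
          },
        ],
      },
    },
  },
];
```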