w3c / activitypub

http://w3c.github.io/activitypub/

Add a reference to AS2 Core's requirement that documents must be compacted #359

Open trwnh opened 1 year ago

trwnh commented 1 year ago

Problem statement in the existing specification

Currently the ActivityPub specification and its examples are written with an implicit assumption that documents will be compacted against the context http://www.w3.org/ns/activitystreams, but this is not actually stated anywhere in the specification.

EDIT: This statement is instead made in ActivityStreams Core: https://www.w3.org/TR/activitystreams-core/#jsonld

> The serialized JSON form of an Activity Streams 2.0 document MUST be consistent with what would be produced by the standard JSON-LD 1.0 Processing Algorithms and API [JSON-LD-API] Compaction Algorithm using, at least, the normative JSON-LD @context definition provided here. Implementations MAY augment the provided @context with additional @context definitions but MUST NOT override or change the normative context.

A hint linking back to this section of AS2 should be added to the AP spec, probably under Section 1 Overview and/or Section 3 Objects.

Original issue text

The only existing mention of JSON-LD was added in #102 under `3. Objects`:

> ActivityPub defines some terms in addition to those provided by ActivityStreams. These terms are provided in the ActivityPub [JSON-LD context](http://www.w3.org/TR/json-ld/#the-context) at https://www.w3.org/ns/activitystreams. Implementers SHOULD include the ActivityPub context in their object definitions. Implementers MAY include additional context as appropriate.

`1. Overview` also has this to say:

> ActivityStreams can be extended via [[JSON-LD](https://www.w3.org/TR/activitypub/#bib-JSON-LD)]. If you know what JSON-LD is, you can take advantage of the cool linked data approaches provided by JSON-LD. If you don't, don't worry, JSON-LD documents and ActivityStreams can be understood as plain old simple JSON. (If you're going to add extensions, that's the point at which JSON-LD really helps you out).

A note under `5.6 Public addressing` hints at the difficulties and challenges that "simply JSON" implementations will face:

> Compacting an ActivityStreams object using the ActivityStreams JSON-LD context might result in https://www.w3.org/ns/activitystreams#Public being represented as simply Public or as:Public which are valid representations of the Public collection. Implementations which treat ActivityStreams objects as simply JSON rather than converting an incoming activity over to a local context using JSON-LD tooling should be aware of this and should be prepared to accept all three representations.

## Explanation of issues for non-LD-aware implementations

Someone reading the ActivityPub spec without any JSON-LD knowledge would only be made aware of this issue in reference to the Public magic collection, as per the note in `5.6 Public addressing`. Thus, they may implement their parser to understand "all three representations" as the spec told them to.

But the following examples would be unaccounted for, and would all break a "plain JSON only" interpretation:

### Alternate representations for terms and values other than `Public`

You can use the full IRI form, shorthand with a prefix, or shorthand without a prefix... for any term or value in the document. What's more, you can even mix them!

Caveats:

- You cannot use an IRI () or a compact IRI (as:type) for `type` or `id`. You must use the terms as defined (or use the JSON-LD keywords they map to).
- Using a shorthand prefix will cause `@type` information to be stripped, and this causes compaction to leave the prefix in
- Especially, you cannot use a shorthand prefix for any property with a `@type` of `@id` instead of one that implies `@value` (unless you are inlining the object). JSON-LD processors will not apply context mapping to raw values.

### Using `@type` and `@id` directly

`as:type` will not translate to `@type` but to `https://www.w3.org/ns/activitystreams#type` (which does not get mapped to `@type` because it is not defined as such in the activitystreams.jsonld context). But no matter, we can throw a raw `@type` in there to further confuse JSON-only implementations. We can also throw a raw `@id` if we wanted to.

### Custom shorthands defined in `@context`

For even more confusion, we could define a custom shorthand prefix! Arguably this one is a bit of a stretch, because it could be said that this is somewhat of an "extension", but it doesn't have to be. Consider an entry in `@context` which maps `activitystreams` similarly to how `as` already maps to `https://www.w3.org/ns/activitystreams#` as a base IRI. Or, perhaps it even maps directly onto `as`. You may say there is no reason to map this custom shorthand, but what if we were providing our own context document instead of the one hosted on w3.org, albeit with the same definitions and mappings to the same base IRI?

## An example of technically valid JSON-LD that a non-LD-aware "plain JSON" implementation would struggle to comprehend

```json
{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    {
      "activitystreams": "as",
      "as:to": { "@type": "@id" }
    }
  ],
  "@id": "https://example.com/some-activity",
  "type": "as:Create",
  "activitystreams:object": {
    "id": "https://example.com/some-object",
    "@type": "Note",
    "https://www.w3.org/ns/activitystreams#content": "hello world",
    "as:published": {
      "@value": "2022-12-15",
      "type": "http://www.w3.org/2001/XMLSchema#dateTime"
    }
  },
  "as:to": { "@id": "activitystreams:Public" },
  "to": "as:Public"
}
```

This is not something a JSON parser should ever have to deal with. Clearly, the LD-aware implementation MUST take extra steps to produce a document that is understood in a consistent way by non-LD-aware implementations.

## Recommended changes to the specification

- "For any ActivityPub object that is delivered or dereferenced, LD-aware implementations MUST compact against `https://www.w3.org/ns/activitystreams` at minimum, and MAY include additional contexts during compaction."

  This reduces everything to shorthand where possible, making it easier for JSON-only impls to parse.

## Recommended changes to the activitystreams.jsonld context, for additional consistency

- Add `"@container": "@set"` to all property definitions, except for functional properties

  This would save JSON consumers from having to account for single string values, forcing an array during compaction. Consumers would then only have to account for IRI representations vs inlined representations. It would also make it clearer which properties are functional and which ones are not.

- Consider defining `endpoints` as a `@nest`?

  This one might be premature, as JSON-LD 1.1 support for the `@nest` keyword is required.

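
For plain-JSON consumers, the note from `5.6 Public addressing` quoted in the original issue text (accept all three representations of the Public collection) could be handled with something like the sketch below. The names `PUBLIC_ALIASES`, `is_public` and `normalize_audience` are hypothetical and not part of any spec or library; the sketch only looks at string values and does not perform JSON-LD processing.

```python
# Hypothetical helper: treat the three spellings of the Public collection
# mentioned in ActivityPub section 5.6 as the same addressee.
PUBLIC_IRI = "https://www.w3.org/ns/activitystreams#Public"
PUBLIC_ALIASES = frozenset({PUBLIC_IRI, "as:Public", "Public"})


def is_public(value):
    """Return True if value is one of the three accepted Public spellings."""
    return isinstance(value, str) and value in PUBLIC_ALIASES


def normalize_audience(values):
    """Return an addressing property (to, cc, ...) as a de-duplicated list.

    Accepts a missing value, a single value, or a list. Any spelling of
    Public is rewritten to the full IRI; other entries are kept as they are.
    """
    if values is None:
        return []
    if not isinstance(values, list):
        values = [values]
    result = []
    for item in values:
        item = PUBLIC_IRI if is_public(item) else item
        if item not in result:
            result.append(item)
    return result
```

Rewriting to the full IRI is an arbitrary choice made for this sketch; an implementation could just as well settle on `as:Public` internally, as long as it compares addressees after normalization.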
nightpool commented 1 year ago

> Currently the ActivityPub specification and its examples are written with an implicit assumption that documents will be compacted against the context http://www.w3.org/ns/activitystreams, but this is not actually stated anywhere in the specification.

This is incorrect: the requirement is stated very clearly in the ActivityStreams spec, which is normatively referenced by the ActivityPub spec:

> The serialized JSON form of an Activity Streams 2.0 document MUST be consistent with what would be produced by the standard JSON-LD 1.0 Processing Algorithms and API [JSON-LD-API] Compaction Algorithm using, at least, the normative JSON-LD @context definition provided here. Implementations MAY augment the provided @context with additional @context definitions but MUST NOT override or change the normative context.

Documents that are not thusly serialized are not conformant with the specification.

> Using a shorthand prefix will cause @type information to be stripped, and this causes compaction to leave the prefix in

I'm not sure I understand your point here. Besides Public, what other shorthand prefixes could be used in the value-position? Regardless, as:Public is enforced by the compaction algorithm.

> Using @type and @id directly

This is incorrect. The JSON-LD Compaction Algorithm (as referenced by the normative language above) requires that these be converted into their type and id aliases.

> An example of technically valid JSON-LD that a non-LD-aware "plain JSON" implementation would struggle to comprehend

This is incorrect. This is not a "valid" ActivityStreams 2.0 document. Here is the ActivityStreams document you gave, properly compacted so that it is compliant with the AS2 spec:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://example.com/some-activity",
  "type": "Create",
  "object": {
    "id": "https://example.com/some-object",
    "type": "Note",
    "content": "hello world",
    "published": "2022-12-15"
  },
  "to": [
    "as:Public",
    "as:Public"
  ]
}

JSON-LD Playground link demonstrating this: https://tinyurl.com/2ktvsfbg
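
For implementations without JSON-LD tooling, a rough sanity check on incoming documents might look like the following sketch. It is only a heuristic under stated assumptions, not a substitute for the Compaction Algorithm: it flags object keys that are JSON-LD keywords (other than `@context`), full IRIs in the ActivityStreams namespace, or `as:`-prefixed compact IRIs, since none of these should appear as keys once a document is compacted against the normative context. The names `AS_NS` and `find_suspect_keys` are hypothetical, and custom prefixes such as the `activitystreams:` one in the example above are not detected.

```python
AS_NS = "https://www.w3.org/ns/activitystreams#"


def find_suspect_keys(node, path=""):
    """Return slash-separated paths of keys that look un-compacted.

    Heuristic only: checks key spelling, never resolves @context.
    Values (for example "as:Public") are deliberately not inspected.
    """
    suspects = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":
                # The context itself is expected and is not walked.
                continue
            here = f"{path}/{key}"
            if key.startswith(("@", AS_NS, "as:")):
                suspects.append(here)
            suspects.extend(find_suspect_keys(value, here))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            suspects.extend(find_suspect_keys(item, f"{path}/{index}"))
    return suspects
```

The compacted document above yields an empty list, while the example from the original issue is flagged at several paths, starting with `/@id`.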

> Add "@container": "@set" to all property definitions, except for functional properties

This isn't mentioned anywhere else in your issue (and I don't think it's related to the problem statement), and it may be worth its own separate ticket on the ActivityStreams repo, but it would be a major breaking change at this point, and I do not think the massive regression in backwards compatibility would be worth it for the minor value that it brings in terms of forward compatibility. I think at this point every major consumer is aware that properties can have array or value types, and new ones become aware of it very quickly in their development lifecycle. I don't think it's worth re-opening that discussion at this stage.
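
To illustrate the point that consumers already cope with properties holding either a single value or an array, here is a minimal sketch of the kind of helper a plain-JSON consumer might use. The function names `as_list` and `ids_of` are hypothetical; the sketch assumes the document has already been compacted so that inlined objects carry a plain `id` key.

```python
def as_list(value):
    """Wrap a property value so callers can always iterate over a list."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def ids_of(value):
    """Collect IRIs from a property whose values may be IRI strings,
    inlined objects with an "id", or a list mixing both."""
    ids = []
    for item in as_list(value):
        if isinstance(item, str):
            ids.append(item)
        elif isinstance(item, dict) and isinstance(item.get("id"), str):
            ids.append(item["id"])
    return ids
```

For example, `ids_of(activity.get("to"))` returns a list whether `to` was absent, a single string, or an array, which is the case `"@container": "@set"` would otherwise have normalized at the context level.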

trwnh commented 1 year ago

Ah, I overlooked the requirement in ActivityStreams-Core, which does indeed resolve my concern.

In that case, perhaps adding a hint somewhere in ActivityPub linking back to that section of ActivityStreams would be helpful... probably in Section 1 and/or Section 3.

evanp commented 7 months ago

This makes sense; if people as well-versed in ActivityPub as @trwnh can find the requirements here confusing, we should add some clarification to point out this requirement. In particular, the clarification should say that AP documents need to meet all the requirements of AS2 documents, including compaction, which should be called out explicitly.

However, that's mostly a stylistic detail, and doesn't rise to the level of an error in the text requiring an Erratum. Therefore, I think we should mark this request for the next version of AP, and consider it once we have chartered a working group.

trwnh commented 7 months ago

I filed this issue nearly two years ago, and I think the fundamental issue that people have is that it is not "enough" to read the AP spec, as one must also read AS2-Core. Yes, it's "normatively referenced", but it would still be useful to have a reminder of this outside of the "referenced" section. A one-line sentence would work, something along the lines of "Please see AS2-Core for requirements regarding objects" under the aforementioned Section 3. Right now there's a hint to look toward AS2-Vocab, but not an explicit hint to look toward AS2-Core. There's a normative SHOULD to include the context, but again no reference to the requirement that the output needs to be consistent with the output of compacting against that context.