swentel / activitypub

ActivityPub for Drupal
GNU General Public License v2.0

Ensure field data is normalized in posts #6

Open nedjo opened 4 years ago

nedjo commented 4 years ago

Rather than serving raw data, field data on objects such as an Article needs to be normalized. Examples: a text field with a filter; a text field that should be reduced in length; an image field that should use an image style.

Options

View mode + field formatters

This is the approach coded into core for RSS. It kinda works, but is old school. It also relies on various special casing in core, such as this instance in NodePreviewForm::buildForm():

    // Unset view modes that are not used in the front end.
    unset($view_mode_options['default']);
    unset($view_mode_options['rss']);
    unset($view_mode_options['search_index']);

Normalization

Using Drupal core's normalization (per Symfony) could have a lot of advantages. The needs here are not so different from those of other web services, such as JSON:API, where core's normalization + serialization is used.

But it would be super useful to be able to leverage the landrok/activitypub package, and though it does use some other Symfony pieces it doesn't use normalization + serialization.

Plugins

We could use a plugin type that includes processing.
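As a rough illustration of the plugin idea, here is a minimal sketch in plain PHP, with no Drupal plugin manager involved. The interface, class and function names are all hypothetical; they only show how the text examples from the issue summary (a filter, a length limit) could be chained as processing steps. An image style step would need Drupal's image system and is left out.

```php
<?php

// Hypothetical interface: one processor per kind of field clean-up.
interface FieldProcessorInterface {
  public function process(string $value): string;
}

// Stand-in for a text filter: drops all HTML tags.
final class StripTagsProcessor implements FieldProcessorInterface {
  public function process(string $value): string {
    return trim(strip_tags($value));
  }
}

// Reduces text to a maximum length, appending "..." when it was cut.
// Byte-based for brevity; real content needs multibyte-aware handling.
final class TruncateProcessor implements FieldProcessorInterface {
  private $maxLength;

  public function __construct(int $maxLength) {
    $this->maxLength = $maxLength;
  }

  public function process(string $value): string {
    if (strlen($value) <= $this->maxLength) {
      return $value;
    }
    return rtrim(substr($value, 0, $this->maxLength)) . '...';
  }
}

// Runs a raw field value through each processor in order.
function normalize_field(string $raw, array $processors): string {
  foreach ($processors as $processor) {
    $raw = $processor->process($raw);
  }
  return $raw;
}

$summary = normalize_field(
  '<p>Hello <strong>fediverse</strong>, this is a long body.</p>',
  [new StripTagsProcessor(), new TruncateProcessor(15)]
);
```

In a real module, each processor would presumably be a plugin with its own configuration, selected per field by the site builder.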

ekes commented 4 years ago

If I understand correctly, this is about the representation of the object in the Activity Vocabulary for ActivityStreams? I might be misunderstanding, but I'll try and put my thoughts here:

At the object level this would be things like a comment as a Note (https://www.w3.org/TR/activitystreams-vocabulary/#dfn-note), a node of bundle type 'article' (or maybe the admin called it 'news') as an Article, a node of bundle type 'event' or 'happening' as an Event (https://www.w3.org/TR/activitystreams-vocabulary/#dfn-event), and so forth.

Each of these has fields that need to be represented with the correct vocabulary property names, and may reference other entities, which can also be included as linked objects. To take what is probably the most common example: a Drupal node of type article will include an entity reference to a media entity of type image with a file; in ActivityPub this would become an Article including a Link to an object of type Image. https://www.w3.org/TR/activitystreams-core/#example-3
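To make that concrete, here is a small sketch in plain PHP of what such an Article could look like once serialized. The `build_article()` helper and the example.com URLs are made up for illustration; the `attachment`, `url`, `href` and `mediaType` properties come from the Activity Vocabulary, but which of them a given receiving server actually reads is not something this sketch can promise.

```php
<?php

// Hypothetical helper: builds an ActivityStreams Article whose image is
// attached as a nested Image object that points at the file via a Link.
function build_article(string $id, string $name, string $content, string $imageUrl, string $imageMime): array {
  return [
    '@context' => 'https://www.w3.org/ns/activitystreams',
    'type' => 'Article',
    'id' => $id,
    'name' => $name,
    'content' => $content,
    'attachment' => [
      [
        'type' => 'Image',
        'url' => [
          'type' => 'Link',
          'href' => $imageUrl,
          'mediaType' => $imageMime,
        ],
      ],
    ],
  ];
}

// Placeholder values standing in for a node and its referenced media file.
$article = build_article(
  'https://example.com/node/1',
  'Hello fediverse',
  '<p>Body text after filtering.</p>',
  'https://example.com/files/styles/large/hello.jpg',
  'image/jpeg'
);

echo json_encode($article, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```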

The property names, however, don't need to be in the "https://www.w3.org/ns/activitystreams" namespace, and in the real world they already aren't. The @context of the JSON-LD holds multiple namespaces. So just to talk to Mastodon you have namespaces added, and properties from them defined:

    "@context": [
      "https://www.w3.org/ns/activitystreams",
      "https://w3id.org/security/v1",
      {
        "manuallyApprovesFollowers": "as:manuallyApprovesFollowers",
        "sensitive": "as:sensitive",
        "movedTo": {
          "@id": "as:movedTo",
          "@type": "@id"
        },
        "Hashtag": "as:Hashtag",
        "ostatus": "http://ostatus.org#",
        "atomUri": "ostatus:atomUri",
        "inReplyToAtomUri": "ostatus:inReplyToAtomUri",
        "conversation": "ostatus:conversation",
        "toot": "http://joinmastodon.org/ns#",
        "Emoji": "toot:Emoji",
        "focalPoint": {
          "@container": "@list",
          "@id": "toot:focalPoint"
        },
        "featured": {
          "@id": "toot:featured",
          "@type": "@id"
        },
        "schema": "http://schema.org#",
        "PropertyValue": "schema:PropertyValue",
        "value": "schema:value"
      }
    ],

Now, I'm not for a minute thinking of trying to support all, or any, of these in particular. In my D7 instance I've also hard-coded the mappings of fields to properties and namespaces, because I can: it's just for that site.

For a general solution, it seems that adding the properties (and related namespaces) and allowing the rdf module to do its magic, with as: (https://www.w3.org/ns/activitystreams) as the default namespace, would be the way to go. It allows for simply mapping field title to as:title, field body to as:content, and so on for a set of defaults, while allowing the site builder to extend them depending on which entity references to documents, images, dates or whatever they are adding. It even allows for supporting other namespaces should it get funky, but that's not something to ship out of the box.
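The defaults-plus-overrides idea can be sketched in a few lines of plain PHP. This is not the rdf module's API: the mapping arrays, field names and the `map_fields()` function are hypothetical and only show the shape of the data. One deliberate difference from the wording above: the sketch maps the title to `as:name`, since `name` is the property the Activity Vocabulary defines for a title.

```php
<?php

const AS_NAMESPACE = 'https://www.w3.org/ns/activitystreams';

// Hypothetical defaults that could ship out of the box.
$defaultMapping = [
  'title' => 'as:name',
  'body' => 'as:content',
];

// Hypothetical additions made by a site builder for their own fields.
$siteMapping = [
  'field_event_date' => 'as:startTime',
];

// Maps field values to vocabulary properties. Fields without a mapping
// are skipped, so nothing is exposed by accident.
function map_fields(array $fieldValues, array $mapping): array {
  $object = ['@context' => AS_NAMESPACE];
  foreach ($fieldValues as $field => $value) {
    if (!isset($mapping[$field])) {
      continue;
    }
    // Properties in the default namespace are emitted without the prefix.
    $property = preg_replace('/^as:/', '', $mapping[$field]);
    $object[$property] = $value;
  }
  return $object;
}

// With the array union operator, site-specific entries win over defaults.
$mapping = $siteMapping + $defaultMapping;

$object = map_fields(
  [
    'title' => 'Community meetup',
    'body' => '<p>Join us.</p>',
    'field_event_date' => '2020-05-01T10:00:00Z',
    'field_internal_notes' => 'not mapped, so not published',
  ],
  $mapping
);
```

Supporting a second namespace would mean adding its prefix to `@context` and keeping the prefix on those properties, which is the part that is probably best left to individual sites.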