w3c / activitystreams

Activity Streams 2.0
https://www.w3.org/TR/activitystreams-core/

Activity Streams should allow to state activities should not be tracked (robots.txt) #426

Closed akihikodaki closed 1 year ago

akihikodaki commented 7 years ago


Please Describe the Issue: Mastodon implemented a feature that sets a robots meta tag on the HTML representations of objects (https://github.com/tootsuite/mastodon/issues/1599). That controls the behavior of robots on the Web. However, Mastodon is also an ActivityPub application, and robots could exist in the federation as well. Those robots have no way to learn of that intention.

To solve this, Activity Streams should provide a way to state that activities should not be tracked by robots. My suggestion is to extend the Activity Vocabulary by adding a robots property to objects. Its value could be the same as, or similar to, the content of the HTML robots meta tag.
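
As a rough illustration of the suggestion, the sketch below builds a Note that carries a robots property. The property name and the noindex value follow the proposal above; the id and content are made up for the example, and robots is not a term defined by the Activity Vocabulary, so this only shows what such an object might look like.

```python
import json

# Hypothetical object: `robots` is the property proposed in this issue,
# not a term defined by the Activity Vocabulary.
note = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Note",
    "id": "https://example.com/notes/1",  # made-up identifier
    "content": "Hello, fediverse",
    # Same syntax as the content attribute of an HTML robots meta tag.
    "robots": "noindex",
}

print(json.dumps(note, indent=2))
```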

strugee commented 7 years ago

This has been discussed before in the ActivityPub issue tracker. I believe https://github.com/w3c/activitypub/issues/221#issuecomment-300205759 represents the consensus of the working group, although it could be just Evan. Either way I suspect his answer will be identical here.

akihikodaki commented 7 years ago

Sorry, I missed that issue. That is exactly the problem I want to address. However, I have some arguments for this idea over using audience, and because of them I thought Activity Streams, rather than ActivityPub, should be extended, so I opened this issue.

  1. audience cannot represent the partial restrictions of the robots meta tag and robots.txt.

The standard shows the following restrictions:

They are different restrictions, and the page administrator can express partial restrictions by choosing which directives to include in the meta tag or robots.txt. For example, noindex alone means robots can still follow links in the page. That is exactly what Mastodon does (see https://github.com/tootsuite/mastodon/pull/4199). In such cases, robots are still in the audience of the page.

  2. Compatibility with the robots meta tag

We get better compatibility by giving the robots property content similar to that of the robots meta tag. Compatibility matters because Activity Streams applications are often Web applications as well.

  3. robots is suited to standardization, while audience is more dependent on implementations.

Activity Streams does not define the content of audience, so its meaning depends more on implementations. The robots property, however, could be standardized, since robots.txt is already a de facto standard.
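
To make the first point concrete, here is a minimal sketch of how a federated robot might read such a value. The function names are hypothetical, and the sketch assumes the value uses the comma-separated syntax of the robots meta tag, where "none" is shorthand for "noindex, nofollow".

```python
def parse_robots(value: str) -> set[str]:
    """Split a robots-meta-style string into lowercase directives."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


def may_index(directives: set[str]) -> bool:
    # "none" is shorthand for "noindex, nofollow" in the robots meta tag.
    return not ({"noindex", "none"} & directives)


def may_follow(directives: set[str]) -> bool:
    return not ({"nofollow", "none"} & directives)


directives = parse_robots("noindex")
print(may_index(directives), may_follow(directives))  # False True
```

With noindex alone, indexing is refused but following links is still allowed, which is the kind of partial restriction a yes-or-no audience membership cannot express.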

gobengo commented 6 years ago

This is a cool idea, but I dunno if it should be a long-standing open issue here.

If I were you and still needed this, I'd write a short document explaining this (copy-paste?) and host it as https://mastodon.social/activitystreams-extensions/robots .

Anyone can then add 'robots' to their JSON objects by defining it in the @context.
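
A minimal sketch of what that could look like, assuming the document URL above doubles as the term's IRI (the URL is only a suggestion and may not resolve to anything). The @context lists the standard AS2 context first, then an inline JSON-LD term definition for robots.

```python
import json

# Assumed term IRI: the URL suggested above, which may not actually exist.
ROBOTS_TERM = "https://mastodon.social/activitystreams-extensions/robots"

note = {
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        {"robots": ROBOTS_TERM},  # extension term defined inline
    ],
    "type": "Note",
    "content": "Hello, fediverse",
    "robots": "noindex",
}

print(json.dumps(note, indent=2))
```

Consumers that do not know the extension can simply ignore the extra property, while those that do can look it up by its term IRI.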

evanp commented 1 year ago

This is an interesting idea. It's also an area of a lot of conversation in the fediverse. It's not currently part of AS2, so it would need to be an extension. That's something well-documented in the AS2 core document:

https://www.w3.org/TR/activitystreams-core/#extensibility

We do have a list of well-known extensions, so if this is widely used, we should probably include it.

For now, I'm going to close this issue, with the recommendation that a new extension vocabulary be added.