go-gitea / gitea

Git with a cup of tea! Painless self-hosted all-in-one software development service, including Git hosting, code review, team collaboration, package registry and CI/CD
https://gitea.com
MIT License

Design Discussion: ActivityPub support + ForgeFed vocabulary #14186

Open cjslep opened 3 years ago

cjslep commented 3 years ago

#1612 discusses federation in general, but I wanted to open an issue for the ActivityPub + ForgeFed solution specifically and concretize this unit of work. Let's keep to discussing ActivityPub+ForgeFed design specifics here, and have the question "whether it should be ActivityPub+ForgeFed at all" discussed in the other issue (#1612).

Background

I work on the go-fed suite of ActivityPub-related libraries, sites, and tools, so my knowledge is centered on the AP/ForgeFed angle. As I've only spent a light amount of time with the Gitea code, I'm not comfortable making the changes required, especially without serious design discussions. And I'm not here to shill go-fed as being the solution; it just provides a solution, and there are reasons to pick it entirely, partly, or not at all. I'll try to keep the evangelization to a minimum and self-contained, at the very end.

(Optional, Time-Sensitive) Grant Opportunity

There is a limited-time opportunity for this work to be submitted as a grant to the NLNet folks via the NGI Search & Discovery grant (5-50k euros) or DAPSI grants (50k+ euros?). There is never a guarantee a grant will be selected to receive money, but given the slate of other Fediverse projects that have gotten funding via the NGI S&D, I think this is a great exercise. There is the possibility that the EU will extend certain NGI funding periods for further cycles, but it is not guaranteed.

Concretely, this would need one or more Gitea community member volunteers interested in taking the lead on those applications. I personally am applying separately for other projects, so I don't have the time or energy to push this aspect forward, but I'm happy to provide guidance where possible.

Community Standards

I believe the ForgeFed work is still ongoing. An outcome could be that this effort allows whoever wants to help to pioneer additional ForgeFed behavior and be a voice there. Additionally, depending how Gitea embraces ActivityPub in general, it may also have opportunities to create Fediverse Enhancement Proposals, so no matter what the volunteers will definitely get open-source community leadership opportunities.

Design

Overview

ActivityPub is based on the concept of actors exchanging activities. These activities tell federated peers how to update their view of the "Federated Web", which is a linked-data graph composed of different RDF ontologies. That's a jargon-y way of saying "data types are flexible, and have pointers to other pieces of data". ForgeFed is just one ontology focusing on Forge behaviors and entities. Peers are not expected to know how to interpret every single ontology on this Federated Web, so Gitea can just focus on a narrow one -- ForgeFed -- in addition to the basic ActivityStreams vocabulary that acts as a common language.

This does not prevent Gitea from adopting different ontologies later, if the project decided to support viewing/interacting with the other kinds of activities going on in the wider web. It is just not a requirement right now, in the spirit of keeping scope limited.

The ForgeFed spec outlines the actors and the data being exchanged. A subset of that data is activities, which are shareable between actors. One actor can Create a Ticket (issue) and give it to a peer, who knows: "this data -- which happens to be an activity -- is a Create so I'll invoke my create behavior/function with the payload".

So if REST is...

POST to /issues/new with the body containing payload and a session_id containing authenticated credentials. This results in invoking the server's CreateIssue function with payload based on the user calling it.

...where REST scales by just creating more endpoints and using more HTTP verbs: POST and GET and PATCH to /repo, /merge-request, /repo

Then ActivityPub is...

POST to /actors/cj/inbox with an http_signature header and Activity payload. This results in invoking the server's WhatActivityIsThis function which determines the Activity is a Create, so it calls the CreateIssue function with the rest of the payload information based on the federated_peer_user calling it.

... where ActivityPub scales by just having new types of Activities (Create, Update, Offer, etc) and new data types that are acted upon (Person, Repository, Commit, Note, etc).
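To make the dispatch idea concrete, here is a minimal Go sketch of the "WhatActivityIsThis" step. It is only an illustration under stated assumptions: the handler names it returns (CreateIssue, FollowActor) and the trimmed-down activity struct are hypothetical, not existing Gitea or go-fed code, and HTTP signature verification is left out entirely.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// activity holds only the fields needed to route an incoming activity.
// Real ActivityStreams documents may carry "type" as an array or "object"
// as a bare IRI; this sketch rejects those cases instead of handling them.
type activity struct {
	Type   string          `json:"type"`
	Actor  string          `json:"actor"`
	Object json.RawMessage `json:"object"`
}

// objectHeader peeks at the type of the embedded object.
type objectHeader struct {
	Type string `json:"type"`
}

// dispatch plays the role of "WhatActivityIsThis": it inspects the activity
// type and returns the name of the (hypothetical) handler that should run.
func dispatch(body []byte) (string, error) {
	var a activity
	if err := json.Unmarshal(body, &a); err != nil {
		return "", fmt.Errorf("decode activity: %w", err)
	}
	switch a.Type {
	case "Create":
		var obj objectHeader
		if err := json.Unmarshal(a.Object, &obj); err != nil {
			return "", fmt.Errorf("decode object: %w", err)
		}
		if obj.Type == "Ticket" {
			return "CreateIssue", nil
		}
		return "", errors.New("unsupported object type: " + obj.Type)
	case "Follow":
		return "FollowActor", nil
	default:
		return "", errors.New("unsupported activity type: " + a.Type)
	}
}

func main() {
	body := []byte(`{"type":"Create","actor":"https://bar.example/users/cj","object":{"type":"Ticket","summary":"Crash on start"}}`)
	handler, err := dispatch(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(handler)
}
```

Supporting a new Activity or object type then means adding a case to the switch rather than adding a new endpoint.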

This means Gitea adopting ActivityPub will require a little bit of a different philosophical mindset than perhaps is common. Rectifying that, or isolating that, with the existing codebase is a core engineering challenge.

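Therefore, at a high level, Gitea would need to support the following concepts, each covered in its own section below: Actors, Sending Activities, Receiving Activities, Serving ActivityStreams, and Fetching ActivityStreams.

Only then can "First Federated Behavior" be reasoned about, as a penultimate section.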

Let's concretely dive into what is required to do them. The "why"s might not come together until the "Fetching ActivityStreams" section. These sections are the things I can think of off the top of my head, so they may be incomplete.

Note: This only goes into S2S federation and not C2S federation

Actors

ForgeFed only has suggested guidance for actors:

To support any of these, the following would need to be tackled:

Sending Activities

Sending activities is described in the ActivityPub spec. Unfortunately, it also explicitly relies on the C2S addressing, which must be kept in mind while implementing.

This unblocks Gitea's ability to do any federated flow in the future, but alone it is insufficient to do any specific federated flow.
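As a rough illustration of the delivery step, the Go sketch below builds the POST that would carry one serialized activity to a peer's inbox. The function name and the example URLs are assumptions made for this sketch. Addressing, retries, and the HTTP Signature that peers expect in practice are deliberately left out.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// activityJSON is the media type commonly used for ActivityPub payloads.
const activityJSON = "application/activity+json"

// newDeliveryRequest builds the POST that delivers one serialized activity
// to a peer's inbox. The request is not signed here; a real implementation
// would add an HTTP Signature header before sending it.
func newDeliveryRequest(inboxURL string, activity []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, inboxURL, bytes.NewReader(activity))
	if err != nil {
		return nil, fmt.Errorf("build delivery request: %w", err)
	}
	req.Header.Set("Content-Type", activityJSON)
	return req, nil
}

func main() {
	follow := []byte(`{"type":"Follow","actor":"https://foo.example/users/alice","object":"https://bar.example/users/cj"}`)
	req, err := newDeliveryRequest("https://bar.example/users/cj/inbox", follow)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The request is only printed, not sent, so the sketch runs offline.
	fmt.Println(req.Method, req.URL, req.Header.Get("Content-Type"))
}
```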

To keep it succinct:

Receiving Activities

Receiving activities is the second half:

Serving ActivityStreams

As a consequence of the aforementioned section of "having Actors do things", they will generate data that needs to have an ActivityStreams representation so that peers, upon looking at what an actor has been up to, can natively understand what is going on. For Gitea specifically, this gets into the ForgeFed examples.
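One way to picture serving an ActivityStreams representation is content negotiation on the existing URL: the same address returns HTML to browsers and JSON to federated peers. The Go sketch below assumes that approach; the handler, the hard-coded actor, and the foo.example URLs are hypothetical and stand in for data Gitea would load from its database.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// wantsActivityStreams reports whether an Accept header asks for an
// ActivityStreams representation rather than the regular HTML page.
func wantsActivityStreams(accept string) bool {
	for _, part := range strings.Split(accept, ",") {
		// Drop parameters such as ";q=0.9" or ";profile=...".
		mediaType := strings.SplitN(part, ";", 2)[0]
		switch strings.ToLower(strings.TrimSpace(mediaType)) {
		case "application/activity+json", "application/ld+json":
			return true
		}
	}
	return false
}

// actorHandler serves a hard-coded Person actor. A real handler would load
// the user from the database and fall back to the HTML view for browsers.
func actorHandler(w http.ResponseWriter, r *http.Request) {
	if !wantsActivityStreams(r.Header.Get("Accept")) {
		http.Error(w, "HTML view is not part of this sketch", http.StatusNotAcceptable)
		return
	}
	actor := map[string]string{
		"@context": "https://www.w3.org/ns/activitystreams",
		"id":       "https://foo.example/users/alice",
		"type":     "Person",
		"inbox":    "https://foo.example/users/alice/inbox",
		"outbox":   "https://foo.example/users/alice/outbox",
	}
	w.Header().Set("Content-Type", "application/activity+json")
	// The status line is already written, so an encoding error is ignored.
	_ = json.NewEncoder(w).Encode(actor)
}

func main() {
	// The handler could be registered with http.HandleFunc("/users/alice", actorHandler).
	fmt.Println(wantsActivityStreams("text/html, application/activity+json"))
	fmt.Println(wantsActivityStreams("text/html"))
}
```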

This unblocks the next bit...

Fetching ActivityStreams

If I ever use the term "dereferencing", this is what I mean.

A Gitea instance will be able to fetch a peer Gitea instance's ActivityStreams, thanks to the work outlined in the previous section. This allows you on foo.gitea.io to fetch my Person actor on bar.gitea.io but, say, render it on a webpage to yourself natively on foo.gitea.io. Dereferencing is needed for other operations, in particular the Delivery and Addressing portions of "Sending an Activity".

Fetching ActivityStreams data also allows foo.gitea.io to potentially display all the information of a Repository shown on bar.gitea.io, without having to actually navigate to bar.gitea.io. Any actions done by users would spawn new Activities, which in turn invokes the flow from the "Sending Activities" section. Again, that's up to the concrete design.
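A minimal sketch of dereferencing in Go, assuming the peer answers a GET with an ActivityStreams JSON document when asked via the Accept header. The remoteActor struct and both function names are hypothetical, and only a handful of actor properties are decoded. Many peers also require such GET requests to be signed, which is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// remoteActor keeps the few actor properties this sketch cares about.
// It assumes "type" is a single string, which is the common case.
type remoteActor struct {
	ID     string `json:"id"`
	Type   string `json:"type"`
	Name   string `json:"name"`
	Inbox  string `json:"inbox"`
	Outbox string `json:"outbox"`
}

// parseActor decodes a fetched document and checks the fields that later
// steps (delivery, addressing) cannot work without.
func parseActor(body []byte) (*remoteActor, error) {
	var a remoteActor
	if err := json.Unmarshal(body, &a); err != nil {
		return nil, fmt.Errorf("decode actor: %w", err)
	}
	if a.ID == "" || a.Inbox == "" {
		return nil, fmt.Errorf("actor document is missing id or inbox")
	}
	return &a, nil
}

// dereference fetches the ActivityStreams representation behind an IRI.
func dereference(client *http.Client, iri string) (*remoteActor, error) {
	req, err := http.NewRequest(http.MethodGet, iri, nil)
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Accept", "application/activity+json")
	resp, err := client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("fetch %s: %w", iri, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("fetch %s: unexpected status %s", iri, resp.Status)
	}
	// Cap the body size so a misbehaving peer cannot exhaust memory.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
	if err != nil {
		return nil, fmt.Errorf("read body: %w", err)
	}
	return parseActor(body)
}

func main() {
	// Parsing a canned document keeps the example runnable without a network;
	// dereference(http.DefaultClient, iri) would be the live equivalent.
	doc := []byte(`{"id":"https://bar.example/users/cj","type":"Person","inbox":"https://bar.example/users/cj/inbox"}`)
	actor, err := parseActor(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(actor.ID, actor.Inbox)
}
```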

First Federated Behavior

All of the above, and we haven't yet discussed the behaviors unlocked by ForgeFed. They list several:

I would propose just aiming for one initially. Even aiming for none of these, but doing the other sections above, is a large enough feat worth celebrating: Getting to the point where the followers flow works is a celebratory moment, if Gitea wants to have the concepts of "followers"/"following" (and I think it does?).

This first federated behavior would involve:

Whew, done. :)

Go-Fed

I promised to keep this at the end and self-contained. :) The go-fed/activity library focuses on being middleware. You implement several interfaces like Database which it will use. You then use the resulting library calls in an http.Handler to deal with "Actors sending Activities" or "serve this ActivityStreams representation".

Since it is middleware, it only solves some of the problems I listed above. Big picture, the main problems remaining unsolved by go-fed are the integration ones:

More specifically, going section-by-section and listing the bullets that are addressed by go-fed:

The go-fed/activity library does not solve:

Finally, some downsides of go-fed:

I hope this kicks off a productive discussion. Thanks for reading this far, if you made it. :)

cjslep commented 3 years ago

I also opened a discussion on Forgefed where Bill helpfully pointed out I did not reference #9045. That issue is locked to collaborators so I am not able to post there, but I think a lot of the technical discussion over there is relevant to this issue, so please check it out.

Wishing everyone a happy New Year and guten Rutsch ins neue Jahr!

pilou- commented 3 years ago

Posted on behalf of Loic, as explained in the post scriptum (original message on the Fedeproxy forum).


Bonjour,

TL;DR: what would it take to put federation on the Gitea 1.16.x roadmap?

The Gitea roadmap for 1.15 is now locked and the next release cycle will start right after it is released. However, the roadmap for 1.16 does not include the implementation of ActivityPub.

What would it take to make that happen?

It is of course a matter of who is interested in implementing it and has the time and skills to actually do it. This is an area where the fedeproxy project could help, by redirecting 5,000€ of its funds (total 75K€). This is not much, but it's not nothing and could be used to make the first baby step(s).

It is also a matter of defining the first baby steps that would lead Gitea in the direction of federation. At the moment there only is a very high level discussion that can hardly be translated into actionable items. In your expert opinion, what would be the first minimal tasks that would make most sense?

Cheers

P.S.: I'm very motivated to help move forward and, in case you're curious why I don't post myself (thank you Pierre-Louis for being my proxy :-) ), feel free to read why I deleted my GitHub account a few years ago.

techknowlogick commented 3 years ago

I should note that Loic had reached out to me (and likely a few other forge maintainers) about this in the past via email, and I apologize for not responding. To quote Jeff Goldblum, life uhh.. gets in the way (I know this isn't the actual quote)

the next release cycle will start right after it is released.

Technically 1.16.x work has already started, but that's due to the feature freeze with 1.15.x. Although the sooner a PR is created the more likely it is to get into the 1.16.x milestone (see note about creating the PR below).

What would it take to make that happen?

To be frank, it's a matter of finding a developer to work on this (be it a maintainer of Gitea, or someone from outside the project). As you do have funding, I suspect that might be easier, as in the past few months over 1K USD has been collected as bounties working on various PRs (I personally have put up some funds as bounties). So there are developers who would likely be willing (and capable) to take on this work. If you were to use these funds I would recommend giving them directly to the developer working on this PR, and the maintainers reviewing it, rather than to the project itself (although of course speaking on behalf of the project we'd be happy/thankful to accept funds, but donations to the project don't necessarily guarantee that specific tasks would be completed).

A note: as this is likely a large PR, I recommend the developer who works on it first discuss the approach of integrating this into gitea with the project (this is to prevent a large PR from being completed and perhaps some foundational work of the PR would be suggested to be done differently, so the developer would then potentially need to change a significant amount of it).

cc: @KN4CK3R and @adelowo in case you'd be interested in paid work on Gitea

aschrijver commented 3 years ago

FYI: on the fediverse this recent development was enthusiastically shared, and on Lemmy (a federated Reddit) a core dev said they'd move issue tracking from GitHub with this in place, while someone else said 'where can I donate?'.

This last bit is not really clear, plus there may be additional opportunities to get more funding to realize this functionality. And also ForgeFed may still have funds available.

pilou- commented 3 years ago

Posted on behalf of Loic.


@techknowlogick thanks for your reply and guidance :-) It's good to know the timing is right and I'm hopeful someone will be interested.

If you were to use these funds I would recommend giving them directly to the developer working on this PR, and the maintainers reviewing it, rather than to the project itself

I'll follow your advice. Since fedeproxy is horizontal (no organization) the funds originate from individuals (Pierre-Louis and myself, 50% each) and it will be possible to pay the person(s) doing the work directly.

as this is likely a large PR...

Maybe it can be broken down into smaller tasks / PRs? It would be easier to review and implement. And it will also be easier to prioritize which task should be worked on first and which ones can fit in the modest budget there is.

cjslep commented 3 years ago

There was a small meeting today between zeripath, Loic, and myself about building towards a future where a grant funds people to do the federation work (to be clear: not necessarily funding us three; it could fund one or two of us plus others in the community). The agenda was posted on the SocialHub. In the interest of transparency, the meeting was recorded, and the recording should shortly be available there as well.

Since the scope of work is fairly large and the goals rather ambitious, we definitely want to be as transparent as possible w/ the wider Gitea community and welcome folks to feel free to step up & participate if you, dear reader, would like.

lunny commented 3 years ago

If somebody wants to start the work for v1.16, a good starting point is to add a comment on #16429.

aschrijver commented 3 years ago

Watch this interesting talk about funding, grants and more technical background information (published with PeerTube: federated video streaming) between @dachary (of FedeProxy project), @cjslep (of GoFed project) and @zeripath (Gitea maintainer).

pilou- commented 3 years ago

Posted on behalf of Loic


Bonjour,

While working on the grant application today, I ran into a question that is probably worth discussing before moving forward. Go-fed is AGPLv3+, and Gitea is MIT. Adding Go-fed as a dependency of Gitea means that Gitea, as a whole (meaning Gitea+Go-fed+other dependencies), can only be released under the AGPLv3+. Or not released at all.

To be more precise, here is the minimal set of actions required to distribute a gitea executable that contains Go-fed:

This is not the only possible course of action, only the simplest.

What do people think about this?

zeripath commented 3 years ago

Damn that makes integrating go-fed not possible.

csolisr commented 3 years ago

Slight correction: The implementation of APCore is what is licensed AGPL, which is itself an extension of the standalone Go-Fed Activity libraries, which are licensed under the BSD 3-clause license instead. So what we would have to do would be to reimplement at least the relevant parts of APCore.

aschrijver commented 3 years ago

I think there's no real issue here, as apcore is meant as the opinionated batteries-included server framework you'd use when creating a stand-alone AP server and save you a bunch of work. Since this issue was created, a new AP server project GoToSocial was created with Go-Fed and they chose to build on top of go-fed/activity and go-fed/httpsig, not apcore. This is what Gitea would also need to do, as @csolisr states.

Note that both @cjslep and @tsmethurst (of GoToSocial) can be found in the Go-Fed Matrix chatroom at https://matrix.to/#/#go-fed:feneas.org

pilou- commented 3 years ago

Posted on behalf of Loic


Oh… my bad, thanks for the clarification!

cjslep commented 3 years ago

I want to confirm what @aschrijver said. go-fed/activity and other low-level libraries are intended to be as permissive as possible, as they are narrowly scoped in implementing ActivityStreams and ActivityPub. (I licensed it permissively because I believe protocol implementations should be without any "gotchas", whether for proprietary use or for free-as-in-liberty use or for ... etc. For ex: at one point write.as may have been using this library, there's no problem with that).

The go-fed/apcore package builds on top of that to make a heavily opinionated server framework that is intended for new projects that use ActivityStreams internally as fundamental database structures. On technical merit alone, I do not recommend it for Gitea in the slightest. Since it is more than just a protocol implementation and is meant to bootstrap user-facing applications, I licensed it in the spirit of GPL to give an advantage to applications that are free-as-in-liberty.

I think Gitea's infrastructure is sufficiently different from APCore that there's nothing really lost by not using it. Since they have different technical approaches, I think Gitea's integration with go-fed/activity will look very different than anything in APCore anyway, so there's not much being missed out on.

DanielMowitz commented 3 years ago

In the last few weeks I looked through the initial comment in this issue and tried to create a plan outlining what the implementation of ForgeFed could look like. The idea was to aid the discussion of the actual code that needs to be written, and not to have a 100% correct roadmap. I hope my proposals are at least somewhat sensible, as I do not have too much experience with either AP/ForgeFed or the Gitea codebase.

Mapping ForgeFed concepts to Gitea

By comparing the ForgeFed Actor guidance to Gitea's modules, I came to the conclusion that the following mapping should be reasonable:

| ActivityPub / ForgeFed actor | Gitea module |
| --- | --- |
| Person | User |
| Project | Project |
| Repository | Repo |
| Group/Organization/Team | Org, Team |

Preparing the data model

All modules mentioned in the table above should have the following fields added:

An instance-wide list of federated/blocked instances needs to be added to the database as well.
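As a small illustration of how such a list might be consulted, here is a hedged Go sketch of a block-list check keyed on the host of an actor IRI. The instancePolicy type and its methods are hypothetical; in Gitea the list would live in a database table rather than an in-memory map.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// instancePolicy is a hypothetical in-memory view of the instance-wide
// block list.
type instancePolicy struct {
	blocked map[string]bool
}

// newInstancePolicy builds a policy from a list of blocked host names.
func newInstancePolicy(blockedHosts ...string) *instancePolicy {
	p := &instancePolicy{blocked: make(map[string]bool)}
	for _, h := range blockedHosts {
		p.blocked[strings.ToLower(h)] = true
	}
	return p
}

// allows reports whether activities from the given actor IRI may be
// processed. IRIs that cannot be parsed, or carry no host, are rejected.
func (p *instancePolicy) allows(actorIRI string) bool {
	u, err := url.Parse(actorIRI)
	if err != nil || u.Hostname() == "" {
		return false
	}
	return !p.blocked[strings.ToLower(u.Hostname())]
}

func main() {
	policy := newInstancePolicy("spam.example")
	fmt.Println(policy.allows("https://bar.example/users/cj"))
	fmt.Println(policy.allows("https://spam.example/users/bot"))
}
```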

The "translation layer" mentioned by @cjslep should be rather straightforward, as the forgefed spec is minimal and gitea has most of the fields already. Some things like "Description" -> "summary" need to be kept in mind though.

Implementing ActivityPub behaviour

From this point onward there are two possible routes:

Implementing it all by hand

When not using a library to abstract the ActivityPub functions, this would be my proposed way of adding federated behaviour to Gitea:

The following is an example of a repo's JSON representation:

{
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        "https://w3id.org/security/v1",
        "https://forgefed.peers.community/ns"
    ],
    "id": "https://dev.example/aviva/treesim",
    "type": "Repository",
    "publicKey": {
        "id": "https://dev.example/aviva/treesim#main-key",
        "owner": "https://dev.example/aviva/treesim",
        "publicKeyPem": "-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhki....."
    },
    "inbox": "https://dev.example/aviva/treesim/inbox",
    "outbox": "https://dev.example/aviva/treesim/outbox",
    "followers": "https://dev.example/aviva/treesim/followers",
    "team": "https://dev.example/aviva/treesim/team",
    "name": "Tree Growth 3D Simulation",
    "summary": "<p>Tree growth 3D simulator for my nature exploration game</p>"
}
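A document like the one above could be produced by a small translation layer. The Go sketch below is one possible shape for it, under loud assumptions: the repo struct is a hypothetical stand-in for Gitea's repository model, the URL layout simply copies the example, and the publicKey and team properties are omitted to keep it short.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// repo is a hypothetical stand-in for Gitea's repository model.
type repo struct {
	OwnerName   string
	Name        string
	Description string
}

// repositoryActor mirrors a subset of the JSON example above.
type repositoryActor struct {
	Context   []string `json:"@context"`
	ID        string   `json:"id"`
	Type      string   `json:"type"`
	Inbox     string   `json:"inbox"`
	Outbox    string   `json:"outbox"`
	Followers string   `json:"followers"`
	Name      string   `json:"name"`
	Summary   string   `json:"summary"`
}

// toRepositoryActor translates the internal model into its federated form.
// Note the field rename: Gitea's Description becomes "summary".
func toRepositoryActor(baseURL string, r repo) repositoryActor {
	id := baseURL + "/" + r.OwnerName + "/" + r.Name
	return repositoryActor{
		Context: []string{
			"https://www.w3.org/ns/activitystreams",
			"https://w3id.org/security/v1",
			"https://forgefed.peers.community/ns",
		},
		ID:        id,
		Type:      "Repository",
		Inbox:     id + "/inbox",
		Outbox:    id + "/outbox",
		Followers: id + "/followers",
		Name:      r.Name,
		Summary:   r.Description,
	}
}

func main() {
	actor := toRepositoryActor("https://dev.example", repo{
		OwnerName:   "aviva",
		Name:        "treesim",
		Description: "Tree growth 3D simulator for my nature exploration game",
	})
	out, err := json.MarshalIndent(actor, "", "    ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```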

Using Go-Fed

The equivalent of writing an actor interface would be implementing the db interface from go-fed. The other two points would then be met by implementing the FederatingProtocol for all actor classes (http signing is possible using go-fed/httpsig).

Fetching ActivityStreams

In whatever way it is implemented, at this point Gitea would be able to send and receive Activities as they happen, but it would be nice to also see all the Actors and their activities that came before. To do this, Gitea would have to do the following when a user requests data about an Actor:

It might be a good idea to add a UI element to force a fetch of remote data, so users can get up-to-date information when needed.

DanielMowitz commented 3 years ago

As an aside, from what I read, it seems like SharedInbox would help in generating a timeline of all public activities. To me this seems like a desirable feature for a ForgeFed implementation. My knowledge of AP is not deep enough to have a really meaningful opinion on the topic though and I would love to find out more! @cjslep, this is not really the right place to discuss this, but I would be really interested in what your issue with SharedInbox is, maybe we can find some place to have that discussion.

aschrijver commented 3 years ago

Thanks for your elaboration. It is great to see the growing interest in federating Gitea!

@cjslep, this is not really the right place to discuss this, but I would be really interested in what your issue with SharedInbox is, maybe we can find some place to have that discussion.

The most appropriate location would be the SocialHub community forum, where there's some prior discussion already, like this topic.

pilou- commented 2 years ago

Posted on behalf of Loic


Loic wrote:

It is also a matter of defining the first baby steps that would lead Gitea in the direction of federation. At the moment there only is a very high level discussion that can hardly be translated into actionable items. In your expert opinion, what would be the first minimal tasks that would make most sense?

@zeripath @cjslep here is an idea for the first baby step: creating a keypair for every user that will be used to sign HTTP requests (see also the IETF draft). Although signing HTTP requests is not required by ActivityPub or ActivityStreams, it is mentioned in the ActivityPub wiki. Since Mastodon, BookWyrm and other ActivityPub implementations verify and expect a signature, it is de facto required.
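For a sense of scale, generating such a keypair needs only the Go standard library. The sketch below is an assumption-laden illustration: the function name is hypothetical, RSA with 2048 bits is one common choice rather than a requirement, and storing the private key and actually signing requests are out of scope.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"fmt"
)

// generateActorKey creates an RSA key pair for one actor and returns the
// private key together with the PEM-encoded public key, i.e. the kind of
// value that would be published as "publicKeyPem" in the actor document.
func generateActorKey(bits int) (*rsa.PrivateKey, string, error) {
	priv, err := rsa.GenerateKey(rand.Reader, bits)
	if err != nil {
		return nil, "", fmt.Errorf("generate key: %w", err)
	}
	der, err := x509.MarshalPKIXPublicKey(&priv.PublicKey)
	if err != nil {
		return nil, "", fmt.Errorf("marshal public key: %w", err)
	}
	block := &pem.Block{Type: "PUBLIC KEY", Bytes: der}
	return priv, string(pem.EncodeToMemory(block)), nil
}

func main() {
	_, publicPEM, err := generateActorKey(2048)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(publicPEM)
}
```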

If it sounds sensible to you, I think the creation of the corresponding issue as well as its implementation is eligible to be funded by the 5,000€ grant made available by the fedeproxy project a month ago:

thebiblelover7 commented 2 years ago

Any news on this? So excited to have this when it is done!

techknowlogick commented 2 years ago

@thebiblelover7 please see the pinned issue https://github.com/go-gitea/gitea/issues/16827

rosariop commented 10 months ago

Is this still active? I am interested in building on a fediverse API built in Golang, and I'd be willing to support implementing it in the first place too!

ghost commented 10 months ago

Is this still active? I am interested in building on a fediverse API built in Golang, and I'd be willing to support implementing it in the first place too!

I don't think any of the Gitea developers are actively working on federation, but there's been some progress in implementing it in Forgejo. I've personally been busy with other stuff for the past year or so, but I'm hoping to get back to working on federation sometime in the next month.

techknowlogick commented 10 months ago

@Ta180m I'm still sending PRs following my original plan, but as this is a large undertaking it's being done in small parts over time as large PRs make things difficult to review.