ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

Deprecate then Remove /api/v0/pubsub/* RPC API and `ipfs pubsub` Commands #9717

Open Jorropo opened 1 year ago

Jorropo commented 1 year ago

This is about /api/v0/pubsub/*, and not IPNS-over-Pubsub.

Kubo's PubSub RPC API (/api/v0/pubsub/*) provides access to a somewhat reliable multicast protocol.

However it has many issues:

Lack of a real contract on the reliability of messages

The messages reliability contract is more or less:

  1. It's mostly reliable: there is no retransmission, but it uses reliable transports and a flood-like action (AKA whatever go-libp2p-pubsub implements).
  2. Duplicate messages shouldn't be shared too much across the mesh and should die off after some time. (Attempt to avoid storms)

First point: go-libp2p-pubsub has a number of options and callbacks you can configure to change the behaviour of the mesh (see the "Production examples of correct usage of go-libp2p-pubsub" section below). These options are not currently configurable through Kubo, and it's not just a question of exposing them: the really impactful options are the ones locked behind callbacks, such as validators. There are two potential solutions. One is to have clients keep a stream (WebSocket, gRPC, ...) open with Kubo; when Kubo receives a callback from go-libp2p-pubsub, it forwards the arguments over the API stream and waits for the response. The other is to add something like WASM to run custom application code inside Kubo, so that you could configure WASM blobs which implement the validator you want. This is much harder than just throwing in a WASM interpreter and writing a few hundred SLOCs of glue code, because most validators you would want to write need to access and store some application-related state (for example, in a CRDT application, do not relay messages that advertise a HEAD that is lower than the currently known HEAD).
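To make the CRDT case concrete, here is a minimal sketch of the decision such a validator has to make. It is standard-library Go only: `HeadValidator`, `Decision`, `Accept` and `Ignore` are hypothetical names for this illustration, and the wiring into a real go-libp2p-pubsub topic validator is left out. The point is that the check depends on state (the highest HEAD seen so far) that only the application has.

```go
package main

import (
	"fmt"
	"sync"
)

// Decision is a hypothetical stand-in for the verdict a pubsub
// validator returns for each incoming message.
type Decision int

const (
	Accept Decision = iota // keep the message and relay it
	Ignore                 // drop the message, do not relay it
)

// HeadValidator holds application state: the highest HEAD seen so far.
type HeadValidator struct {
	mu   sync.Mutex
	head uint64
}

// Validate ignores messages advertising a HEAD lower than the known one,
// and otherwise accepts the message and remembers its HEAD.
func (v *HeadValidator) Validate(advertised uint64) Decision {
	v.mu.Lock()
	defer v.mu.Unlock()
	if advertised < v.head {
		return Ignore
	}
	v.head = advertised
	return Accept
}

func main() {
	v := &HeadValidator{}
	for _, h := range []uint64{5, 7, 6, 7} {
		fmt.Println(h, v.Validate(h) == Accept)
	}
}
```

A generic daemon cannot ship this logic, because what counts as a HEAD and how it is encoded in the message differs for every application.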

Second point: our current implementation of message deduplication uses a bounded cache to find duplicates. If the mesh gets wider than the cache size, you can reach an exponential, broadcast-storm-like event: https://github.com/ipfs/kubo/issues/9665. Sadly, this links back to point one: even though the fix is supposed to be transparent and implements visibly similar message deduplication logic, except without a bounded size, it makes our interop tests very flaky and thus might break various things in the ecosystem.
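The failure mode of a bounded deduplication cache can be shown with a toy model. This is a sketch only, not Kubo's or go-libp2p-pubsub's actual cache: `SeenCache` is a hypothetical FIFO-bounded set of message IDs. Once an ID has been evicted, a late duplicate looks like a brand new message and would be relayed again.

```go
package main

import "fmt"

// SeenCache is a hypothetical FIFO-bounded set of message IDs.
// A limit of 0 means "unbounded".
type SeenCache struct {
	limit int
	order []string
	seen  map[string]struct{}
}

func NewSeenCache(limit int) *SeenCache {
	return &SeenCache{limit: limit, seen: make(map[string]struct{})}
}

// FirstTime records id and reports whether it was absent from the cache.
// When the cache is full, the oldest ID is evicted to make room.
func (c *SeenCache) FirstTime(id string) bool {
	if _, ok := c.seen[id]; ok {
		return false
	}
	if c.limit > 0 && len(c.order) >= c.limit {
		oldest := c.order[0]
		c.order = c.order[1:]
		delete(c.seen, oldest)
	}
	c.order = append(c.order, id)
	c.seen[id] = struct{}{}
	return true
}

func main() {
	c := NewSeenCache(2)
	fmt.Println(c.FirstTime("a")) // new message
	fmt.Println(c.FirstTime("a")) // duplicate, caught
	c.FirstTime("b")
	c.FirstTime("c")              // cache is full, "a" is evicted
	fmt.Println(c.FirstTime("a")) // duplicate, but treated as new again
}
```

With more in-flight messages than cache slots, every node keeps re-accepting and re-forwarding old messages, which is how a storm can build up.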

Confusing Architecture

I have had more discussions than I have fingers with various people who complain that Kubo pubsub does not work because they never receive messages. Almost always the issue is that they are running IPFS HTTP clients in the browser, open two browser tabs, and then try to receive messages from the other tab. This does not work because Kubo does not think of the two clients as two clients: from Kubo's point of view, the HTTP API is remote-controlling the Kubo node. Thus the fact that the tabs are different browser instances is not taken into account. As far as Kubo can see, the messages are sent by the same node (itself), and it does not return you your own messages because receiving messages you sent yourself is confusing.

This is a perfectly valid use case, just not what the API was designed to do (one way to implement this is to use js-libp2p in the browser; your browser node would then use floodsub to a local Kubo node, with messages going through the libp2p swarm instead of the HTTP API).

Future of the API

Currently the pubsub API is not in a good place, and it correctly advertises this:

EXPERIMENTAL FEATURE

    It is not intended in its current state to be used in a production
    environment.  To use, the daemon must be run with
    '--enable-pubsub-experiment'.

Our current team goal is to move away from the maintenance cost of two ABIs (HTTP and Go) for people who want to build applications on top of IPFS, by providing a consistent Go ABI story (go-libipfs) and a comprehensive collection of examples on how to use it (go-libipfs/examples). Fixing the PubSub API requires lots of work which does not align with these goals, and thus does not justify allocating much time to it when engineering time is at a premium.

go-libp2p-pubsub's Go API is already competent and capable of satisfying the needs of consumers, as proven by the "Production examples of correct usage of go-libp2p-pubsub" section below. go-libp2p-pubsub will continue to be part of libp2p, given that it really has very little to do with IPFS and can be used by any libp2p project (ETH2 for example).

Ways for creating a soft landing

To ease the pain of people currently using Kubo's PubSub HTTP API, we could:

  1. Create a new daemon binary that provides the same endpoints. That said, we wouldn't have plans to maintain it, and it would most likely have the same issues as the current API. Someone else would need to pick it up and maintain it.
  2. As part of @ipfs/kubo-maintainer's example-driven development strategy, create the missing examples on how to bootstrap a project using go-libp2p-pubsub, if that is useful. (TBD where that example will live, but it could be something like full-example in libp2p/go-libp2p-pubsub that is validated as part of CI.)

Production examples of correct usage of go-libp2p-pubsub

For good examples of how to use go-libp2p-pubsub effectively, see things like:

## Tasks
- [x] @Jorropo Close https://github.com/ipfs/kubo/issues/6621 (pointing to the issue above)
- [X] @Jorropo Create a topic in discuss.ipfs.tech (https://discuss.ipfs.tech/t/help-kubo-maintainers-about-usecases-for-the-http-pubsub-api/16097). Include that we are planning to remove pubsub (link to this issue) and "Please comment below to share your usecase and why you have used this in Kubo."
- [x] @Jorropo For Kubo 0.19, PR for moving pubsub from experimental to deprecated (referencing this issue). This should have a changelog update. Hide/remove all the docs about this as well since we don't want anyone else putting weight on this.  https://github.com/ipfs/kubo/pull/9718
- [ ] post 0.20 / IPFS Thing, create the migration path / soft landing so we can fully remove pubsub from Kubo.
- [ ] Remove pubsub from Kubo

BigLep commented 1 year ago

Thanks for creating this issue @Jorropo . I made some adjustments to the issue description including adding a formal task list. (Feel free to look at the changes in the issue history.)

2color commented 1 year ago

Thanks for the write up @Jorropo.

This makes total sense for all the reasons you mentioned.

Broadly speaking, we just need to do some better advocacy and education about PubSub in libp2p and establish some best practices from the known real world usecases you listed.

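As far as I understand, this would deprecate the following endpoints:

* https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-pubsub-ls

* https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-pubsub-peers

* https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-pubsub-pub

* https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-pubsub-sub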

Tagging @TheDiscordian as this would likely break Discochat, which relies on the Kubo-RPC client and a Kubo daemon to subscribe to topics.

It looks like it from a search of the code

Either way, we're already planning a new example to showcase universal connectivity with libp2p https://github.com/libp2p/universal-connectivity/issues/1 which showcases an app architecture where every user is a full libp2p Peer.

BigLep commented 1 year ago

Reopening since this isn't complete (only https://github.com/ipfs/kubo/pull/9718 is).

pinnaculum commented 1 year ago

As far as I understand, this would deprecate the following endpoints:

* https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-pubsub-ls

* https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-pubsub-peers

* https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-pubsub-pub

* https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-pubsub-sub

Tagging @TheDiscordian as this would likely break Discochat, which relies on the Kubo-RPC client and a Kubo daemon to subscribe to topics.

So kubo v0.19 will be the last kubo release to have the pubsub RPC API baked in, it seems. My project can't work without pubsub so i'll just freeze the kubo version for now. There's a post here that suggests it'll still be possible somehow to use pubsub via a separate repo, any info on that ?

Jorropo commented 1 year ago

@pinnaculum this is point number 1 in Ways for creating a soft landing in the issue above.

Ideally what we want is that you write ~100 lines of Go and use go-libp2p-pubsub because this gives you access to the lower level callbacks which lets you configure your pubsub mesh correctly (with message validation and not relaying outdated messages).

pinnaculum commented 1 year ago

Ideally what we want is that you write ~100 lines of Go and use go-libp2p-pubsub because this gives you access to the lower level callbacks which lets you configure your pubsub mesh correctly (with message validation and not relaying outdated messages).

Merci @Jorropo :) Yes, that works if you're writing in Go. I've read the "chat" example in go-libp2p-pubsub, but since your program runs outside of kubo and has to be written in Go, this makes things more difficult for a project that wants to use pubsub and the rest of kubo's APIs. IPFS's Pubsub on its own is truly great (hooray gossipsub), but that's when you can combine it with the rest of the IPFS APIs that it can truly shine IMO, because there are many other pubsub implementations out there i think.

The beauty of having an integrated pubsub RPC API is that you can write in any language that has bindings for kubo's RPC, without knowing (or caring, tbh) about the internals of all this, and take advantage of the rest of the IPFS API.

You guys are the experts and i understand that it takes a lot of time to maintain and fix all of this code. I've used the pubsub RPC API since the go-ipfs 0.4.x days until the recent kubo releases and it has steadily improved, so thank you and congratulations to the geniuses involved in this.

Jorropo commented 1 year ago

@pinnaculum I'm not saying that you would rewrite your complete app in Go. I'm saying that you would write a small wrapper that exposes the pubsub features you need over HTTP. This is different from Kubo doing it because you have application knowledge: you can write something that correctly fits your use case (for example, if you are doing a CRDT, you can have a validator that compares message heights and removes messages that are not current anymore).
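As a rough idea of what such a wrapper could look like, here is a sketch using only the Go standard library. Everything named here is an assumption made for illustration: the `Publisher` interface, the `/pub?topic=...` route, and the in-memory `memPublisher` stand-in. In a real wrapper, `Publisher` would be backed by go-libp2p-pubsub and the application-specific checks would live in `publish`.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// Publisher is the only thing the wrapper needs from the pubsub layer.
// Hypothetical interface; a real one would wrap go-libp2p-pubsub.
type Publisher interface {
	Publish(topic string, data []byte) error
}

// memPublisher is an in-memory stand-in that just records messages.
type memPublisher struct{ sent map[string][][]byte }

func (m *memPublisher) Publish(topic string, data []byte) error {
	m.sent[topic] = append(m.sent[topic], data)
	return nil
}

// publish is where application knowledge goes (size limits, schema
// checks, ...). It returns the HTTP status code to send back.
func publish(p Publisher, topic string, body []byte) int {
	if topic == "" || len(body) == 0 {
		return http.StatusBadRequest
	}
	if err := p.Publish(topic, body); err != nil {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}

// newHandler exposes POST /pub?topic=<name> with the message as the body.
func newHandler(p Publisher) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/pub", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20))
		if err != nil {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		w.WriteHeader(publish(p, r.URL.Query().Get("topic"), body))
	})
	return mux
}

func main() {
	p := &memPublisher{sent: map[string][][]byte{}}
	fmt.Println(publish(p, "chat", []byte("hello")))
	// To actually serve: http.ListenAndServe("127.0.0.1:8080", newHandler(p))
}
```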

The beauty of having an integrated pubsub RPC API is that you can write in any language that has bindings for kubo's RPC, without knowing (or caring, tbh) about the internals of all this, and take advantage of the rest of the IPFS API.

There will be a binary that provides the same features if you are happy with them, but I would be surprised if only very few people see issues like #9665, which are not trivial to solve. Also, the current message seen-cache code seems to use more and more CPU for each message as the cache grows.

fcbrandon commented 1 year ago

My project relies on both IPFS and pubsub. By pulling it out of kubo, and having projects rely on libp2p instead, wouldn't that leave us to effectively load libp2p twice, once in kubo, then a separate instance of libp2p to run pubsub?

It seems that folks using pubsub via kubo are perfectly happy with the level of accessibility that's currently offered, and would prefer to continue to use it. The alternative solutions provided above make projects much more cumbersome to manage: requiring support for multiple languages (the wrapper) isn't ideal, and having to manage an additional process (go-libp2p-pubsub) adds further complexity in deployment, maintenance, and usage.

I'd propose leaving the API in its current state, and if someone is unhappy with how it's working, leave it to them to implement their changes, and just remove it from the primary devs' roadmap.

The feature is only a performance issue for nodes that explicitly enable it, correct?

RangerMauve commented 1 year ago

I rely on the Kubo RPC API for accessing pubsub from an Electron app. Removing this would mean I could no longer use the go-ipfs NPM module or Kubo in general and would need to totally rework how I integrate IPFS into my applications.

I suppose this could be a reason to ditch Kubo entirely and embed just a subset of it into a custom HTTP API? It would certainly make it harder to deploy and reuse things like IPFS-Cluster with the node.

BigLep commented 1 year ago

I just wanted to say thanks to folks for sharing their use cases and needs. For transparency, Kubo maintainers haven't done any work on this yet. In case it wasn't clear, the migration path/plan will be designed and communicated before we undertake this work. Updates will be posted here. In the meantime, feel free to continue to share.

pinnaculum commented 1 year ago

I'm saying that you would write a small wrapper that exposes the pubsub features you need over HTTP. This is different from Kubo doing it because you have application knowledge: you can write something that correctly fits your use case

I've thought about that (exposing some kind of RPC in the wrapper) but did not mention it in my message. The reason why this approach would be problematic for my project, and @fcbrandon talks about this as well, is that as i understand it you would have two parallel libp2p instances/nodes: kubo's libp2p instance (that would not have pubsub "capabilities") and the libp2p instance running in the "wrapper" that uses go-libp2p-pubsub and therefore can exchange pubsub messages. Am i wrong about that ? I use IPFS peer IDs as a "unique key" in a PeerID <=> DID mapping. Having two nodes and therefore distinct peer IDs makes it much more difficult, but not impossible, and performance-wise it's not ideal either.

Just exposing my thoughts about this approach, but maybe for other people's usecases it wouldn't be a problem at all. I wonder if a compromise could not be found, by having kubo keep using the "default" pubsub validator (the "BasicSeqnoValidator" ?, which is probably inefficient), while letting people who want better control with the option of using go-libp2p-pubsub's API and set up their custom validators, etc ...

I've read validation_builtin.go. If better validators are implemented in the future, then there could be a /api/v0/pubsub/set_topic_validator kubo RPC API call that would just pass the name of a builtin pubsub validator (and some optional settings maybe), to set the validator to use for a given topic. Then, no need for messy callbacks, and kubo's pubsub implementation would strengthen over time with different kinds of validators ? I think the implementation of validators should stay in go-libp2p-pubsub. Once you take this problem out of the equation, the case for deprecating the pubsub API in kubo is not as strong because most people using pubsub with kubo are probably fine with letting kubo use the best validator available.
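A sketch of what "pass the name of a builtin validator" could mean in practice. Nothing here exists in Kubo: the validator names, the `Validator` signature and the `TopicValidators` type are all hypothetical, and the seqno check is deliberately simplified to a single counter per topic rather than one per sender. The code is not safe for concurrent use.

```go
package main

import "fmt"

// Validator is a hypothetical, simplified validator signature:
// it sees the message sequence number and payload and says keep or drop.
type Validator func(seqno uint64, data []byte) bool

// builtins maps a validator name to a factory, so that every topic
// gets its own validator instance with its own state.
var builtins = map[string]func() Validator{
	"accept-all": func() Validator {
		return func(uint64, []byte) bool { return true }
	},
	"increasing-seqno": func() Validator {
		var last uint64
		return func(seqno uint64, _ []byte) bool {
			if seqno <= last {
				return false // old or replayed message
			}
			last = seqno
			return true
		}
	},
}

// TopicValidators remembers which validator was selected for each topic.
type TopicValidators struct{ byTopic map[string]Validator }

// Set is what a set_topic_validator style RPC call would end up invoking.
func (t *TopicValidators) Set(topic, name string) error {
	factory, ok := builtins[name]
	if !ok {
		return fmt.Errorf("unknown validator %q", name)
	}
	t.byTopic[topic] = factory()
	return nil
}

// Check runs the selected validator; topics without one accept everything.
func (t *TopicValidators) Check(topic string, seqno uint64, data []byte) bool {
	v, ok := t.byTopic[topic]
	if !ok {
		return true
	}
	return v(seqno, data)
}

func main() {
	tv := &TopicValidators{byTopic: map[string]Validator{}}
	fmt.Println(tv.Set("chat", "increasing-seqno"))
	fmt.Println(tv.Check("chat", 1, nil), tv.Check("chat", 1, nil))
	fmt.Println(tv.Set("chat", "no-such-validator"))
}
```

This only covers validators that need no application state; a check like "is this CRDT HEAD still current" cannot be selected by name.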

estebanabaroa commented 1 year ago

I just wanted to say thanks for folks sharing about their usecases and needs.

One of our use cases would be to know the IP address of the direct peer who sent a message (not the origin), to be able to block the IPs of peers that relay bad messages. Being able to block IPs in pubsub only would also be nice. I know you can do it with swarm filter, but that blocks the IP everywhere.
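The bookkeeping for this can be sketched independently of any library. `RelayGuard` is a hypothetical helper that counts bad messages per direct relayer and reports when a relayer should be blocked; the key is a plain string, so it could be a peer ID or an IP address, depending on what the pubsub layer exposes. Hooking it into message validation, and resolving a relayer to an IP, are assumptions left out of the sketch.

```go
package main

import "fmt"

// RelayGuard counts bad messages per direct relayer and blocks a
// relayer once it reaches maxStrikes. Not safe for concurrent use.
type RelayGuard struct {
	maxStrikes int
	strikes    map[string]int
}

func NewRelayGuard(maxStrikes int) *RelayGuard {
	return &RelayGuard{maxStrikes: maxStrikes, strikes: make(map[string]int)}
}

// ReportBad records that relayer forwarded a message the application
// judged to be bad.
func (g *RelayGuard) ReportBad(relayer string) {
	g.strikes[relayer]++
}

// Blocked reports whether messages coming from relayer should be dropped.
func (g *RelayGuard) Blocked(relayer string) bool {
	return g.strikes[relayer] >= g.maxStrikes
}

func main() {
	g := NewRelayGuard(2)
	g.ReportBad("peerA")
	fmt.Println(g.Blocked("peerA")) // one strike, still allowed
	g.ReportBad("peerA")
	fmt.Println(g.Blocked("peerA")) // two strikes, blocked
	fmt.Println(g.Blocked("peerB")) // never reported
}
```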

IMO having a basic pubsub API included with kubo is good for testing, prototyping and discovering that it even exists, even if it can't be customized. I'm not sure I would know it exists if I didn't randomly see it as an option in IPFS one day.

In our app we bundle kubo with electron, and use both ipfs and pubsub, so having 2 binaries, one for ipfs and one for pubsub would mean a larger bundle. Could it also mean more resource/bandwidth usage if the user has to run both at the same time?

Also no one in our team knows go, so the ideal scenario would be for pubsub to remain in kubo, but with more configuration options, for example in our case blocking IPs of direct peers. The second best scenario would be a separate pubsub binary and RPC with more configuration options. The least ideal (but not dealbreaking) would be to use go-libp2p-pubsub.

lidel commented 11 months ago

Thank you all for feedback and reaching out.

I had some related conversations about this during IPFS Thing, and people would appreciate having basic pubsub built in a bit longer, with whatever opinionated defaults we want.

A cold cut that removes the pubsub RPC commands will cause more pain than libp2p-relay-daemon did, because we are talking about an end-user API. That means pain for both users and maintainers ("there will be a binary that provides the same features if you are happy with them" → someone has to do it, and maintenance costs time; maybe we can spend it elsewhere?).

Given that:

  1. js-ipfs and js-ipfs-http-client are deprecated, and specific to JS-IPFS, and not Kubo
  2. we point people at Helia and kubo-rpc-client, and these no longer share the API nor have interop tests for RPC commands

maybe we could consider a more gentle deprecation path, or at least do a stop-gap:

This way we could keep /api/v0/pubsub as deprecated, discouraging its use, but would not have to remove it nor invest time into creating yet another daemon as a drop-in replacement.

This could be a safer path. Removing interop with JS-IPFS but keeping the commands in Kubo for now buys us some time/options:

pinnaculum commented 11 months ago

This way we could keep /api/v0/pubsub as deprecated, discouraging its use, but would not have to remove it nor invest time into creating yet another daemon as a drop-in replacement.

Excellent reasoning. Will the merge of #9684 break pubsub communications with previous kubo versions (say .. 0.18.x), or does it only affect the validator ?

BigLep commented 11 months ago

2023-05-18 conversation:

  1. Generally aligned to go with the less disruptive approach: https://github.com/ipfs/kubo/issues/9717#issuecomment-1543827199
  2. We need to check the validator that "ipns over pubsub" uses. We need to know what we're in for.
  3. Changelog entry to make it clear that we're breaking interoperability with js-ipfs (i.e., no longer testing for it or guaranteeing it), but js-ipfs is deprecated so that is fine.

BigLep commented 11 months ago

@pinnaculum :

Excellent reasoning. Will the merge of https://github.com/ipfs/kubo/pull/9684 break pubsub communications with previous kubo versions (say .. 0.18.x), or does it only affect the validator ?

Good question. We don't believe it breaks compatibility between Kubo versions.

zacharywhitley commented 11 months ago

Will this affect IPNS publishing over pubsub (the --enable-namesys-pubsub option), and if so, how?

emendir commented 11 months ago

Here's another developer who'll mourn the loss of PubSub. It's been an amazing feature that enabled so much. Now I'll have to rebuild similar functionality for my projects that rely on it, as switching away from IPFS to go-libp2p isn't an option. I'll post here when I've got something usable, but as it's built on top of ipfs it won't be integratable into every project. 😔

Winterhuman commented 11 months ago

@zacharywhitley It won't thankfully, see https://github.com/ipfs/kubo/issues/9795.

emendir commented 10 months ago

Let’s discuss some issues of concern when deprecating pubsub (and other libp2p features in the future?): Interpretable as:

So deprecating IPFS' access to libp2p's pubsub feature focuses IPFS' implementation spectrum more on the filesystem, and reduces the number of the rest of libp2p’s capabilities that developers using IPFS have access to. This means that applications which have used IPFS’ pubsub interface have to move away from using IPFS and instead implement libp2p themselves. This in turn means that many users of those applications will end up running multiple instances of libp2p: in the applications that use pubsub and in IPFS (which they may want to use for other purposes, I mean, IPFS is so cool!).

Many (I hope most!) of us developers here have a vision of building a P2P internet, and we’re using libp2p to realise that. More and more applications will be built on top of libp2p’s capabilities. Do we want them each to run their own libp2p instance, or would it be more resource-efficient for them to access the libp2p instance inside of IPFS so that each computer only needs to run one libp2p instance? Another issue of concern is ease of development: using IPFS’ pubsub feature in an application was fairly easy, and could be done in a variety of programming languages from shell to python. Implementing libp2p in a project is a whole lot more difficult, and so reduces the ability of developers to build P2P applications.

Conclusion:

Before deprecating pubsub and other libp2p features in IPFS, we need to answer the following questions:

BigLep commented 10 months ago

Thanks for the continued feedback here. Just an update from the maintainers that changes here aren't going to make it in for Kubo 0.21. We'll be targeting it for Kubo 0.22. We'll engage more during that development iteration.

sevenrats commented 10 months ago

PubSub is a critical component of Kubo and removing it in any release would be a massive loss. PubSub should remain a core component of ipfs reference implementations.

Jorropo commented 10 months ago

@emendir in your message, however, you confuse IPFS and Kubo. AFAICT you assume that all applications that build on IPFS need to run a Kubo daemon and interact with IPFS through Kubo's HTTP API, hence the two libp2p instances. This is exactly what we do not want to happen anymore. :sparkles: With the push for boxo we are moving to a library-oriented story for people who want to build on our stack, rather than Kubo's HTTP API. So you wouldn't run Kubo next to your app anymore; you would use boxo (and/or other libs) inside your own application process. You could then use a single libp2p instance shared by boxo, go-libp2p-pubsub, ... or even do IPFS without libp2p.

Here is a talk about boxo if you are into that: https://www.youtube.com/watch?v=uFr4EtySorY. Otherwise you can browse the repo: https://github.com/ipfs/boxo. Right now boxo is still pretty much made of the same libraries that powered Kubo before, except in one repo (with a few notable exceptions, for example the new reusable gateway API handler). We are working on improving this, and it is one of the main reasons I don't have time to fix pubsub.

If you have feedback on how you could have learnt that Kubo != IPFS, that would be nice. We tried putting it everywhere, but I guess there is some path we missed because you slipped through the cracks. :slightly_smiling_face:

The main reason I want to remove pubsub from Kubo is that right now it is broken, there is a pull request that has been opened that makes it slightly less broken (but still very broken) however it also breaks more things in the process.

The underlying issue is that implementing a good pubsub mesh (like the Filecoin ones, for example) requires very application-specific knowledge (you need to use your own application state to implement message filtering, discarding of outdated messages, and rate limiting), which we can't do in Kubo because this is unique to each application. go-libp2p-pubsub's API is expressed in terms of callbacks: every time the pubsub daemon receives a new message, it needs to invoke some piece of code the consumer wrote that tells it whether the message is relevant, whether it should be discarded, and whether the node should be rate limited. Implementing callbacks over HTTP sucks; in Go, however, this is very easy: you just pass a function pointer.
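A small model of why a callback that crosses a process boundary is awkward. This is a sketch under assumptions, not Kubo code: the `request` type and `askRemote` function are hypothetical, and a Go channel stands in for the HTTP or stream connection to the application. With an in-process function, the verdict is one call away; with a remote application, the daemon has to wait for an answer that may never come, so it needs a timeout and a policy for what to do when it fires.

```go
package main

import (
	"fmt"
	"time"
)

// request is one message handed to the out-of-process application,
// together with a channel on which the application sends its verdict.
type request struct {
	msg   []byte
	reply chan bool
}

// askRemote forwards msg to the application and waits for its verdict.
// answered is false if the application did not respond within wait.
func askRemote(app chan<- request, msg []byte, wait time.Duration) (accepted, answered bool) {
	timeout := time.After(wait)
	req := request{msg: msg, reply: make(chan bool, 1)}
	select {
	case app <- req: // the application picked the request up
	case <-timeout:
		return false, false
	}
	select {
	case ok := <-req.reply:
		return ok, true
	case <-timeout:
		return false, false
	}
}

func main() {
	// An application that is listening and accepts non-empty messages.
	app := make(chan request)
	go func() {
		for req := range app {
			req.reply <- len(req.msg) > 0
		}
	}()
	fmt.Println(askRemote(app, []byte("hi"), time.Second))

	// Nobody is listening: without the timeout this call would never return.
	idle := make(chan request)
	fmt.Println(askRemote(idle, []byte("hi"), 50*time.Millisecond))
}
```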

Thus from my selfish view (and I would like to hear the other view):

I could spend my time fixing bugs that harm more critical and more widely used features (filesharing), making things faster, writing new libraries that are easy to use, ... Why should that time instead be spent on writing an unperformant HTTP callback API that will be hard to use, confusing, and easy to get wrong, when a type-safe, performant, easy solution already exists (calling go-libp2p-pubsub directly)? Having an everything-over-HTTP API was a trap: it takes way too much manpower to make it close to performant, it is not customizable enough, and it makes it very hard for newcomers to get into the weeds, because the language they learn (the HTTP API) is different from the language used at the lower level in the libraries. This leads to a perverse view where everyone who wants to build or add IPFS features has to get them merged into Kubo instead of having a diverse ecosystem of implementations.

I see 4 options:

  1. Let consumers (you) write a bit of Go. You would import go-libp2p-pubsub directly; we could also provide libraries (in boxo) and examples on how to get started. Unlike what was suggested before, you wouldn't need to re-implement all of Kubo's features like peer management, ... because this is / will be provided by libraries. We want a story where, in a few tens of lines of code, you can pick and choose all the modules you need and have a working starting point. Kubo's codebase is way bigger than yours would be because it has to handle a huge interdimensional matrix of various features and configs.
  2. We would provide the same API as go-libp2p-pubsub but over HTTP. That means callbacks, and you can write the same application decision code but from outside Kubo. This is not an easy task: it would require lots of work and time, and the API will always suck. It will be confusing (for example, if you ran pubsub without your app listening for callbacks, the pubsub service inside Kubo would stall).
  3. Leave it as is. The pubsub meshes created by Kubo are and will be unstable. We would change pubsub's API description from "experimental" to something along the lines of "the pubsub API does not work and is limited; for a performant, reliable alternative, please use go-libp2p-pubsub", and we would rename --enable-pubsub-experiment to something along the lines of --enable-pubsub-broken. We would maybe also add arbitrary restrictions in order to limit its usage to PoCs, demos, and learning (again forcing power users onto solution 1), for example no more than X messages per second.
  4. Buy into the treadmill problem (for example PR #9684). That means adding more and more complex checks as newer bug reports come in. I don't think this is an acceptable solution because we already don't have enough free time for this feature; we really can't commit to running a treadmill race against the bugs and bug reports, as treadmill problems drain all available time and more. Also, this would yield a more and more complex pubsub layer that could have higher and higher costs and reduced utility. For example, a nuclear option would be to make Kubo's pubsub channels a Proof-of-Work blockchain and require paying gas to send messages; it sounds like we could make it work (given a lot of time), but now you need to sync TiBs of data and run a GPU mining rig to send pubsub messages. I don't think anyone wants this.
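For the "no more than X messages per second" restriction mentioned under option 3, a fixed-window counter is about the simplest thing that could work. This is only an illustration of the idea: `RateCap` is a hypothetical type, nothing like it is claimed to exist in Kubo, and the clock is passed in as a Unix second so the behaviour is easy to follow. It is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// RateCap allows at most perSecond messages within each one-second window.
type RateCap struct {
	perSecond int
	window    int64 // Unix second of the current window
	count     int   // messages already allowed in that window
}

// Allow reports whether a message arriving at Unix second now still
// fits under the cap, and counts it if so.
func (r *RateCap) Allow(now int64) bool {
	if now != r.window {
		r.window = now
		r.count = 0
	}
	if r.count >= r.perSecond {
		return false
	}
	r.count++
	return true
}

func main() {
	limit := &RateCap{perSecond: 2}
	now := time.Now().Unix()
	fmt.Println(limit.Allow(now), limit.Allow(now), limit.Allow(now))
	fmt.Println(limit.Allow(now + 1)) // a new window starts
}
```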

Note: solutions 1 and 3 are reasonable enough in time investment. However, even if there is an overwhelming consensus around solution 2, I'll still lack the time to implement and maintain it; I guess this could maybe happen if someone steps up to help? (idk) Solution 4 is a half-serious joke; I won't even consider doing this.

Now I have two questions for everyone who gave us feedback (:heart: btw):


My own votes are on 1 and 3 (leave the pubsub API in Kubo but be extremely clear that it is broken and push people to use go-libp2p-pubsub), or 1 alone (that means remove the pubsub API completely from Kubo and push people to go-libp2p-pubsub).

sevenrats commented 10 months ago

Three sounds like a perfect solution. Everybody complaining is using it as-is without problems, presumably. Also, it doesn't diminish your ability to deprecate and remove later, while still pushing people towards the correct long-term solution. Who knows, maybe if you give us another three years of broken pubsub, we will all adjust to the idea of writing a few dozen lines of Go (but probably not).

estebanabaroa commented 10 months ago

If you had to choose options, which options would you pick (feel free to choose and/or rank multiple options)?

I am building an app that uses all of IPFS, IPNS over pubsub, and pubsub, and I don't know Go. I looked into creating a default pubsub validator, adding it to the kubo codebase, and building kubo with my custom validator, and it only took me a few days to get working.

However, if the pubsub APIs were completely removed from kubo, and I had to handle the RPC endpoints, customizing the libp2p host, dealing with the kubo CLI options, etc. that would be overwhelming.

One thing that would have made my work even easier is if there was a minimal example project using a default validator and peer filter. The pubsub examples in https://github.com/libp2p/go-libp2p/tree/master/examples/pubsub don't use validators or filters so it was confusing, especially since I had never used go before.

Jorropo commented 10 months ago

@estebanabaroa IPNS over pubsub is not concerned by this issue.

The pubsub examples in https://github.com/libp2p/go-libp2p/tree/master/examples/pubsub don't use validators or filters so it was confusing, especially since I had never used go before.

Thanks, that's important to know, but then I think the work item would be to make those proper examples.

estebanabaroa commented 10 months ago

@estebanabaroa IPNS over pubsub is not concerned by this issue.

Yes I know. What I meant is that since my app uses almost everything included in kubo right now, it would be overwhelming to have to create it from scratch using components from boxo, especially since I don't know Go. I think boxo is a great idea though for people who don't use all the features in kubo.

emendir commented 10 months ago

First of all, thank you very much for putting so much time into your clarification. I find the library-oriented approach with Boxo a great idea and a sensible way to push forward. Can't wait to dive in deeper and check it out more thoroughly. It's probably my own fault that I didn't learn about Boxo before as I've been way too busy with all sorts of things and am still learning to get and filter through news.

emendir commented 10 months ago

My vote on how to move forward is for options 1 & 3 (leave PubSub API in Kubo but be extremely clear that it is broken and push people to use go-libp2p-PubSub).

My reasons for wanting to leave the Kubo PubSub Endpoint are:

pinnaculum commented 10 months ago

Thank you @jorropo for the detailed explanations.

Solution 2 is time-consuming and using callbacks is wrong.

  3. Leave it as is.

I think it's acceptable, flag the feature as "incomplete", "broken", users are warned and you can point to go-libp2p-pubsub .. That way you halt the flow of tears of the developers already using kubo's pubsub and who don't have a backup plan, but also you keep the possibility of fixing pubsub's flaws in the near future if a better solution comes along.

Jorropo commented 10 months ago

That way you halt the flow of tears of the developers already using kubo's pubsub and who don't have a backup plan.

That's the goal.

but also you keep the possibility of fixing pubsub's flaws in the near future if a better solution comes along.

  1. It's Kubo's pubsub which is broken; go-libp2p-pubsub works fine (when used correctly).
  2. I don't believe Kubo's pubsub can be fixed. The current promise in the docs is that it's experimental, but experimental warnings don't work on people after a few years (which is fair: if something is marked experimental for years and people are using it, I wouldn't believe it's experimental). The sad thing is that I don't think there is any way to keep the current semantics / features that doesn't also make huge compromises (making pubsub very expensive, or removing features, like requiring a list of trusted peers who are exclusively allowed to send on the channel).

pinnaculum commented 10 months ago

I don't believe Kubo's pubsub can be fixed

The sad thing is that I don't think there is any way to keep the current semantics / features that doesn't also make huge compromises

Sad. Do what must be done, make it quick and painless :) A ceremony will be held for the orphaned pubsubers, topic name livefast_dieyoung.

zacharywhitley commented 10 months ago

I think it's acceptable, flag the feature as "incomplete", "broken"

it’s currently listed as experimental, just sayin’

pinnaculum commented 10 months ago

it’s currently listed as experimental, just sayin’

I was just confirming @jorropo's idea in the third solution:

.. we would change pubsub's API description from experimental to something along the line of the pubsub API does not work and is limited, to use a performant reliable alternative please use go-libp2p-pubsub, we would rename --enable-pubsub-experiment to something along the line of --enable-pubsub-broken

MichaelMure commented 10 months ago

@emendir, @pinnaculum, @TheDiscordian, @RangerMauve, @sevenrats, and everybody who cares about pubsub:

See https://github.com/MichaelMure/ipfs-pubsub-service-api/blob/master/pubsub-service-api.yml

Readme rendered there: https://gist.github.com/MichaelMure/87296599c5ec3f6ad08468bef7b66d68

I've been working on a replacement API for pubsub that improves on it in several ways:

I believe that API could be a viable replacement for application use-cases. It should also solve the validation problem that Kubo has.

It is however only a spec for now. I'm planning to push it as an IPIP later when I get the time. It would be extremely useful to have early feedback on it. Could you open issues in that repo if you have some feedback?

RangerMauve commented 10 months ago

I like the approach! One thing however, it'd be nice if one could also open something like an EventSource stream to subscribe to changes in realtime to remove the necessity of polling.

MichaelMure commented 10 months ago

@RangerMauve I do see the interest in "real time" reading of messages, without polling, as well. @lidel's advice was to keep the spec simple, at least for a first iteration. It's debatable, but he might be right. He at least has the experience to back the argument ;-)

Let's not derail this issue too much though, and keep that feedback over there: https://github.com/MichaelMure/ipfs-pubsub-service-api/issues

emendir commented 10 months ago

So you wouldn't run Kubo anymore next to your app, you would use boxo (and or other libs) inside your own application process. So you could use a single libp2p instance shared by boxo, go-libp2p-pubsub, ... or even do IPFS without libp2p.

Let consumers (you) write a bit of Go, you would import the go-libp2p-pubsub directly, we could also provide libraries (in boxo) and examples on how to get started.

After these comments from @Jorropo on the alternatives to using Kubo's PubSub RPC API, I started looking into what we users of the PubSub RPC API are supposed to do now.

I don't know how others are getting along with this, but my path so far has been pretty rough.

After trying out different ways of interfacing libraries written in Go with the language I work in (Python), I started looking into exactly which IPFS libraries I want to work with.

First of all, it was slightly overwhelming that there are so many different libraries to choose from.

As I understand it so far, we can break those libraries down into the following categories:

So @Jorropo and other IPFS & Kubo developers, could you please critique my above representation of the categories of IPFS Go libraries, so that we IPFS application developers who are begging you not to remove Kubo's PubSub RPC API can better judge which direction we need to move in?

I can tell that the IPFS & Kubo developers are working hard on building better ways for us IPFS application developers to use it, but the fact of the matter is that the documentation and tutorials are currently mediocre at best.

Part of my difficulties with this endeavour stem from the fact that this is the first time I have used the Go programming language, but I won't be alone with this trouble. The RPC API was a great workaround for this, and pushing developers away from that convenience will take time. Therefore, I am of the strong opinion that it is much too early to remove the PubSub RPC API from Kubo.

Thank you go-libp2p, IPFS & Kubo developers for all your work. I really appreciate it.

estebanabaroa commented 10 months ago

I don't know how others are getting along with this, but my path so far has been pretty rough.

My approach has been to create a custom pubsub validator function for our app, as well as a custom messageId and AppSpecificScore function. And then fork kubo, and add a few lines of code to kubo to pass the options.

I was able to do this in a few days even though I had never used Go before.

In kubo, only a few lines need to be added to use it:


import (
    // App-specific validator, message ID and peer-scoring functions.
    pubsubPlebbitValidator "github.com/plebbit/go-libp2p-pubsub-plebbit-validator"
)

// Build the custom validator and the peer score parameters derived from it.
validator := pubsubPlebbitValidator.NewValidator(host)
peerScoreParams := pubsubPlebbitValidator.NewPeerScoreParams(validator)
// Append the custom options to the options Kubo already passes to GossipSub.
return pubsub.NewGossipSub(helpers.LifecycleCtx(mctx, lc), host, append(
    pubsubOptions,
    pubsub.WithDefaultValidator(validator.Validate),
    pubsub.WithMessageIdFn(pubsubPlebbitValidator.MessageIdFn),
    pubsub.WithPeerScore(&peerScoreParams, &pubsubPlebbitValidator.PeerScoreThresholds),
    pubsub.WithDiscovery(disc),
    pubsub.WithFloodPublish(true))...,
)

https://github.com/plebbit/kubo/blob/e0026b9a4145b7e916ed68d461320abe6ff3504b/core/node/libp2p/pubsub.go

Then I made a github action to build the modified kubo binaries and put them in a github release: https://github.com/plebbit/kubo/releases/latest. My own app downloads it from there when building.

I imagine if the pubsub APIs were to be removed completely from kubo (instead of just being deprecated), I would readd them myself by copy pasting the code from previous versions. I wouldn't be able to use boxo or go-libp2p-pubsub directly, it seems too difficult and also my app uses IPFS and IPNS anyway and I don't want to have 2 instances of libp2p running.

emendir commented 10 months ago

@estebanabaroa thanks for sharing your experience!

I agree that running 2 libp2p instances in parallel is a concern. It is the worry that nags me the most as I explore the Go-library-based approaches. It wastes processing resources and bandwidth, I guess it will crash routers more often, and it will require some apps to autostart and run in the background just to keep their IPFS nodes running where they previously didn't have to.

emendir commented 10 months ago

@Jorropo You mentioned:

2. The sad thing is that I don't think there is any way to keep the current semantics / features that doesn't also make huge compromises

It wouldn't be the first time things have changed in PubSub, remember the release of v0.11.0?

HTTP RPC wire format for experimental commands at /api/v0/pubsub changed.
If you use /api/v0/pubsub/* directly or maintain your own client library, you must adjust your HTTP client code.

https://github.com/ipfs/kubo/releases/tag/v0.11.0

I remember having to figure out how to adjust a library I maintain to make it compatible with the new version.

Since Kubo's PubSub endpoint is marked as experimental, such changes are readily forgivable.

estebanabaroa commented 10 months ago

So in a few days you managed to create a solution for yourself to do what we previously took 20min to get the hang of with HTTP RPC API libraries. We need to get that back to 20min for future newbies.

I agree but not sure it's possible, the message id function, validator function and app specific score function could be called thousands of times per second, they can't do a network round trip. Also different apps might need different internal data from libp2p or pubsub, so all data would need to be included in the network round trip.
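To make the "no network round trip" point concrete, here is a minimal stdlib-only sketch of the kind of stateful validator mentioned earlier in this issue (do not relay a message that advertises a HEAD lower than the currently known HEAD). `Message`, `HeadValidator` and `Validate` are hypothetical names for illustration and are not go-libp2p-pubsub's API; the point is only that the decision is a local map lookup on application state held in the same process.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical stand-in for a pubsub message.
// It is NOT the go-libp2p-pubsub message type.
type Message struct {
	Topic string
	Head  uint64 // application-level HEAD counter advertised by the sender
}

// HeadValidator keeps the highest HEAD seen per topic in local memory, so
// each validation is a map lookup rather than a call to another process.
type HeadValidator struct {
	mu    sync.Mutex
	heads map[string]uint64
}

// NewHeadValidator returns a validator with no known HEADs.
func NewHeadValidator() *HeadValidator {
	return &HeadValidator{heads: make(map[string]uint64)}
}

// Validate reports whether msg should be relayed. A message is rejected when
// its HEAD is lower than the HEAD already known for its topic.
func (v *HeadValidator) Validate(msg Message) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	known, seen := v.heads[msg.Topic]
	if seen && msg.Head < known {
		return false
	}
	v.heads[msg.Topic] = msg.Head
	return true
}

func main() {
	v := NewHeadValidator()
	fmt.Println(v.Validate(Message{Topic: "crdt", Head: 5})) // true
	fmt.Println(v.Validate(Message{Topic: "crdt", Head: 3})) // false
	fmt.Println(v.Validate(Message{Topic: "crdt", Head: 6})) // true
}
```

A validator like this depends on state that lives in the application, which is why exposing it over an RPC boundary would mean shipping that state (or a request for it) on every call.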

IMO pubsub should remain an API in kubo at least as a demo, even if it can't ever be used in production, otherwise how would people ever discover that it exists and try it. It could also include some extra configuration, like multiple message id functions and validators that people can test out, even if it can't be used like that in production.

Arlodotexe commented 9 months ago

With the push for boxo we are moving to a library oriented story for people who want to build on our stack, not whatever Kubo's HTTP API.

It's notable that writing Go code just isn't an option for many of us. The dotnet ecosystem can't keep up with development of Kubo and libp2p, so we've been bootstrapping Kubo and using the RPC HTTP API instead.

Until a stable WASM/WASI implementation of libp2p comes along, we can't even begin to move to this new library-oriented story.

That in mind, this option suggested above by @Jorropo seems like the cleanest move:

Leave it as is. The pubsub meshs created by Kubo are and will be unstable, we would change pubsub's API description from experimental to something along the line of the pubsub API does not work and is limited, to use a performant reliable alternative please use go-libp2p-pubsub, we would rename --enable-pubsub-experiment to something along the line of --enable-pubsub-broken. We maybe would also add arbitrary restrictions in order to limit it's usage to PoC, demos and learning (again forcing power users onto solution 1.), for example no more than X messages per second.

BigLep commented 9 months ago

Another update that this won't make it into 0.22 since with summer holidays and other in-flight work, maintainers won't be able to give this the necessary attention to respond and handle well. It has been bumped to 0.23.

We appreciate the ongoing feedback that has been left. Thanks.

cottrell commented 3 months ago

I see various comments but is https://github.com/libp2p/go-libp2p-daemon the correct thing now?

Remember people land here from the CLI and HTTP use.

It might make sense to point to "please use this CLI/HTTP" in Kubo's deprecation warning message for the pubsub command.

lidel commented 2 months ago

Triage notes:

driusan commented 1 month ago

I came across this issue when getting a deprecation notice after using an ipfs pubsub command from some documentation I found on the web while searching for "ipfs pubsub". Most of the documentation, blog posts, etc. that I've found seem to use it as a wrapper to demonstrate the functionality, and removing these commands from the command line seems like it will break many examples in the wild.

The problem with "option 1" above is that it forces developers to use Go for their own applications. An HTTP API provides a fairly universal language agnostic interface. My understanding from the above thread is that the only technical problem with exposing the functionality over HTTP is the need for callbacks, so I'd like to propose another option: use WebSockets.

You subscribe to a topic by making an HTTP connection to an API endpoint, upgrading the connection to WebSocket, and then topics you're subscribed to get broadcast over that connection. This would provide a similar language-agnostic bridge to the functionality without forcing developers to write the pubsub parts of their applications in Go.
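A rough sketch of the fan-out side of this idea, under stated assumptions: each connected client (for example, one upgraded WebSocket connection) would own a channel per topic, and published messages are copied to every subscriber. The WebSocket upgrade itself is left out because Go's standard library does not ship a WebSocket implementation. `Hub`, `Subscribe` and `Publish` are hypothetical names, not Kubo or go-libp2p-pubsub APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub is a hypothetical in-memory fan-out. One buffered channel is created
// per subscription; a connection handler would read from it and write each
// message to its client.
type Hub struct {
	mu   sync.Mutex
	subs map[string]map[chan []byte]struct{}
}

// NewHub returns a hub with no subscriptions.
func NewHub() *Hub {
	return &Hub{subs: make(map[string]map[chan []byte]struct{})}
}

// Subscribe registers a new subscriber for topic and returns its channel
// together with a function that removes the subscription and closes it.
func (h *Hub) Subscribe(topic string) (<-chan []byte, func()) {
	ch := make(chan []byte, 16)
	h.mu.Lock()
	if h.subs[topic] == nil {
		h.subs[topic] = make(map[chan []byte]struct{})
	}
	h.subs[topic][ch] = struct{}{}
	h.mu.Unlock()

	cancel := func() {
		h.mu.Lock()
		defer h.mu.Unlock()
		// Only close once, even if cancel is called repeatedly.
		if _, ok := h.subs[topic][ch]; ok {
			delete(h.subs[topic], ch)
			close(ch)
		}
	}
	return ch, cancel
}

// Publish copies data to every subscriber of topic and returns how many
// received it. Subscribers whose buffer is full are skipped, not waited on.
func (h *Hub) Publish(topic string, data []byte) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for ch := range h.subs[topic] {
		select {
		case ch <- data:
			delivered++
		default: // slow consumer: drop rather than stall the publisher
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	msgs, cancel := hub.Subscribe("demo")
	defer cancel()

	hub.Publish("demo", []byte("hello"))
	fmt.Println(string(<-msgs))
}
```

Note that this only covers delivery to clients. It does not address the validator callbacks discussed above, which would still need some way to get an answer from the application.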

cottrell commented 1 month ago

I came across this issue when getting a deprecation notice after using an ipfs pubsub command from some documentation I found on the web while searching for "ipfs pubsub". Most of the documentation, blog posts, etc. that I've found seem to use it as a wrapper to demonstrate the functionality, and removing these commands from the command line seems like it will break many examples in the wild.

The problem with "option 1" above is that it forces developers to use Go for their own applications. An HTTP API provides a fairly universal language agnostic interface. My understanding from the above thread is that the only technical problem with exposing the functionality over HTTP is the need for callbacks, so I'd like to propose another option: use WebSockets.

You subscribe to a topic by making an HTTP connection to an API endpoint, upgrading the connection to WebSocket, and then topics you're subscribed to get broadcast over that connection. This would provide a similar language-agnostic bridge to the functionality without forcing developers to write the pubsub parts of their applications in Go.

Yes, this is what I think works. Currently you need to implement this WebSocket upgrade in Go. I managed to get an experiment working like this (Go API and Python client). It would be nice to have a shared CLI to use and develop against.

emendir commented 2 weeks ago

I came across this issue when getting a deprecation notice after using an ipfs pubsub command from some documentation I found on the web while searching for "ipfs pubsub". Most of the documentation, blog posts, etc. that I've found seem to use it as a wrapper to demonstrate the functionality, and removing these commands from the command line seems like it will break many examples in the wild.

The problem with "option 1" above is that it forces developers to use Go for their own applications. An HTTP API provides a fairly universal language agnostic interface. My understanding from the above thread is that the only technical problem with exposing the functionality over HTTP is the need for callbacks, so I'd like to propose another option: use WebSockets.

You subscribe to a topic by making an HTTP connection to an API endpoint, upgrading the connection to WebSocket, and then topics you're subscribed to get broadcast over that connection. This would provide a similar language-agnostic bridge to the functionality without forcing developers to write the pubsub parts of their applications in Go.

I agree that WebSockets are probably the best way to handle pubsub. I tried building a kubo plugin to implement this in ZMQ several months ago, but sadly kubo's plugin interface was too much of a hassle to get working, and it wasn't yet supported on all systems.

For anybody with more success than me in getting started building Kubo components, I do recommend ZMQ (ZeroMQ), a messaging library that runs over transports such as TCP and already has pubsub functionality built in. It has bindings for many programming languages.