Open BigLep opened 1 year ago
This is more of a libp2p issue, and more specifically a go-libp2p issue.
While support for WebRTC will certainly help in some scenarios (i.e. the browser does not support WebTransport and the node fetching the data doesn't have a WSS address with a CA cert), IIUC the main difficulty in making data from browser helia nodes discoverable by gateways, etc. is that the data is not being advertised to the DHT, IPNI, etc.
Am I mistaken and it turns out advertising small amounts of data to the DHT from a helia browser node is working well enough at the moment (at least for browsers that support WebTransport)?
@aschmahmann even in a browser that supports WebTransport, i've been having difficulty getting any successful webtransport connections. I just pushed up a repo where I was playing around: https://github.com/SgtPooki/helia-playground -- it was essentially copied from https://codesandbox.io/p/sandbox/helia-script-tag-forked-3q8y35 to a local workspace so i could modify things more easily.
One thing I started seeing was that activeStreams.length never exceeds 0 for me, no matter how many peers or how many connections I have. I suspect a bug in libp2p/webtransport but I haven't been able to fully track it down.
I want to create a simple test where a browser helia node can successfully talk to a backend helia node, but that will have to wait for a bit.
ninja-edit:
Also, there seems to be a non-stop spamming of webtransport dial attempts.. and i'm not sure how best to control that with libp2p-connection-manager.
@aschmahmann : good callouts - thanks.
Let's assume:
How does this ipfs.io Kubo node retrieve the data from the browser node? My understanding is that it still can't initiate a connection to the browser in this scenario and this scenario would only work if there was a preexisting connection between the browser node and the ipfs.io Kubo node.
Also, I expanded the "Notes" section in the top description to further expand on the underlying issues:
Please go ahead and fix/correct any mistakes here.
Thank you!
How does this ipfs.io Kubo node retrieve the data from the browser node? My understanding is that it still can't initiate a connection to the browser in this scenario and this scenario would only work if there was a preexisting connection between the browser node and the ipfs.io Kubo node.
Yeah, that's right, good callout. I had assumed there was some level of support for DCUtR in js-libp2p that came along with the relay-v2 support. With the simplest DCUtR support (dialbacks), the helia node would connect to a (limited) relay-v2 node that speaks some protocol the helia node can speak (e.g. WSS, WebTransport, etc.) and would then advertise as its address /the/multiaddr/of/the/relay/circuit-relay/p2p/helia-node-peerID
When a publicly reachable node (e.g. the ipfs.io kubo nodes) wanted to contact the helia node, it would ask the relay to have the helia node dial it back (using WSS, WebTransport, etc.).
This doesn't require any holepunching kinds of magic, just a simple relay + the dialback portion of the DCUtR protocol.
Seems like it might be worth scoping this as a smaller and more important set of work in https://github.com/libp2p/js-libp2p/issues/1460.
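The relay address described in the comment above can be sketched with plain string handling. This is a minimal illustration, not js-libp2p code: `circuitAddress`, `parseCircuitAddress`, and `dialbackPlan` are hypothetical helper names, and the relay multiaddr and peer IDs used with them are placeholders. It assumes the usual multiaddr form, in which the relay component is written `/p2p-circuit`.

```typescript
// Illustrative only: plain strings, no libp2p or multiaddr library involved.
// All function names here are hypothetical.

/** Address a relayed node advertises: <relay multiaddr>/p2p-circuit/p2p/<peer id>. */
function circuitAddress (relayAddr: string, peerId: string): string {
  const base = relayAddr.replace(/\/+$/, '') // drop any trailing slash
  return `${base}/p2p-circuit/p2p/${peerId}`
}

/** Split a circuit address back into the relay part and the target peer id. */
function parseCircuitAddress (addr: string): { relay: string, target: string } {
  const marker = '/p2p-circuit/p2p/'
  const idx = addr.indexOf(marker)
  if (idx === -1) {
    throw new Error(`not a circuit relay address: ${addr}`)
  }
  return { relay: addr.slice(0, idx), target: addr.slice(idx + marker.length) }
}

/** The steps a publicly reachable node would follow to obtain a dialback. */
function dialbackPlan (circuitAddr: string, publicAddr: string): string[] {
  const { relay, target } = parseCircuitAddress(circuitAddr)
  return [
    `connect to relay ${relay}`,
    `ask relay to reach ${target}`,
    `${target} dials back ${publicAddr} directly`
  ]
}
```

The point of the last step is that the browser node, which cannot accept inbound connections, becomes the dialer, so no hole punching is involved.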
Notes from Helia WG 2023-07-27
DCUtR for js-libp2p is in progress here: https://github.com/libp2p/js-libp2p/pull/1928
Note that the libp2p hole-punching vision table also illustrates the problem here fairly well: https://github.com/libp2p/specs/blob/d2106f43e878ae4c3a1c6465a7c329835290fe22/connections/hole-punching.md#vision
It's great that progress is happening here.
Folks have correctly pointed out that for the stated usecase of the Kubo ipfs.io gateway retrieving content from the browser, go-libp2p WebRTC isn't needed. We only need js-libp2p DCUtR. That's great, and I agree that should be the first usecase.
That said, I don't want to let up there since the ultimate goal is "universal connectivity". The next step here is to allow a private Kubo node (e.g., Kubo running in one's Brave browser) to fetch the content authored in the browser. For this we ultimately need Kubo to support WebRTC since WebRTC is required for browser/private-node connectivity per here. This can come after, but I have updated the issue notes to be accurate and to discuss this followup step.
I've been doing a bit of investigation. What I've found is:
The ADD_PROVIDER query frequently traverses through nodes it can't dial.
When ADD_PROVIDER succeeds, Kubo nodes (my local one at least) can't always resolve the record.
Browser CPU usage is very high, which may contribute to 1.
2. is quite concerning because if the relay address changes, the published provider records then have out-of-date multiaddrs.
Right now I think in the circuit relay code if a relay connection is lost we assume the relay is bad and we start to search for new relays, but we may need to assume that we are bad and make some sort of attempt to reconnect, if that fails then start searching for others.
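The "try to reconnect first, then search" idea above can be sketched as a small decision function. This is only an illustration of the proposed policy, not the js-libp2p circuit relay code: `onRelayConnectionLost`, `RelayAction`, the retry limit, and the backoff values are all hypothetical.

```typescript
// Hypothetical policy sketch: assume the local side may be at fault and
// retry the same relay a few times before looking for a different one.

type RelayAction =
  | { kind: 'reconnect', delayMs: number } // retry the same relay after a delay
  | { kind: 'search' } // give up on this relay and look for others

function onRelayConnectionLost (
  failedReconnects: number, // reconnect attempts that have already failed
  maxReconnects = 3, // assumed limit, not taken from js-libp2p
  baseDelayMs = 1000 // assumed base delay for exponential backoff
): RelayAction {
  if (failedReconnects < maxReconnects) {
    // back off exponentially: 1s, 2s, 4s with the defaults
    return { kind: 'reconnect', delayMs: baseDelayMs * 2 ** failedReconnects }
  }
  return { kind: 'search' }
}
```

Keeping the same relay where possible also keeps the relay address stable, which matters because published provider records embed that address.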
Until adoption of webtransport improves, we may need some sort of web service that can publish provider records on behalf of the browser? But ones where the browser is the provider, not the web service, so it is slightly different from the delegated content routing strategy we used to use.
Also found a few other weird bits and pieces
@achingbrain
I've been doing a bit of investigation,
Thanks - good write up!
(For others to be aware) per 2023-08-10 Helia Working Group, I don't think it's worth the investment right now to focus on writing provider records directly to the public IPFS DHT from the browser. We'll instead rely on solving the write-side of "Underlying issue 1: discoverability of the content created in the browser so that the ipfs.io gateway can discover it" through a to-be-created/updated delegated routing endpoint. Kubo/Boxo maintainers are aware of the priority of this work and are taking it on now as they finish up the read side of HTTP /routing/v1.
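For orientation, the read side of HTTP /routing/v1 mentioned above is a provider lookup by CID at `/routing/v1/providers/{cid}`. The sketch below only builds that lookup URL; `providersUrl` is a hypothetical helper, the endpoint host is a placeholder, and nothing here covers the write side, which the comment describes as still to be created.

```typescript
// Hypothetical helper: builds the read-side lookup URL for a delegated
// routing endpoint. The endpoint passed in is supplied by the caller.
function providersUrl (endpoint: string, cid: string): string {
  const base = endpoint.replace(/\/+$/, '') // drop any trailing slash
  return `${base}/routing/v1/providers/${encodeURIComponent(cid)}`
}
```

A caller would issue a GET request to the returned URL, for example `providersUrl('https://delegated-routing.example', cid)` with a placeholder host.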
Also, it sounds like you have a test setup (awesome). I assume we're going to need this throughout the golden path development. If there is anything to document or check in to help others in testing or verifying their work, please share.
I have updated the task list in the issue description with everything I'm aware of that needs to be done along the different tracks:
Thanks also for the fixes along the way - good stuff!
I have morphed this golden path issue to be scoped to retrievability of browser-authored content without relying on pinning services (i.e., as long as one's browser tab is open).
For retrievability of browser-authored content, we're going to focus first on relying on pinning services: https://github.com/ipfs/helia/issues/256
That said, the top priority is reliable browser retrieval of any content. This is happening in https://github.com/ipfs/helia/issues/255 . This is the top "golden path scenario" focus.
Browser connections are unstable. This causes remotes to drop connections, including relay connections.
In recent releases this is much improved:
hi, does this work already?
I have seen this one here: https://github.com/ipfs/kubo/issues/9724
Looks like kubo is now supporting webrtc out of the box...
Does this mean we can create content inside the browser using a helia peer with kubo and retrieve the content from any ipfs gateway? Is that correct? Maybe I have a wrong understanding, but could we remove the blocked status ;)?
Done Criteria
A user can author content in their browser via Helia and have it retrievable by another machine through the ipfs.io gateway without relying on pinning services or preload nodes.
Why Important
This is a common usecase that users hit. Failure here feeds the narrative that "IPFS doesn't just work".
Notes