ipfs / go-bitswap

The golang implementation of the bitswap protocol
MIT License
216 stars 112 forks source link

Bitswap not sending wanted blocks #68

Closed alanshaw closed 5 years ago

alanshaw commented 5 years ago

I'm following up on https://github.com/ipfs/js-ipfs/issues/1874 and I'm seeing an issue with the go-ipfs nodes deployed as the preload nodes for js-ipfs.

The situation I'm seeing is:

  1. browser js-ipfs node is connected to go-ipfs node over websockets (wss)
  2. browser node adds CID to wantlist
  3. bitswap message is sent to go-ipfs successfully (as far as I can tell)
  4. no response from go-ipfs node with block data
  5. browser node recieves multiple messages from the go-ipfs node with it's wantlist (so communication is possible)

We know that this node HAS this content because https://node0.preload.ipfs.io/api/v0/refs?r=true&arg=QmS8hKtny73KoTh8cL2xLoLHjMDjkG9WYF3AHAdd9PHkxo returns 200.

@scout found this is the syslogs for the go-ipfs node:

bitswap: #033[0mSendMsg errored but neither 'done' nor context.Done() were set #033[0;37mwantmanager.go:236#033[0m
hannahhoward commented 5 years ago

Hi @alanshaw I'll jump on this but do we know if the preload node is running 0.4.18?

It would help me to understand if the the problem is in new bitswap or existing bitswap.

The message @scout found is interesting cause it has to do with the go-ipfs node sending wants, not blocks. Moreover, the file line number "wantmanager.go:236" suggests the node is running 0.4.18, because that log message is no longer in that file.

Kubuxu commented 5 years ago

AFAIK preload was running 0.4.18.

hannahhoward commented 5 years ago

Yea. That makes sense. I assume that not sending blocks back is a problem of the decision engine, which has not changed significantly. I can take a look. I think though we may need to dig further into logging to get a sense of what might be wrong.

obelisk-dev commented 5 years ago

Any timeline on this issue? This breaks many current use cases for jsipfs.

alanshaw commented 5 years ago

@hannahhoward @Kubuxu I recorded a video to help demonstrate the issue I'm seeing:

https://youtu.be/gOsmD90LF24

It shows that connecting to a local go-ipfs 0.4.18 node does NOT exhibit the issue, but connecting to the preload nodes does.

It shows multiple "Unexpected end of input" errors in the logs when connecting to the preload nodes. I assume that this must be a problem with the deployment of the nodes in some way since it works locally.

This is not an existing issue - this used to work up until recently and now I am getting lots of complaints from JS IPFS users who cannot access content at all.

I am available to offer any help I can but I have no access to the preload servers and no ability to debug go-ipfs code so please help 🙏

If you'd like to replicate the issue yourself the script I'm using is:

<!doctype html>
<html>
<body style="margin:0;">
<pre id="out" style="width:100%;height:100vh;margin:0;"></pre>
<script src="https://unpkg.com/ipfs/dist/index.min.js"></script>
<script>
  const Prelog = pre => {
    return (...args) => args
      .map(a => typeof a === 'string' ? a : JSON.stringify(a, null, 2))
      .map(a => document.createTextNode(`${a}\n`))
      .forEach(n => pre.appendChild(n))
  }
  const log = Prelog(document.getElementById('out'))
  // const bootstrapAddr = '/ip4/127.0.0.1/tcp/9999/ws/ipfs/QmNcRXmhoVAvhFFhmeYMq4kV66JhQDmDNBR3ZPcERc73rH'
  const bootstrapAddr = '/dns4/node0.preload.ipfs.io/tcp/443/wss/ipfs/QmZMxNdpMkewiVZLMRxaNxUeZpDUb34pWjZ1kZvsd16Zic'

  const ipfs = new Ipfs({
    repo: 'ipfs-testing' + Date.now(),
    config: {
      Bootstrap: [bootstrapAddr]
    },
    libp2p: {
      config: {
        peerDiscovery: {
          bootstrap: {
            enabled: true
          },
          webRTCStar: {
            enabled: false
          },
          websocketStar: {
            enabled: false
          }
        }
      }
    }
  })

  log('Booting JS IPFS...')

  ipfs.on('ready', async () => {
   log('JS IPFS is ready')
   log('ID:', await ipfs.id())

   log('Waiting to be connected to: ' + bootstrapAddr)

   while (true) {
     const peers = await ipfs.swarm.peers()
     const connected = peers.some(p => p.addr.toString() === bootstrapAddr)
     if (connected) break
     log('Waiting to be connected to: ' + bootstrapAddr)
     await new Promise(resolve => setTimeout(resolve, 1000))
   }

   log('Connected!')

   const cid = 'QmT78zSuBmuS4z925WZfrqQ1qHaJ56DQaTfyMUF7F8ff5o'
   log(`Running ipfs.cat('${cid}')`)

   const data = await ipfs.cat(cid)
   log(`Content of ${cid}:`, data.toString())
  })
</script>
</body>
</html>

Note you'll have to configure your locally running IPFS node to listen on websockets by adding "/ip4/127.0.0.1/tcp/9999/ws" to Addresses.Swarm in your config.

To enable the debugging you see in the video, open the browser console, paste in the following and then refresh:

localStorage.debug = '*,-latency-monitor*,-libp2p:connection-manager'
Stebalien commented 5 years ago

I can confirm this on that preloader, even with go-ipfs <-> go-ipfs w/o websockets.

That preload node is running 0.4.18 so my best guess is that it's deadlocked somewhere.

Stebalien commented 5 years ago

It looks like there are two preload nodes with the same peer ID. That could easily explain everything.

alanshaw commented 5 years ago

I've verified this is now working.