seed-hypermedia / mintter

Mintter: an app for knowledge communities. Powered by the Hypermedia protocol.
https://mintter.com
Apache License 2.0

Bug: Resource Manager limit exceeded #1509

Open juligasa opened 11 months ago

juligasa commented 11 months ago

Problem

We see traces like this in the backend:

2023-11-08T12:44:25.637+0100    DEBUG   mintter/syncing syncing/syncing.go:151  SyncLoopError   {"traceID": 1699443858624473, "peer": "12D3KooWR99NHvwd1snVHgTGATPs6vYbArdp3ux8i1EBuRaXMETq", "error": "failed to sync objects: failed to connect to peer 12D3KooWR99NHvwd1snVHgTGATPs6vYbArdp3ux8i1EBuRaXMETq: failed to dial: failed to dial 12D3KooWR99NHvwd1snVHgTGATPs6vYbArdp3ux8i1EBuRaXMETq: all dials failed\n  * [/ip4/23.20.24.146/udp/4002/quic-v1/p2p/12D3KooWNmjM4sMbSkDEA6ShvjTgkrJHjMya46fhZ9PjKZ4KVZYq/p2p-circuit] error opening relay circuit: NO_RESERVATION (204)\n  * [/ip4/52.22.139.174/udp/4002/quic-v1/p2p/12D3KooWGvsbBfcbnkecNoRBM7eUTiuriDqUyzu87pobZXSdUUsJ/p2p-circuit] error opening relay circuit: NO_RESERVATION (204)\n  * [/ip4/52.22.139.174/tcp/4002/p2p/12D3KooWGvsbBfcbnkecNoRBM7eUTiuriDqUyzu87pobZXSdUUsJ/p2p-circuit] conn-14743: system: cannot reserve outbound connection: resource limit exceeded\n  * [/ip4/23.20.24.146/tcp/4002/p2p/12D3KooWNmjM4sMbSkDEA6ShvjTgkrJHjMya46fhZ9PjKZ4KVZYq/p2p-circuit] conn-14744: system: cannot reserve outbound connection: resource limit exceeded"}

and even inside the DHT providing system:

2023-11-08T12:44:23.967+0100    DEBUG   dht     go-libp2p-kad-dht@v0.25.1/query.go:545  error connecting: failed to dial: failed to dial 12D3KooWEoWwonnPmpbWmbkk1dPR5QUuBaSv6f8hEfhtb8vKZKDh: all dials failed
  * [/ip4/207.244.228.217/udp/4001/quic] QUIC draft-29 has been removed, QUIC (RFC 9000) is accessible with /quic-v1
  * [/ip4/207.244.228.217/udp/4001/quic-v1] conn-21733: system: cannot reserve outbound connection: resource limit exceeded
2023-11-08T12:44:23.967+0100    DEBUG   dht     go-libp2p-kad-dht@v0.25.1/dht.go:720    peer stopped dht        {"peer": "12D3KooWEoWwonnPmpbWmbkk1dPR5QUuBaSv6f8hEfhtb8vKZKDh"}
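To see how many failed dials are really caused by the resource manager, as opposed to relay reservations or deprecated transports, the "all dials failed" messages can be tallied per cause. Below is a minimal sketch using only the Go standard library. The names `dialCause` and `classifyDialErrors` are hypothetical helpers, not part of mintter or go-libp2p, and the matching relies only on the substrings visible in the traces above.

```go
package main

import (
	"fmt"
	"strings"
)

// dialCause is a coarse, hypothetical classification of one failed dial attempt.
type dialCause string

const (
	causeResourceLimit dialCause = "resource limit exceeded"
	causeNoReservation dialCause = "NO_RESERVATION"
	causeOther         dialCause = "other"
)

// classifyDialErrors splits an "all dials failed" message into its
// per-address attempts (the lines starting with "* [") and counts them by cause.
func classifyDialErrors(msg string) map[dialCause]int {
	counts := make(map[dialCause]int)
	// JSON-encoded log fields carry a literal backslash-n instead of a newline.
	msg = strings.ReplaceAll(msg, `\n`, "\n")
	for _, line := range strings.Split(msg, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "* [") {
			continue // not a per-address attempt
		}
		switch {
		case strings.Contains(line, "resource limit exceeded"):
			counts[causeResourceLimit]++
		case strings.Contains(line, "NO_RESERVATION"):
			counts[causeNoReservation]++
		default:
			counts[causeOther]++
		}
	}
	return counts
}

func main() {
	// Shortened copy of the DHT trace above.
	msg := "all dials failed\n" +
		"  * [/ip4/207.244.228.217/udp/4001/quic] QUIC draft-29 has been removed, QUIC (RFC 9000) is accessible with /quic-v1\n" +
		"  * [/ip4/207.244.228.217/udp/4001/quic-v1] conn-21733: system: cannot reserve outbound connection: resource limit exceeded"
	counts := classifyDialErrors(msg)
	fmt.Println(counts[causeResourceLimit], counts[causeNoReservation], counts[causeOther])
}
```

Run over a full log, a tally like this would show whether the `resource limit exceeded` attempts dominate or whether most failures are `NO_RESERVATION` errors coming from the relays.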

We already set this in the daemon:

cfg := rcmgr.PartialLimitConfig{
    System: rcmgr.ResourceLimits{
        Streams: rcmgr.Unlimited,
        Conns:   rcmgr.Unlimited,
        FD:      rcmgr.Unlimited,
        Memory:  rcmgr.Unlimited64,
    },
    // Everything else is default. The exact values will come from `scaledDefaultLimits` above.
}
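One thing that may be worth checking, assuming the go-libp2p version in use has the directional fields `ConnsInbound`/`ConnsOutbound` and `StreamsInbound`/`StreamsOutbound` on `rcmgr.ResourceLimits`: in a `PartialLimitConfig`, a field left at its zero value falls back to the default limit, so setting only `Conns` may leave the outbound cap at its scaled default. The sketch below is a simplified stand-in model of that resolution, not the library itself. The types `limitVal`, `partialLimits` and `concreteLimits` and the default numbers are made up for illustration.

```go
package main

import "fmt"

// limitVal is a stand-in for rcmgr.LimitVal (hypothetical, simplified).
type limitVal int

const (
	defaultLimit limitVal = 0  // field left unset: fall back to the default
	unlimited    limitVal = -1 // explicit "no limit"
)

// noLimit is the largest int, used here to represent "unlimited".
const noLimit = int(^uint(0) >> 1)

// build resolves one partial value against its default.
func (l limitVal) build(def int) int {
	switch l {
	case defaultLimit:
		return def
	case unlimited:
		return noLimit
	default:
		return int(l)
	}
}

// partialLimits models the connection fields of a partial limit config.
type partialLimits struct {
	Conns, ConnsInbound, ConnsOutbound limitVal
}

// concreteLimits holds fully resolved values.
type concreteLimits struct {
	Conns, ConnsInbound, ConnsOutbound int
}

// build fills every unset field from the defaults.
func (p partialLimits) build(def concreteLimits) concreteLimits {
	return concreteLimits{
		Conns:         p.Conns.build(def.Conns),
		ConnsInbound:  p.ConnsInbound.build(def.ConnsInbound),
		ConnsOutbound: p.ConnsOutbound.build(def.ConnsOutbound),
	}
}

// canOpenOutbound reports whether one more outbound connection fits:
// both the outbound cap and the total cap must have room.
func (c concreteLimits) canOpenOutbound(openOut, openTotal int) bool {
	return openOut+1 <= c.ConnsOutbound && openTotal+1 <= c.Conns
}

func main() {
	// Made-up defaults, standing in for the scaled default limits.
	defaults := concreteLimits{Conns: 128, ConnsInbound: 64, ConnsOutbound: 128}

	onlyTotal := partialLimits{Conns: unlimited}.build(defaults)
	fmt.Println(onlyTotal.canOpenOutbound(128, 128)) // outbound cap still at default

	both := partialLimits{Conns: unlimited, ConnsOutbound: unlimited}.build(defaults)
	fmt.Println(both.canOpenOutbound(128, 128))
}
```

If the real resource manager resolves limits this way, printing the built system limits at startup would confirm whether the outbound values are still finite.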

Steps to Reproduce

Run the app, try to connect to a site, then wait.

Expected Behavior

Not hitting resource limit failures.


juligasa commented 11 months ago

Maybe the problem is in the relays: when a relay fails to open an outbound connection, the client's dial fails too. I changed it in cec910f808a0a1e1de3847265bde09c756158bc5.

juligasa commented 11 months ago

After updating the relays and gateways, I still see:

2023-11-10T14:46:52.096+0100    DEBUG   mintter/network mttnet/connect.go:45    ConnectFinished {"peer": "12D3KooWLo1YAp2wbLH65H7L1R7GSthzycgqJoWQHC1UemNWwKhz", "error": "failed to connect to peer 12D3KooWLo1YAp2wbLH65H7L1R7GSthzycgqJoWQHC1UemNWwKhz: failed to dial: failed to dial 12D3KooWLo1YAp2wbLH65H7L1R7GSthzycgqJoWQHC1UemNWwKhz: all dials failed\n  * [/ip4/23.20.24.146/udp/4002/quic-v1/p2p/12D3KooWNmjM4sMbSkDEA6ShvjTgkrJHjMya46fhZ9PjKZ4KVZYq/p2p-circuit] error opening relay circuit: NO_RESERVATION (204)\n  * [/ip4/52.22.139.174/udp/4002/quic-v1/p2p/12D3KooWGvsbBfcbnkecNoRBM7eUTiuriDqUyzu87pobZXSdUUsJ/p2p-circuit] error opening relay circuit: NO_RESERVATION (204)\n  * [/ip4/23.20.24.146/tcp/4002/p2p/12D3KooWNmjM4sMbSkDEA6ShvjTgkrJHjMya46fhZ9PjKZ4KVZYq/p2p-circuit] conn-354439: system: cannot reserve outbound connection: resource limit exceeded\n  * [/ip4/52.22.139.174/tcp/4002/p2p/12D3KooWGvsbBfcbnkecNoRBM7eUTiuriDqUyzu87pobZXSdUUsJ/p2p-circuit] conn-354438: system: cannot reserve outbound connection: resource limit exceeded", "Info": "{12D3KooWLo1YAp2wbLH65H7L1R7GSthzycgqJoWQHC1UemNWwKhz: [/ip4/52.22.139.174/tcp/4002/p2p/12D3KooWGvsbBfcbnkecNoRBM7eUTiuriDqUyzu87pobZXSdUUsJ/p2p-circuit /ip4/23.20.24.146/tcp/4002/p2p/12D3KooWNmjM4sMbSkDEA6ShvjTgkrJHjMya46fhZ9PjKZ4KVZYq/p2p-circuit /ip4/23.20.24.146/udp/4002/quic/p2p/12D3KooWNmjM4sMbSkDEA6ShvjTgkrJHjMya46fhZ9PjKZ4KVZYq/p2p-circuit /ip4/52.22.139.174/udp/4002/quic-v1/p2p/12D3KooWGvsbBfcbnkecNoRBM7eUTiuriDqUyzu87pobZXSdUUsJ/p2p-circuit /ip4/23.20.24.146/udp/4002/quic-v1/p2p/12D3KooWNmjM4sMbSkDEA6ShvjTgkrJHjMya46fhZ9PjKZ4KVZYq/p2p-circuit]}"}

Maybe those are peers with outdated software? In any case, setting everything to unlimited opens the app to DoS attacks, according to the Kubo docs: https://github.com/ipfs/kubo/blob/master/docs/libp2p-resource-management.md#computed-default-limits