Closed: anarkrypto closed this issue 1 year ago.
This does not happen when running ipfs daemon from snap
Kubo version: 0.16.0-38117db6f Repo version: 12 System version: amd64/linux Golang version: go1.19
As mentioned, this happens with kubo 0.17. I am facing the same issue. It is not related to snap; I guess it is the new libp2p code that was turned on in 0.17. I have gone back to 0.16 for the moment.
I have tried increasing the inbound connection limits, to no avail.
I had accidentally changed the resource limits on the wrong server, so changing the inbound connection value does work.
Raising the inbound connection limit to 1024 cured this for me. Add the following to the "Swarm" block of your .ipfs/config, then tweak as needed:
"ResourceMgr": {
"Limits": {
"System": {
"Memory": 1073741824,
"FD": 512,
"Conns": 1024,
"ConnsInbound": 1024,
"ConnsOutbound": 1024,
"Streams": 16384,
"StreamsInbound": 4096,
"StreamsOutbound": 16384
}
}
},
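The same limits can also be set from the CLI instead of editing the file by hand (a sketch using the ipfs config --json pattern shown later in this thread; restart the daemon afterwards):
ipfs config --json Swarm.ResourceMgr.Limits.System.Conns 1024
ipfs config --json Swarm.ResourceMgr.Limits.System.ConnsInbound 1024
ipfs config --json Swarm.ResourceMgr.Limits.System.ConnsOutbound 1024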
Also ran into this problem. My error message is:
Application error 0x0: conn-12133298: system: cannot reserve inbound connection: resource limit exceeded
We are running a customized 0.17.0 build.
We are running an experiment to measure lookup latencies in the IPFS DHT network. For that we have deployed several customized kubo nodes; the customization consists of just additional log messages. One of these log messages is right after the GetProviders RPC here:
It logs the return values. I noticed that I receive a lot of the following errors:
Application error 0x0: conn-12133298: system: cannot reserve inbound connection: resource limit exceeded
Therefore I went ahead and disabled the resource manager (see config above), but the error messages still stick around. Then we deployed beefier machines and the errors seem less frequent, but they still happen often.
It's also weird that the error message talks about an inbound connection although I'm calling out to the remote peer 🤔.
I see the same consistent issue.
2022-11-29T04:08:37.090Z ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 42 times with error "system: cannot reserve inbound connection: resource limit exceeded".
# ipfs --version
ipfs version 0.17.0
I was able to fix this by downgrading to v0.16.0.
@BigLep @lidel @galargh @ajnavarro
This error is expected when you have too many inbound connections at the System level; the limit exists to avoid DoS attacks. If your hardware or use case needs to support more inbound connections than the default, you can change that by doing:
# Remove custom params
ipfs config --json Swarm.ResourceMgr '{}'
# Set inbound connection limits to a custom value
ipfs config --json Swarm.ResourceMgr.Limits.System.ConnsInbound 1000
# You might also want to change the number of inbound streams
ipfs config --json Swarm.ResourceMgr.Limits.System.StreamsInbound 1000
# If your hardware configuration is able to handle more connections
# and you are hitting Transient limits, you can also change them:
ipfs config --json Swarm.ResourceMgr.Limits.Transient.ConnsInbound 1000
ipfs config --json Swarm.ResourceMgr.Limits.Transient.StreamsInbound 1000
# Remember to restart the node to apply the changes
# You can see the applied changes executing:
$ ipfs swarm limit system
$ ipfs swarm limit transient
# You can check actual resources in use:
$ ipfs swarm stats system
$ ipfs swarm stats transient
The error is followed by a link: Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
There you can learn about all the different knobs for tuning the ResourceManager, but the most important ones are ConnsInbound and StreamsInbound.
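For reference, after running the commands above, the Swarm section of your config file should end up containing roughly the following (example values from the commands; adjust for your hardware):
"Swarm": {
  "ResourceMgr": {
    "Limits": {
      "System": {
        "ConnsInbound": 1000,
        "StreamsInbound": 1000
      },
      "Transient": {
        "ConnsInbound": 1000,
        "StreamsInbound": 1000
      }
    }
  }
}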
I'm facing the same issue. Looking at the stats and comparing them with the limits I have, it doesn't even touch the limits, but I'm still seeing this error in my logs.
/ # ipfs swarm stats system
{
"System": {
"Conns": 563,
"ConnsInbound": 0,
"ConnsOutbound": 563,
"FD": 125,
"Memory": 44040288,
"Streams": 868,
"StreamsInbound": 55,
"StreamsOutbound": 813
}
}
/ # ipfs swarm limit system
{
"Conns": 1024,
"ConnsInbound": 1024,
"ConnsOutbound": 1024,
"FD": 4512,
"Memory": 1073741824,
"Streams": 16384,
"StreamsInbound": 4096,
"StreamsOutbound": 16384
}
I'm running IPFS v0.17.0.
@rotarur can you paste the error that you are having? Your node might be hitting another RM level limit, like transient.
I'm running into this issue after upgrading to 0.17.0. Almost continuous in logs...
Nov 29 13:18:36 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:18:36.217-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 261 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Nov 29 13:18:36 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:18:36.218-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
Nov 29 13:18:46 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:18:46.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 342 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Nov 29 13:18:46 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:18:46.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
Nov 29 13:18:56 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:18:56.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 322 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Nov 29 13:18:56 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:18:56.217-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
Nov 29 13:19:06 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:19:06.215-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 396 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Nov 29 13:19:06 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:19:06.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
Nov 29 13:19:16 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:19:16.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 426 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Nov 29 13:19:16 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:19:16.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
Nov 29 13:19:26 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:19:26.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 437 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Nov 29 13:19:26 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:19:26.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
Nov 29 13:19:36 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:19:36.216-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 387 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Nov 29 13:19:36 ipfspri.discord.local ipfs[239746]: 2022-11-29T13:19:36.219-0600 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
$ ipfs swarm limit system
{
"Conns": 4611686018427388000,
"ConnsInbound": 123,
"ConnsOutbound": 4611686018427388000,
"FD": 4096,
"Memory": 1999292928,
"Streams": 4611686018427388000,
"StreamsInbound": 1977,
"StreamsOutbound": 4611686018427388000
}
$ ipfs swarm limit transient
{
"Conns": 4611686018427388000,
"ConnsInbound": 46,
"ConnsOutbound": 4611686018427388000,
"FD": 1024,
"Memory": 158466048,
"Streams": 4611686018427388000,
"StreamsInbound": 247,
"StreamsOutbound": 4611686018427388000
}
$ ipfs swarm stats system
{
"System": {
"Conns": 213,
"ConnsInbound": 123,
"ConnsOutbound": 90,
"FD": 38,
"Memory": 5914624,
"Streams": 197,
"StreamsInbound": 80,
"StreamsOutbound": 117
}
}
$ ipfs swarm stats transient
{
"Transient": {
"Conns": 0,
"ConnsInbound": 0,
"ConnsOutbound": 0,
"FD": 0,
"Memory": 0,
"Streams": 1,
"StreamsInbound": 0,
"StreamsOutbound": 1
}
}
So, it looks like when the ResourceMgr limits are undefined (ipfs config --json Swarm.ResourceMgr '{}'), random limits get set?
Hm. I set explicit limits for all the "random values" and am still seeing random values after restarting IPFS. It looks like maybe some memory overflow...
config:
"ResourceMgr": {
"Limits": {
"System": {
"Conns": 2048,
"ConnsInbound": 1024,
"ConnsOutbound": 1024,
"FD:": 8192,
"Streams:": 16384,
"StreamsInbound:": 4096,
"StreamsOutbound:": 16384
}
}
},
$ ipfs swarm limit system
{
"Conns": 2048,
"ConnsInbound": 1024,
"ConnsOutbound": 1024,
"FD": 4096,
"Memory": 1999292928,
"Streams": 4611686018427388000,
"StreamsInbound": 1977,
"StreamsOutbound": 4611686018427388000
}
I also seem to be experiencing this even though I have the resource manager disabled
@kallisti5 please check your configuration. It is wrong. Remove the ":" from the variable names.
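For reference, the corrected block would be (same values, keys without the trailing colon):
"ResourceMgr": {
  "Limits": {
    "System": {
      "Conns": 2048,
      "ConnsInbound": 1024,
      "ConnsOutbound": 1024,
      "FD": 8192,
      "Streams": 16384,
      "StreamsInbound": 4096,
      "StreamsOutbound": 16384
    }
  }
},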
Also, it is not a memory overflow; it is the max value (i.e. effectively no limit).
@2color how did you disable RM? With ipfs config --json Swarm.ResourceMgr.Enabled false and then restarting the daemon?
@ajnavarro my logs are always the same, and I don't get the documentation link, which is weird:
ipfs 2022-11-30T11:37:52.039Z INFO net/identify identify/id.go:369 failed negotiate identify protocol with peer {"peer": "12D3KooWMTa2XzV7thiUSKVKUfUYtBGiV7T3fGjayy7voHVKbjAF", "error": "Application error 0x0: conn-3607345: system: cannot reserve inbound connection: resource limit exceeded"}
ipfs 2022-11-30T11:37:52.039Z WARN net/identify identify/id.go:334 failed to identify 12D3KooWMTa2XzV7thiUSKVKUfUYtBGiV7T3fGjayy7voHVKbjAF: Application error 0x0: conn-3607345: system: cannot reserve inbound connection: resource limit exceeded
The transient connections are not used
/ # ipfs swarm limit transient
{
"Conns": 4611686018427388000,
"ConnsInbound": 1024,
"ConnsOutbound": 1024,
"FD": 131072,
"Memory": 521011200,
"Streams": 4611686018427388000,
"StreamsInbound": 592,
"StreamsOutbound": 4611686018427388000
}
/ # ipfs swarm stats transient
{
"Transient": {
"Conns": 0,
"ConnsInbound": 0,
"ConnsOutbound": 0,
"FD": 0,
"Memory": 0,
"Streams": 1,
"StreamsInbound": 0,
"StreamsOutbound": 1
}
}
My server is big enough for IPFS and has plenty of resources available.
@rotarur are you getting errors like "Resource limits were exceeded 261 times with error..."? Can you paste them here so we can see which RM level is being hit? If there are no errors like these, it is a different problem.
@ajnavarro LOL. I think you just found the issue.
ipfs config --json Swarm.ResourceMgr.Limits.System.FD: 8192
That's the command I used to set FD. Isn't FD a reserved var in golang?
EDIT: Never mind. I just realized the syntax is indeed ipfs config ... without the ":". So it looks like a little validation needs to happen here; ipfs can't handle empty or invalid ResourceMgr limit keys?
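For reference, the corrected commands follow the same pattern without the colon in the key name, e.g.:
ipfs config --json Swarm.ResourceMgr.Limits.System.FD 8192
ipfs config --json Swarm.ResourceMgr.Limits.System.Streams 16384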
@ajnavarro I don't have any errors like "Resource limits were exceeded 261 times with error..."
Can you configure the number of connections according to protocol priority? For example: /p2p/id/delta/1.0.0, /ipfs/id/1.0.0, /ipfs/id/push/1.0.0, /ipfs/ping/1.0.0, /libp2p/circuit/relay/0.1.0, /libp2p/circuit/relay/0.2.0/stop, /ipfs/lan/kad/1.0.0, /libp2p/autonat/1.0.0, /ipfs/bitswap/1.2.0, /ipfs/bitswap/1.1.0, /ipfs/bitswap/1.0.0, /ipfs/bitswap, /meshsub/1.1.0, /meshsub/1.0.0, /floodsub/1.0.0, /x/, /asmb/maons/1.0.0
Potentially a controversial take, but this feels like a UX problem, and not a problem with the default limits.
Printing an error every 10 seconds when any limit is hit is a bit hardcore. We have limits for a reason. They are a feature. This constant ERROR messaging makes them feel like "an error that needs to be solved by raising/removing limits", which is imo a UX antipattern.
:point_right: ResourceMgr protecting the user and working as expected should not look like an ERROR.
Quick ideas that would remove the need for relaxing the default limits:
Reword the message: replace "Resource limits were exceeded" with "ResourceMgr protected node from exceeding resource limits".
Add a Swarm.ResourceMgr.VerboseEnforcement flag set to true by default; when set to false it removes the ERRORs or moves these messages to the WARN or DEBUG level. The message could then end with "To remove this message set Swarm.ResourceMgr.VerboseEnforcement to false" (most users won't be able to reason about which limits should be changed, and would just like the message to go away – we would want to suppress this log spam in GUI apps like ipfs-desktop and Brave too).
@lidel if we bump errors to be printed every hour, you won't notice when you really are hitting limits. Say you hit limits at 1:01 PM: until 2:00 PM you won't get any kind of information, and conversely you may still see a warning or error at a time when the limits are fine, just because you hit them an hour earlier.
We can make it possible to silence the error output, but the problem won't disappear. Nodes with better hardware will struggle to handle incoming connections even if they have enough real resources, and small nodes will behave as if they don't have any resource manager active, because the default limits will be too high for them.
My proposal is to only set default resource manager limits per peer. It is the only limit we can set by default knowing it will be right for any user and any hardware: https://github.com/ipfs/kubo/pull/9443 The ideal solution would be to limit to only one connection per IP, but that is not straightforward.
Ack on reporting this early – fair enough. For now, I proposed a cosmetic message adjustment in https://github.com/ipfs/kubo/pull/9444, but we still suggest the user raise the limit – and that is the only way for the log spam to go away.
I am not sold on per IP/peer ID limits being enough. An adversary could:
Having a default global limit for incoming connections is really useful.
Unsure if there is a silver bullet, but a UX solution feels way less risky.
I would add a Swarm.ResourceMgr.VerboseEnforcement flag as a way to move these messages out of the ERROR log level, and keep the global limit just to be safe, but maybe there is a better way?
@lidel any idea though on why I'm seeing "random" limits when ResourceMgr is {} on upgrades?
That feels like a bug. When ResourceMgr is {}, the values seemingly get set randomly. Thinking it's an upgrade issue.
This bug can be reproduced by:
ipfs config --json Swarm.ResourceMgr '{}'
ipfs swarm limit system
Ok, there's lots to unpack here...
Below are the problems I'm seeing...
https://github.com/ipfs/kubo/issues/9432#issuecomment-1327647257 and other comments said they disabled the resource manager but are still seeing messages in the logs. In that comment, we can see it's disabled in the config:
"Swarm": {
"ResourceMgr": {
"Enabled": false
},
4611686018427388000 is actually not a magic value. It is effectively "infinity" and is defined here: https://github.com/ipfs/kubo/blob/master/core/node/libp2p/rcmgr_defaults.go#L15
When Swarm.ResourceMgr is set to {}, I believe we are setting the default values as described in https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr.
There is confusion about what messages like "system: cannot reserve inbound connection: resource limit exceeded" mean. For this example, it means Swarm.ResourceMgr.Limits.System.ConnsInbound is exceeded. It would be nice if the value from ipfs swarm limit system was included.
When a resource limit is hit, we point users to https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr. It's clear from the feedback here that the docs there aren't actionable enough.
I don't think we should discuss this more or pursue it. As discussed in https://github.com/ipfs/kubo/issues/9432#issuecomment-1334482153, it is ineffective and impacts NATs (especially large organizations/enterprises which have all their traffic coming from behind a NAT).
There is good commentary on this in https://github.com/ipfs/kubo/issues/9432#issuecomment-1334160936.
I agree with this sentiment in general. go-libp2p bounding the resources it uses is generally a feature, and the presence of a message doesn't necessarily mean there's a bug.
That said, if by default our limits are crazy low, then I would call it a bug. For example, if Swarm.ResourceMgr.Limits.System.ConnsInbound was set to "1" by default, I would consider it a bug because this would mean we'd only allow 1 inbound connection.
Using https://github.com/ipfs/kubo/issues/9432#issuecomment-1331177613 as an example, Swarm.ResourceMgr.Limits.System.ConnsInbound is set to 123. This is derived from Swarm.ResourceMgr.MaxMemory. I assume @kallisti5 didn't set a MaxMemory value, so the default of [TOTAL_SYSTEM_MEMORY]/8 was used per https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgrmaxmemory. (In this case TOTAL_SYSTEM_MEMORY looks to be around ~16GB, since 1999292928*8/(1024*1024) ≈ 15,253 MiB.)
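As a quick sanity check of that math (plain shell arithmetic, not a kubo command):
echo $((1999292928 * 8))                   # 15994343424 bytes
echo $((1999292928 * 8 / 1024 / 1024))     # ~15253 MiB, i.e. roughly 16GB of system memory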
For all people who are having resource manager errors, could you try executing the following command? ipfs config Swarm.ResourceMgr.MaxMemory "HALF_TOTAL_MEMORY", where HALF_TOTAL_MEMORY is a string value like 16GB.
Complete command:
ipfs config Swarm.ResourceMgr.MaxMemory "16GB"
The default value right now is 1/8 of the total memory. Setting this value to 1/2 of the entire node memory will increase the number of inbound connections allowed and hopefully reduce resource manager log errors.
Note that after executing the command, you need to restart the node for it to take effect.
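For example, on a machine with 32GB of RAM the sequence might look like this (free -h is just one way to check total memory on Linux; pick the value that matches your host):
free -h                                         # check total system memory
ipfs config Swarm.ResourceMgr.MaxMemory "16GB"  # roughly half of the total
ipfs config Swarm.ResourceMgr.MaxMemory         # confirm the stored value
# restart the daemon, then check the scaled limits:
ipfs swarm limit system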
Friendly ping for anyone affected here to please try increasing Swarm.ResourceMgr.MaxMemory as directed in https://github.com/ipfs/kubo/issues/9432#issuecomment-1337133633. We'd like to get a signal on how much this alleviates the problems, given we're intending to include fixes in the 0.18 RC for 2022-12-08.
@BigLep: A use case for consideration. I run multiple very large (some 2TB+) ipfs servers on VMs and sometimes change the memory allocated to them.
Fixing the memory to half of the current memory is something I'm reluctant to do, because I may have a server using 16GB but, due to host hardware constraints, from time to time I may need to reduce it to 8GB, or expand it.
I would like the services running on the system to self-adjust to such changes and utilize the available memory.
Setting the Connection Limits based on system memory is a great idea. However in my case, these are dedicated ipfs servers so the 1/8 of system memory default is too small. On the other hand 1/2 of system memory is too big for the ipfs server I'm running on my development workstation.
A literal macro type setting such as, HALF_TOTAL_MEMORY, EIGHTH_TOTAL_MEMORY seems like would be optimal for my use cases rather than fixing these limits in the config file.
The init profile for servers could apply HALF_TOTAL_MEMORY, and the default could remain at an eighth.
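Something like the following (purely hypothetical syntax, not supported by kubo today) would express that:
"ResourceMgr": {
  "MaxMemory": "HALF_TOTAL_MEMORY"
}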
ipfs config --json Swarm.ResourceMgr '{}'
ipfs config Swarm.ResourceMgr.MaxMemory "8GB"
Results in:
$ ipfs swarm limit system
{
"Conns": 4611686018427388000,
"ConnsInbound": 540,
"ConnsOutbound": 4611686018427388000,
"FD": 32768,
"Memory": 15999586304,
"Streams": 4611686018427388000,
"StreamsInbound": 8653,
"StreamsOutbound": 4611686018427388000
}
I'm still seeing resourcemanager errors.
$ ipfs swarm stats system
{
"System": {
"Conns": 768,
"ConnsInbound": 537,
"ConnsOutbound": 231,
"FD": 589,
"Memory": 64024608,
"Streams": 567,
"StreamsInbound": 250,
"StreamsOutbound": 317
}
}
I also got this message during startup, which I think is new: ipfs[2738]: 2022/12/06 10:16:44 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
System memory is 16GB
Thanks for sharing @Derrick- .
"Memory": 15999586304,
This seems odd given ipfs config Swarm.ResourceMgr.MaxMemory "8GB"
. It looks like 16GB was actually being passed.
I'm still seeing resourcemanager errors.
It would be helpful to know the error message so we can confirm, but it looks like it would be for System.ConnsInbound (which is a value scaled based on Swarm.ResourceMgr.MaxMemory).
I also got this message during startup, which I think is new
I personally don't know what this is, but it's likely separate and won't be triaged here.
We're tracking the various followups we're active on here: https://github.com/ipfs/kubo/issues/9442 . One thing we expect will help is to lower your ConnMgr limits below System.ConnsInbound. There is some discussion about this in https://github.com/ipfs/kubo/pull/9468 (search for "How does the resource manager (ResourceMgr) relate to the connection manager (ConnMgr)?")
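As a rough sketch (assuming the standard Swarm.ConnMgr watermark settings and the System.ConnsInbound value of 540 from the output above), lowering the connection manager below that limit would look something like:
ipfs config --json Swarm.ConnMgr.LowWater 300
ipfs config --json Swarm.ConnMgr.HighWater 450
# restart the daemon for the change to take effect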
@BigLep Here are updated stats after a few hours of runtime. I don't understand the Memory limit result; it does seem to claim 16GB.
$ ipfs swarm stats system
{
"System": {
"Conns": 776,
"ConnsInbound": 447,
"ConnsOutbound": 329,
"FD": 507,
"Memory": 167321696,
"Streams": 1194,
"StreamsInbound": 594,
"StreamsOutbound": 600
}
}
$ ipfs swarm limit system
{
"Conns": 4611686018427388000,
"ConnsInbound": 540,
"ConnsOutbound": 4611686018427388000,
"FD": 32768,
"Memory": 15999586304,
"Streams": 4611686018427388000,
"StreamsInbound": 8653,
"StreamsOutbound": 4611686018427388000
}
And to confirm, here's the excerpt from my config file under "Swarm":
"ResourceMgr": {
"MaxMemory": "8GB"
}
The service was restarted after the config file change this morning, and the ConnsInbound limit did increase, both from the previously hardcoded 500 and from the earlier default level of < 200 that I saw after ResourceMgr was cleared, now that only MaxMemory is set.
And here are the console errors from journal:
Dec 06 21:19:39 video2 ipfs[2738]: 2022-12-06T21:19:39.854-0500 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 15 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Dec 06 21:19:39 video2 ipfs[2738]: 2022-12-06T21:19:39.854-0500 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
Dec 06 21:19:49 video2 ipfs[2738]: 2022-12-06T21:19:49.868-0500 ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 12 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Dec 06 21:19:49 video2 ipfs[2738]: 2022-12-06T21:19:49.868-0500 ERROR resourcemanager libp2p/rcmgr_logging.go:57 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr
I'm not surprised to get more connections to this server than it can handle; there's a lot of good stuff here. I agree with the previous sentiments that these errors are too much noise, though, especially since they are logged at the ERROR level.
Yeah, this error is noisy, and my system limits are never reached either:
# ipfs swarm stats system; ipfs swarm limit system
{
"System": {
"Conns": 632,
"ConnsInbound": 2,
"ConnsOutbound": 630,
"FD": 183,
"Memory": 71340064,
"Streams": 1082,
"StreamsInbound": 77,
"StreamsOutbound": 1005
}
}
{
"Conns": 1024,
"ConnsInbound": 2048,
"ConnsOutbound": 1024,
"FD": 4512,
"Memory": 1073741824,
"Streams": 16384,
"StreamsInbound": 4096,
"StreamsOutbound": 16384
}
Sharing some more information. Should I enable the resource manager again as suggested in https://github.com/ipfs/kubo/issues/9432#issuecomment-1337133633?
4:43:35.743Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWBbgvmGKKwUtr7dSvaZTo4kZc3tSkgYGGjKC96Mbft4Jn","error":"Application error 0x0: conn-4801317: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:35.744Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWBbgvmGKKwUtr7dSvaZTo4kZc3tSkgYGGjKC96Mbft4Jn: Application error 0x0: conn-4801317: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:35.783Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWBbgvmGKKwUtr7dSvaZTo4kZc3tSkgYGGjKC96Mbft4Jn","error":"Application error 0x0: conn-4801318: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:35.783Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWBbgvmGKKwUtr7dSvaZTo4kZc3tSkgYGGjKC96Mbft4Jn: Application error 0x0: conn-4801318: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:35.818Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWBbgvmGKKwUtr7dSvaZTo4kZc3tSkgYGGjKC96Mbft4Jn","error":"Application error 0x0: conn-4801319: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:35.818Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWBbgvmGKKwUtr7dSvaZTo4kZc3tSkgYGGjKC96Mbft4Jn: Application error 0x0: conn-4801319: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:35.857Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWGikAcdxMVbZetD5E1GJgWEtTM9oVwjfQKXkuQzRk9Xuo","error":"Application error 0x0: conn-393542: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:35.857Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWGikAcdxMVbZetD5E1GJgWEtTM9oVwjfQKXkuQzRk9Xuo: Application error 0x0: conn-393542: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:35.936Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWPvhLFyxpgMGMKQ2TUQm4JKdnb7QLb16C3z7xPyauZ1tm","error":"Application error 0x0: conn-8013505: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:35.937Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWPvhLFyxpgMGMKQ2TUQm4JKdnb7QLb16C3z7xPyauZ1tm: Application error 0x0: conn-8013505: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:35.981Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWLTgLVtTXANfnsibLb4tRQ3dVGn8WQp1rDgxBqyG8JczL","error":"Application error 0x0: conn-2481306: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:35Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:35.983Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWLTgLVtTXANfnsibLb4tRQ3dVGn8WQp1rDgxBqyG8JczL: Application error 0x0: conn-2481306: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:36.019Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWBAf6guqxSuGRdJoCSBfXhXhz1LfgqBgJnyzJkRZa3MAs","error":"Application error 0x0: conn-737991: system: cannot reserve connection: resource limit exceeded"}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:36.019Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWBAf6guqxSuGRdJoCSBfXhXhz1LfgqBgJnyzJkRZa3MAs: Application error 0x0: conn-737991: system: cannot reserve connection: resource limit exceeded"}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:36.219Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWBAf6guqxSuGRdJoCSBfXhXhz1LfgqBgJnyzJkRZa3MAs","error":"Application error 0x0: conn-738007: system: cannot reserve connection: resource limit exceeded"}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:36.220Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWBAf6guqxSuGRdJoCSBfXhXhz1LfgqBgJnyzJkRZa3MAs: Application error 0x0: conn-738007: system: cannot reserve connection: resource limit exceeded"}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:36.248Z","logger":"canonical-log","caller":"swarm/swarm_dial.go:487","msg":"CANONICAL_PEER_STATUS: peer=12D3KooWLTgLVtTXANfnsibLb4tRQ3dVGn8WQp1rDgxBqyG8JczL addr=/ip4/188.166.184.94/udp/4001/quic sample_rate=100 connection_status=\"established\" dir=\"outbound\""}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:36.419Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWBAf6guqxSuGRdJoCSBfXhXhz1LfgqBgJnyzJkRZa3MAs","error":"Application error 0x0: conn-738018: system: cannot reserve connection: resource limit exceeded"}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:36.420Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWBAf6guqxSuGRdJoCSBfXhXhz1LfgqBgJnyzJkRZa3MAs: Application error 0x0: conn-738018: system: cannot reserve connection: resource limit exceeded"}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"info","ts":"2022-12-07T14:43:36.505Z","logger":"net/identify","caller":"identify/id.go:369","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWLTgLVtTXANfnsibLb4tRQ3dVGn8WQp1rDgxBqyG8JczL","error":"Application error 0x0: conn-2481310: system: cannot reserve inbound connection: resource limit exceeded"}
2022-12-07T14:43:36Z app[7b585d20] fra [info]{"level":"warn","ts":"2022-12-07T14:43:36.505Z","logger":"net/identify","caller":"identify/id.go:334","msg":"failed to identify 12D3KooWLTgLVtTXANfnsibLb4tRQ3dVGn8WQp1rDgxBqyG8JczL: Application error 0x0: conn-2481310: system: cannot reserve inbound connection: resource limit exceeded"}
/ # ipfs version
ipfs version 0.17.0
/ # ipfs config show
{
"API": {
"HTTPHeaders": {}
},
"Addresses": {
"API": [
"/ip4/0.0.0.0/tcp/5001",
"/ip6/::/tcp/5001"
],
"Announce": [],
"AppendAnnounce": [
"/ip4/168.220.93.39/tcp/4001",
"/ip4/168.220.93.39/tcp/4002/ws",
"/dns4/my-ipfs-node.fly.dev/tcp/443/wss"
],
"Gateway": "/ip4/0.0.0.0/tcp/8080",
"NoAnnounce": [
"/ip4/10.0.0.0/ipcidr/8",
"/ip4/100.64.0.0/ipcidr/10",
"/ip4/169.254.0.0/ipcidr/16",
"/ip4/172.16.0.0/ipcidr/12",
"/ip4/192.0.0.0/ipcidr/24",
"/ip4/192.0.2.0/ipcidr/24",
"/ip4/192.168.0.0/ipcidr/16",
"/ip4/198.18.0.0/ipcidr/15",
"/ip4/198.51.100.0/ipcidr/24",
"/ip4/203.0.113.0/ipcidr/24",
"/ip4/240.0.0.0/ipcidr/4",
"/ip6/100::/ipcidr/64",
"/ip6/2001:2::/ipcidr/48",
"/ip6/2001:db8::/ipcidr/32",
"/ip6/fc00::/ipcidr/7",
"/ip6/fe80::/ipcidr/10"
],
"Swarm": [
"/ip4/0.0.0.0/tcp/4001",
"/ip4/0.0.0.0/tcp/4002/ws",
"/ip4/0.0.0.0/udp/4003/quic/webtransport",
"/ip6/::/tcp/4001",
"/ip6/::/tcp/4002/ws",
"/ip6/::/udp/4003/quic/webtransport",
"/ip4/0.0.0.0/udp/4001/quic",
"/ip6/::/udp/4001/quic"
]
},
"AutoNAT": {},
"Bootstrap": [
"/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
"/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
"/ip4/104.131.131.82/udp/4001/quic/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
"/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
"/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
"/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb"
],
"DNS": {
"Resolvers": {}
},
"Datastore": {
"BloomFilterSize": 0,
"GCPeriod": "1h",
"HashOnRead": false,
"Spec": {
"mounts": [
{
"child": {
"path": "blocks",
"shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
"sync": true,
"type": "flatfs"
},
"mountpoint": "/blocks",
"prefix": "flatfs.datastore",
"type": "measure"
},
{
"child": {
"compression": "none",
"path": "datastore",
"type": "levelds"
},
"mountpoint": "/",
"prefix": "leveldb.datastore",
"type": "measure"
}
],
"type": "mount"
},
"StorageGCWatermark": 90,
"StorageMax": "10GB"
},
"Discovery": {
"MDNS": {
"Enabled": false
}
},
"Experimental": {
"AcceleratedDHTClient": false,
"FilestoreEnabled": false,
"GraphsyncEnabled": false,
"Libp2pStreamMounting": false,
"P2pHttpProxy": false,
"StrategicProviding": false,
"UrlstoreEnabled": false
},
"Gateway": {
"APICommands": [],
"HTTPHeaders": {
"Access-Control-Allow-Headers": [
"X-Requested-With",
"Range",
"User-Agent"
],
"Access-Control-Allow-Methods": [
"GET"
],
"Access-Control-Allow-Origin": [
"*"
]
},
"NoDNSLink": false,
"NoFetch": false,
"PathPrefixes": [],
"PublicGateways": null,
"RootRedirect": "",
"Writable": false
},
"Identity": {
"PeerID": "12D3KooWAp58z5DeiQSVUXdeqgyLjvkcxgph9Pn2xZ9D1yWzHPCV"
},
"Internal": {},
"Ipns": {
"RecordLifetime": "",
"RepublishPeriod": "",
"ResolveCacheSize": 128
},
"Migration": {
"DownloadSources": [],
"Keep": ""
},
"Mounts": {
"FuseAllowOther": false,
"IPFS": "/ipfs",
"IPNS": "/ipns"
},
"Peering": {
"Peers": null
},
"Pinning": {
"RemoteServices": {}
},
"Plugins": {
"Plugins": null
},
"Provider": {
"Strategy": ""
},
"Pubsub": {
"DisableSigning": false,
"Router": ""
},
"Reprovider": {
"Interval": "12h",
"Strategy": "all"
},
"Routing": {
"Methods": null,
"Routers": null,
"Type": "dht"
},
"Swarm": {
"AddrFilters": [
"/ip4/10.0.0.0/ipcidr/8",
"/ip4/100.64.0.0/ipcidr/10",
"/ip4/169.254.0.0/ipcidr/16",
"/ip4/172.16.0.0/ipcidr/12",
"/ip4/192.0.0.0/ipcidr/24",
"/ip4/192.0.2.0/ipcidr/24",
"/ip4/192.168.0.0/ipcidr/16",
"/ip4/198.18.0.0/ipcidr/15",
"/ip4/198.51.100.0/ipcidr/24",
"/ip4/203.0.113.0/ipcidr/24",
"/ip4/240.0.0.0/ipcidr/4",
"/ip6/100::/ipcidr/64",
"/ip6/2001:2::/ipcidr/48",
"/ip6/2001:db8::/ipcidr/32",
"/ip6/fc00::/ipcidr/7",
"/ip6/fe80::/ipcidr/10"
],
"ConnMgr": {},
"DisableBandwidthMetrics": false,
"DisableNatPortMap": true,
"EnableHolePunching": true,
"RelayClient": {
"Enabled": true
},
"RelayService": {},
"ResourceMgr": {
"Enabled": false
},
"Transports": {
"Multiplexers": {},
"Network": {
"WebTransport": true,
"Websocket": true
},
"Security": {}
}
}
}
/ # ipfs swarm stats system
{
"System": {
"Conns": 0,
"ConnsInbound": 0,
"ConnsOutbound": 0,
"FD": 0,
"Memory": 0,
"Streams": 0,
"StreamsInbound": 0,
"StreamsOutbound": 0
}
}
/ # ipfs swarm limit system
Error: missing ResourceMgr: make sure the daemon is running with Swarm.ResourceMgr.Enabled
/ # ipfs swarm peers | wc -l
460
@2color Yes. These errors are from other peers you are connected to. You should enable ResourceManager to avoid DoS attacks. Here are all the pending issues on RM:
This one is the specific one for your case:
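To re-enable it (the inverse of the disable command mentioned earlier in this thread), run the following and restart the daemon:
ipfs config --json Swarm.ResourceMgr.Enabled true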
@Derrick-
I don't understand the Memory limit result; it does seem to claim 16GB.
Doh - that's a bug. It's being fixed here: https://github.com/ipfs/kubo/pull/9470
@rotarur
yeah this error is noisy and my system limits are never reached as well:
ACK on this "error" being noisy. This is being tracked in https://github.com/ipfs/kubo/issues/9442 and handled in https://github.com/ipfs/kubo/pull/9472
That said, in your output you only shared the "system" scope. There are other scopes that could be exceeded. Ideally you would be able to run ipfs swarm stats --min-used-limit-perc=90 all to pinpoint this, but there is a bug (doh!): https://github.com/ipfs/kubo/issues/9473
In the interim, you can inspect the log message more to understand what scope and what limit within it is being breached. We have some docs drafted in https://github.com/ipfs/kubo/pull/9468 to help with this. Search for "What do these "Protected from exceeding resource limits" log messages mean?"
@2color Yeah, this is understandably confusing. To help alleviate until things are clarified on the go-libp2p side, we have docs in https://github.com/ipfs/kubo/pull/9468. Search for "What are the "Application error ... cannot reserve ..." messages?". Given the challenges we've been having with resource manager enablement in Kubo, we'd certainly welcome your feedback on that PR.
Thanks @BigLep ❤️ Your documentation ("What do these "Protected from exceeding resource limits" log messages mean?") helped me understand where this error comes from, in this line: "This can be confusing, but these `Application error ... cannot reserve ...` messages can occur even if your local node has the resource manager disabled."
So my errors are from remote nodes and not mine. How can this error be addressed?
Regarding the earlier suggestion to add the "ResourceMgr" limits block to the "Swarm" section of your config: add it without the trailing comma, as the comma will cause errors in the config file:
"ResourceMgr": {
"Limits": {
"System": {
"Memory": 1073741824,
"FD": 512,
"Conns": 1024,
"ConnsInbound": 1024,
"ConnsOutbound": 1024,
"Streams": 16384,
"StreamsInbound": 4096,
"StreamsOutbound": 16384
}
}
}
I'm going to resolve this since we aren't getting new reports here and we have the fixes mostly handled (and at least being tracked in https://github.com/ipfs/kubo/issues/9442 ). If after 0.18 we get new issues, we'll coalesce and create additional resulting issues.
I'm still seeing this after updating to 0.18;
My setup is a publicly routable node with a System ConnsInbound limit of 5000, which is not being hit at all.
None of my limits are even close to being hit, and I'm also seeing these errors immediately after the daemon has finished booting:
Jan 23 20:40:02 infinistore ipfs[9783]: API server listening on /ip4/127.0.0.1/tcp/5001
Jan 23 20:40:02 infinistore ipfs[9783]: WebUI: http://127.0.0.1:5001/webui
Jan 23 20:40:02 infinistore ipfs[9783]: 2023-01-23T20:40:02.637+0100 WARN net/identify identify/id.go:334 failed to identify 12D3KooWMzGDXiDayMjvYNqRcgpyixCYKrtT71ehoVEDt1VBQpwk: stream reset
Jan 23 20:40:02 infinistore ipfs[9783]: 2023-01-23T20:40:02.996+0100 WARN net/identify identify/id.go:334 failed to identify QmdwQTkGHb6ewS4A9XYtcWkuC9GvFGKBiPJ2EyrLeNAqWb: stream reset
Jan 23 20:40:03 infinistore ipfs[9783]: 2023-01-23T20:40:03.162+0100 WARN net/identify identify/id.go:334 failed to identify QmNyLtNKnLXDkLssibaKdZriVMjbsGajZfTL34pt23AzGL: stream reset
Jan 23 20:40:03 infinistore ipfs[9783]: 2023-01-23T20:40:03.202+0100 WARN net/identify identify/id.go:334 failed to identify Qmbut9Ywz9YEDrz8ySBSgWyJk41Uvm2QJPhwDJzJyGFsD6: Application error 0x0 (remote): conn-5019347: system: cannot reserve inbound connection: resource limit exceeded
Jan 23 20:40:03 infinistore ipfs[9783]: 2023-01-23T20:40:03.294+0100 WARN net/identify identify/id.go:334 failed to identify 12D3KooWKapSEuNYwxZVnWs9uJEQSmFprwuQwzfoByx1KUJeD1XA: Application error 0x0 (remote): conn-2480692: system: cannot reserve inbound connection: resource limit exceeded
Jan 23 20:40:03 infinistore ipfs[9783]: 2023-01-23T20:40:03.303+0100 WARN net/identify identify/id.go:334 failed to identify 12D3KooWG7HY6VLRQCoipwuhBNSB7mx4tHmHubLH6v1uRpTSgbnX: Application error 0x0 (remote): conn-1684852: system: cannot reserve inbound connection: resource limit exceeded
Jan 23 20:40:03 infinistore ipfs[9783]: 2023-01-23T20:40:03.338+0100 WARN net/identify identify/id.go:334 failed to identify 12D3KooWBN47Kk6J5CFGBLxNXm1jL8MEZitYLzco9be3pAfizwTp: Application error 0x0 (remote): conn-5283888: system: cannot reserve inbound connection: resource limit exceeded
Jan 23 20:40:03 infinistore ipfs[9783]: 2023-01-23T20:40:03.372+0100 WARN net/identify identify/id.go:334 failed to identify 12D3KooWMMWmcwP6DDfwHmFp1QYZH2GcFN3SGiAz7wyts9MTcFsZ: Application error 0x0 (remote): conn-4626232: system: cannot reserve connection: resource limit exceeded
Jan 23 20:40:03 infinistore ipfs[9783]: 2023-01-23T20:40:03.384+0100 WARN net/identify identify/id.go:334 failed to identify QmRHbKAb6HuVWGWjs3SiSCAQ2m87TLBdxwsFvnNxo45BDb: stream reset
I have my log level set to warn, and I haven't spotted anything like a "we have suppressed 123 limit messages" notice.
I'm downgrading to 0.17 again, since https://github.com/ipfs-cluster/ipfs-cluster/issues/1835 makes 0.18 unusable for me.
I had the same problem, and it turned out that not all the ports were properly opened in ufw.
I'm running on Debian 10. @sven-hash could you elaborate on that? What do you mean by "not properly opened": that only some ports were open, or that some ports had specific filtering?
Port 4001/UDP wasn't open in the security group on the VPS cloud provider's side, so it was not accessible, and after some time the cluster crashed. Since I opened all the ports correctly for both UDP and TCP, I have not had a crash.
I just checked; my VPS provider (Hetzner) doesn't seem to have any firewall rules or similar in place.
I have since changed the log level to error, and after 9 hours my pinning process is going well, so for me the matter is relatively solved.
FWIW, yesterday I occasionally ran ipfs ping against my own node and observed the following:
jonathan@os-mule:~$ ipfs ping 12D3xxx
PING 12D3xxx.
Ping error: stream reset
Ping error: stream reset
Ping error: stream-12855: resource scope closed
Ping error: stream-12855: resource scope closed
Ping error: stream-12855: resource scope closed
Ping error: stream-12855: resource scope closed
Ping error: stream-12855: resource scope closed
Ping error: stream-12855: resource scope closed
Ping error: stream-12855: resource scope closed
Ping error: stream-12855: resource scope closed
Error: ping failed
This was while the system stats were way under the limits, so I don't know why the other side would close this stream early. This also happened simultaneously with a staggered pinning process, so I don't know exactly what happened there.
Pinging now seems to go alright. Inbound connections are at about 800-1000, after reaching a high of 3000 yesterday.
I wonder if, due to the churn of restarting IPFS a bunch of times, the Linux network buffers got overloaded. I did not get any kernel messages or the like, but for a while reaching the node was very unreliable.
After seeing this problem with 0.18-RC1, I upgraded to 0.18 today, and reproduced the same problem as reported by @ShadowJonathan and by me earlier on two different machines.
Machine 1 (running on fly.io): 2GB RAM, default 0.18 config
@2color @ShadowJonathan Please have a look at the "Theme 3: improve RM errors coming from other peers" section here: https://github.com/ipfs/kubo/issues/9442
TLDR: These errors come from remote peers that are hitting their Resource Manager limits, not from the local node. Note the (remote): flag on the errors, right before the connection error.
Ah, I had that suspicion, but I wasn't entirely sure if this was indeed the case. Thanks for pointing out the (remote) marker; that makes those log entries a lot more understandable, as it was US daylight hours, with those nodes most likely more overloaded at that time 😅
That addresses all my concerns personally, except the stall on ping during those hours, which is still a mystery to me.
The log entries could be made clearer if (remote) were turned into (from remote); otherwise it's easy to read this as being related to something remote, rather than as coming from the remote node. I'll echo this in the issue you linked as well.
Checklist
Installation method
built from source
Version
Config
Description
Trying to run IPFS on VPS.
Error description: the node starts, successfully announces its swarm addresses, and port 4001 is exposed externally (I checked), but after a few seconds it closes and I get this error message:
ERROR resourcemanager libp2p/rcmgr_logging.go:53 Resource limits were exceeded 496 times with error "system: cannot reserve inbound connection: resource limit exceeded".
Then the port cannot be reached anymore.
The error occurred both with Docker (20.10.21) and with the installation from binaries.