Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[BUG] The configured user limit (1024) on the number of inotify instances has been reached #3714

Open sdwerwed opened 1 year ago

sdwerwed commented 1 year ago

Describe the bug: Once we deploy more than 3 .NET pods in AKS, we get the following error:

  Unhandled exception. System.IO.IOException: The configured user limit (1024) on the number of inotify instances has been reached, or the per-process limit on the number of open file descriptors has been reached.
  at System.IO.FileSystemWatcher.StartRaisingEvents()
  at System.IO.FileSystemWatcher.StartRaisingEventsIfNotDisposed()
  at System.IO.FileSystemWatcher.set_EnableRaisingEvents(Boolean value)
  at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.TryEnableFileSystemWatcher()
  at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.CreateFileChangeToken(String filter)
  at Microsoft.Extensions.FileProviders.PhysicalFileProvider.Watch(String filter)
  at Microsoft.Extensions.Configuration.FileConfigurationProvider.<.ctor>b__1_0()
  at Microsoft.Extensions.Primitives.ChangeToken.OnChange(Func`1 changeTokenProducer, Action changeTokenConsumer)
  at Microsoft.Extensions.Configuration.FileConfigurationProvider..ctor(FileConfigurationSource source)
  at Microsoft.Extensions.Configuration.Json.JsonConfigurationSource.Build(IConfigurationBuilder builder)
  at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
  at Microsoft.Extensions.Hosting.HostBuilder.BuildAppConfiguration()
  at Microsoft.Extensions.Hosting.HostBuilder.Build()
  at xxxxxxxxxxxxxxx.Platform.AuthServer.Program.Main(String[] args) in /src/xxxxxxxxxxxxxxx.Platform.AuthServer/Program.cs:line 14

Each pod consumes about 550 inotify instances.
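
A per-process figure like this can be checked on a node by counting the file descriptors under `/proc` whose link target is `anon_inode:inotify`. The sketch below is a minimal illustration of that approach, not necessarily how the number above was measured. The function name `count_inotify_instances` and its `proc_root` parameter are hypothetical, and it needs enough privileges to read other processes' `fd` directories.

```python
import os
from collections import Counter


def count_inotify_instances(proc_root="/proc"):
    """Return a Counter mapping PID -> number of inotify instances it holds."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric directories are processes
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:  # process exited, no fd directory, or permission denied
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:  # descriptor was closed while scanning
                continue
            if target == "anon_inode:inotify":
                counts[int(entry)] += 1
    return counts


if __name__ == "__main__":
    # Print the ten processes holding the most inotify instances.
    for pid, n in count_inotify_instances().most_common(10):
        print(f"{pid}\t{n}")
```

Because the limit is enforced per user ID, the totals that matter are the sums across all processes running under the same UID on the node.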

Expected behavior: I would expect to be able to run multiple .NET pods in AKS, or to be able to configure sysctl -w fs.inotify.max_user_instances=1048576 so that I can create more than 3 pods per node.
According to the official documentation, fs.inotify.max_user_watches is supported but fs.inotify.max_user_instances is not.

Possible Solution: Add fs.inotify.max_user_instances to the Linux custom OS configuration settings.

Workaround: As a workaround, we run a DaemonSet with root access that executes sysctl -w fs.inotify.max_user_instances=1048576. However, this has drawbacks: pods fail to start if the DaemonSet has not been scheduled on the node first, it weakens the security of the cluster because we do not want to run any pod with root access, and it increases operational costs.
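
For reference, `sysctl -w fs.inotify.max_user_instances=...` amounts to writing the value into `/proc/sys/fs/inotify/max_user_instances`. The sketch below shows roughly what such a privileged container would do. It is an illustration under assumptions, not the exact DaemonSet used here: the function name `raise_inotify_instances` and its `path` parameter are hypothetical, and the write only succeeds when run as root, which is the drawback described above.

```python
INOTIFY_INSTANCES_PATH = "/proc/sys/fs/inotify/max_user_instances"


def raise_inotify_instances(desired, path=INOTIFY_INSTANCES_PATH):
    """Raise the limit to `desired` if it is currently lower; return the final value."""
    with open(path) as f:
        current = int(f.read().strip())
    if current >= desired:
        return current  # never lower a limit that is already high enough
    with open(path, "w") as f:
        f.write(str(desired))
    return desired


if __name__ == "__main__":
    # 1048576 is the value used in the sysctl command above.
    print(raise_inotify_instances(1048576))
```

Only ever raising the value keeps the script safe to re-run on nodes where the limit has already been increased.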

dinfdsooff commented 1 year ago

This is a blocker for us as well. fs.inotify.max_user_watches is supported but fs.inotify.max_user_instances is not. We would like this as soon as possible. Related to https://github.com/Azure/AKS/issues/772

ghost commented 1 year ago

Action required from @Azure/aks-pm

thiDucTran commented 1 year ago

We are also facing this issue.

ghost commented 1 year ago

Issue needing attention of @Azure/aks-leads

wangyira commented 1 year ago

@justindavies could you help take a look?

Kenneth-Abrams commented 5 months ago

We’re running into this exact problem as well. Can we get some attention on this issue and determine whether it’s an issue with dotnet or with AKS?

haitch commented 1 month ago

@juan-lee

allyford commented 4 weeks ago

Updating this as a feature request for "Add the fs.inotify.max_user_instances in the Linux custom OS configuration settings".