kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Memory leak kube-apiserver 1.8.1 #54217

Closed: ese closed this issue 7 years ago

ese commented 7 years ago

/kind bug
/sig api-machinery

@kubernetes/sig-apimachinery-bugs

What happened: kube-apiserver consumes more and more memory until it exhausts all available resources.

What you expected to happen: kube-apiserver should consume roughly the same amount of memory over time, since there is no change in API requests or cluster load.

How to reproduce it (as minimally and precisely as possible): Run a cluster with kops on AWS (manifest provided at the end of the issue).
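
One way to watch the growth is to poll the apiserver's own /metrics endpoint for process_resident_memory_bytes, the standard Prometheus Go-process metric that kube-apiserver exports. A minimal sketch in Go; the URL is an assumption (the 1.8-era insecure local port on a master), so adjust URL and auth for your setup:

```go
// watchmem.go (hypothetical helper): poll the apiserver /metrics endpoint
// and print its resident memory over time. The endpoint URL is an
// assumption; adjust it for your cluster.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
	"time"
)

func main() {
	for {
		resp, err := http.Get("http://127.0.0.1:8080/metrics")
		if err != nil {
			fmt.Println("fetch error:", err)
		} else {
			sc := bufio.NewScanner(resp.Body)
			for sc.Scan() {
				// process_resident_memory_bytes comes from the standard
				// Prometheus Go collector compiled into kube-apiserver.
				if strings.HasPrefix(sc.Text(), "process_resident_memory_bytes") {
					fmt.Println(time.Now().Format(time.RFC3339), sc.Text())
				}
			}
			resp.Body.Close()
		}
		time.Sleep(time.Minute)
	}
}
```

On an otherwise idle cluster this value should plateau; a steadily climbing series is the symptom described above.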

Environment:

kops manifest

```yaml
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: 2017-10-16T09:11:48Z
  name: [REDACTED]
spec:
  additionalPolicies:
    node: '[{"Action":["sts:AssumeRole"],"Effect":"Allow","Resource":"*"}]'
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudLabels:
    environment: testing
    owner: unknown
  cloudProvider: aws
  configBase: [REDACTED]
  dnsZone: [REDACTED]
  etcdClusters:
  - enableEtcdTLS: true
    etcdMembers:
    - instanceGroup: master-eu-west-1a
      name: a
    - instanceGroup: master-eu-west-1b
      name: b
    - instanceGroup: master-eu-west-1c
      name: c
    name: main
    version: 3.1.10
  - enableEtcdTLS: true
    etcdMembers:
    - instanceGroup: master-eu-west-1a
      name: a
    - instanceGroup: master-eu-west-1b
      name: b
    - instanceGroup: master-eu-west-1c
      name: c
    name: events
    version: 3.1.10
  iam:
    legacy: false
  kubeAPIServer:
    auditLogMaxAge: 10
    auditLogMaxBackups: 1
    auditLogMaxSize: 100
    auditLogPath: /var/log/kube-apiserver-audit.log
  kubelet:
    featureGates:
      ExperimentalCriticalPodAnnotation: "true"
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.8.1
  masterInternalName: [REDACTED]
  masterPublicName: [REDACTED]
  networkCIDR: 172.20.0.0/16
  networking:
    canal: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 172.20.32.0/19
    name: eu-west-1a
    type: Private
    zone: eu-west-1a
  - cidr: 172.20.64.0/19
    name: eu-west-1b
    type: Private
    zone: eu-west-1b
  - cidr: 172.20.96.0/19
    name: eu-west-1c
    type: Private
    zone: eu-west-1c
  - cidr: 172.20.0.0/22
    name: utility-eu-west-1a
    type: Utility
    zone: eu-west-1a
  - cidr: 172.20.4.0/22
    name: utility-eu-west-1b
    type: Utility
    zone: eu-west-1b
  - cidr: 172.20.8.0/22
    name: utility-eu-west-1c
    type: Utility
    zone: eu-west-1c
  topology:
    bastion:
      bastionPublicName: [REDACTED]
    dns:
      type: Public
    masters: private
    nodes: private
```

nikhita commented 7 years ago

Maybe related to https://github.com/kubernetes/kubernetes/pull/50690?

cc @sttts

sttts commented 7 years ago

This is probably a duplicate of https://github.com/kubernetes/kubernetes/issues/53485, fixed in https://github.com/kubernetes/kubernetes/pull/53586. @ese, do you have a chance to check whether https://github.com/kubernetes/kubernetes/pull/53586 helps?
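
For anyone verifying a patched build, comparing heap profiles before and after is the most direct check: kube-apiserver serves pprof data when --profiling is enabled (the default). A minimal sketch, again assuming the insecure local port; the file name is just illustrative:

```go
// heapdump.go (hypothetical helper): save a heap profile from the apiserver
// so before/after allocations can be compared with `go tool pprof`. Assumes
// --profiling is enabled (the default) and the insecure local port is used.
package main

import (
	"io"
	"net/http"
	"os"
)

func main() {
	resp, err := http.Get("http://127.0.0.1:8080/debug/pprof/heap")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, err := os.Create("heap.pprof")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	if _, err := io.Copy(out, resp.Body); err != nil {
		panic(err)
	}
}
```

Taking one profile on stock 1.8.1 and another a few hours later, then the same pair on a patched build, should make it clear whether the growing allocations are gone; `go tool pprof heap.pprof` lists the top consumers.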

sttts commented 7 years ago

Created a cherry-pick for 1.8: #54225

mbohlool commented 7 years ago

cc @jpbetz

luxas commented 7 years ago

Duplicate of https://github.com/kubernetes/kubernetes/issues/53485

ese commented 7 years ago

@sttts I can confirm that #53586 resolves the problem. Thanks!