crossplane-contrib / provider-upjet-azure

Official Azure Provider for Crossplane by Upbound.
Apache License 2.0

cannot get a terraform workspace for resource: cannot init workspace: runtime: failed to create new OS thread (have 8 already; errno=11) #430

Closed mjnovice closed 3 months ago

mjnovice commented 1 year ago

I am using the provider-azure image: xpkg.upbound.io/upbound/provider-azure:v0.19.0

What happened?

After running the controller for about 15 days, with an average of 300 Crossplane resources at any point in time, I see this:

 ➜  cloudgen-operator git:(mj/refac-modules)  k describe dnscnamerecord.network.azure.upbound.io/wild-ci-aseks3723738                             
Name:         wild-ci-aseks3723738
Namespace:    
Labels:       <none>
Annotations:  crossplane.io/external-name: *.ci-aseks3723738
API Version:  network.azure.upbound.io/v1beta1
Kind:         DNSCNAMERecord
Metadata:
  Creation Timestamp:  2023-03-31T19:10:26Z
  Generation:          1
  Managed Fields:
    API Version:  network.azure.upbound.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:crossplane.io/external-name:
      f:spec:
        .:
        f:deletionPolicy:
        f:forProvider:
          .:
          f:record:
          f:resourceGroupName:
          f:tags:
            .:
            f:Owner:
            f:Project:
          f:ttl:
          f:zoneName:
        f:providerConfigRef:
          .:
          f:name:
    Manager:      manager
    Operation:    Update
    Time:         2023-03-31T19:10:26Z
    API Version:  network.azure.upbound.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:atProvider:
        f:conditions:
    Manager:         provider
    Operation:       Update
    Subresource:     status
    Time:            2023-03-31T20:35:36Z
  Resource Version:  143990341
  UID:               fac97e3b-d4ed-4a73-843f-cb2a08603e55
Spec:
  Deletion Policy:  Delete
  For Provider:
    Record:               ci-aseks3723738.devtest-sl-ea.infra.uipath-dev.com
    Resource Group Name:  sfdev-dns
    Tags:
      Owner:    mayank.jha@uipath.com
      Project:  Service Fabric
    Ttl:        300
    Zone Name:  devtest-sl-ea.infra.uipath-dev.com
  Provider Config Ref:
    Name:  azcp-devtestslea
Status:
  At Provider:
  Conditions:
    Last Transition Time:  2023-03-31T20:35:36Z
    Message:               connect failed: cannot get a terraform workspace for resource: cannot init workspace: runtime: failed to create new OS thread (have 8 already; errno=11)
runtime: may need to increase max user processes (ulimit -u)
fatal error: newosproc

runtime stack:
runtime.throw({0x26b97aa?, 0xc00061be38?})
  /usr/local/go/src/runtime/panic.go:992 +0x71
runtime.newosproc(0xc0006c0000)
  /usr/local/go/src/runtime/os_linux.go:182 +0x185
runtime.newm1(0xc0006c0000)
  /usr/local/go/src/runtime/proc.go:2147 +0xcf
runtime.newm(0x0?, 0xc000069900, 0x0?)
  /usr/local/go/src/runtime/proc.go:2122 +0x12c
runtime.startm(0x0, 0x1)
  /usr/local/go/src/runtime/proc.go:2305 +0xcf
runtime.wakep()
  /usr/local/go/src/runtime/proc.go:2404 +0x5a
runtime.resetspinning()
  /usr/local/go/src/runtime/proc.go:3036 +0x45
runtime.schedule()
  /usr/local/go/src/runtime/proc.go:3194 +0x25e
runtime.mstart1()
  /usr/local/go/src/runtime/proc.go:1425 +0xcd
runtime.mstart0()
  /usr/local/go/src/runtime/proc.go:1376 +0x79
runtime.mstart()
  /usr/local/go/src/runtime/asm_amd64.s:367 +0x5

goroutine 1 [runnable, locked to thread]:
github.com/aws/aws-sdk-go/aws/endpoints.init()
  /home/circleci/go/pkg/mod/github.com/aws/aws-sdk-go@v1.42.35/aws/endpoints/defaults.go:27844 +0xc1445

goroutine 18 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc000220380)
  /home/circleci/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:276 +0xad
created by go.opencensus.io/stats/view.init.0
  /home/circleci/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:34 +0x8d
: exit status 2
    Reason:  ReconcileError
    Status:  False
    Type:    Synced
Events:
  Type     Reason                   Age                  From                                                           Message
  ----     ------                   ----                 ----                                                           -------
  Warning  CannotConnectToProvider  2s (x4533 over 84m)  managed/network.azure.upbound.io/v1beta1, kind=dnscnamerecord  (combined from similar events): cannot get a terraform workspace for resource: cannot init workspace: runtime: failed to create new OS thread (have 8 already; errno=11)
runtime: may need to increase max user processes (ulimit -u)
fatal error: newosproc

runtime stack:
runtime.throw({0x26b97aa?, 0xc00010de38?})
  /usr/local/go/src/runtime/panic.go:992 +0x71
runtime.newosproc(0xc000680000)
  /usr/local/go/src/runtime/os_linux.go:182 +0x185
runtime.newm1(0xc000680000)
  /usr/local/go/src/runtime/proc.go:2147 +0xcf
runtime.newm(0x438a65?, 0xc000069900, 0x0?)
  /usr/local/go/src/runtime/proc.go:2122 +0x12c
runtime.startm(0x0, 0x1)
  /usr/local/go/src/runtime/proc.go:2305 +0xcf
runtime.wakep()
  /usr/local/go/src/runtime/proc.go:2404 +0x5a
runtime.resetspinning()
  /usr/local/go/src/runtime/proc.go:3036 +0x45
runtime.schedule()
  /usr/local/go/src/runtime/proc.go:3194 +0x25e
runtime.mstart1()
  /usr/local/go/src/runtime/proc.go:1425 +0xcd
runtime.mstart0()
  /usr/local/go/src/runtime/proc.go:1376 +0x79
runtime.mstart()
  /usr/local/go/src/runtime/asm_amd64.s:367 +0x5

goroutine 1 [runnable, locked to thread]:
regexp.onePassCopy(0xc00051a270)
  /usr/local/go/src/regexp/onepass.go:223 +0x5a
regexp.compileOnePass(0xc00051a270)
  /usr/local/go/src/regexp/onepass.go:498 +0x14e
regexp.compile({0x26d6eaf, 0x14}, 0xdb30?, 0x0)
  /usr/local/go/src/regexp/regexp.go:191 +0x98
regexp.Compile(...)
  /usr/local/go/src/regexp/regexp.go:135
github.com/aws/aws-sdk-go/aws/endpoints.glob..func5()
  /home/circleci/go/pkg/mod/github.com/aws/aws-sdk-go@v1.42.35/aws/endpoints/defaults.go:28553 +0x2c
github.com/aws/aws-sdk-go/aws/endpoints.init()
  /home/circleci/go/pkg/mod/github.com/aws/aws-sdk-go@v1.42.35/aws/endpoints/defaults.go:28555 +0xc1e98

goroutine 18 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc0001a0380)
  /home/circleci/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:276 +0xad
created by go.opencensus.io/stats/view.init.0
  /home/circleci/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:34 +0x8d
: exit status 2
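
On Linux, errno 11 is EAGAIN, and the Go runtime prints the `ulimit -u` hint when it cannot create a new OS thread. One possible cause (an assumption, not confirmed anywhere in this thread) is that the per-user process limit for the provider container has been reached; a cgroup PID limit on the pod is another candidate that this sketch does not check. The following is a minimal diagnostic sketch that reads the "Max processes" row from `/proc/self/limits`. The helper name `parseMaxProcesses` is hypothetical and is not part of the provider.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// parseMaxProcesses (hypothetical helper) extracts the soft and hard
// "Max processes" values from the text of a /proc/<pid>/limits file.
// Values are returned as strings because they may be "unlimited".
func parseMaxProcesses(limits string) (soft, hard string, err error) {
	sc := bufio.NewScanner(strings.NewReader(limits))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "Max processes") {
			continue
		}
		// Remaining columns: soft limit, hard limit, units.
		fields := strings.Fields(strings.TrimPrefix(line, "Max processes"))
		if len(fields) < 2 {
			return "", "", fmt.Errorf("unexpected limits line: %q", line)
		}
		return fields[0], fields[1], nil
	}
	return "", "", fmt.Errorf("no \"Max processes\" line found")
}

func main() {
	// /proc/self/limits exists only on Linux; report and return elsewhere.
	data, err := os.ReadFile("/proc/self/limits")
	if err != nil {
		fmt.Println("cannot read /proc/self/limits:", err)
		return
	}
	soft, hard, err := parseMaxProcesses(string(data))
	if err != nil {
		fmt.Println("cannot parse limits:", err)
		return
	}
	fmt.Printf("max processes: soft=%s hard=%s\n", soft, hard)
}
```

Run inside the provider container, this reports the limit that applies to the process itself. Comparing it with the number of live processes and threads owned by the same user would show whether the limit is plausibly being hit.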

How can we reproduce it?

Run a Crossplane cluster with about 300 Azure resources and 40-50 creates/deletes every hour, for around 15 days.

What environment did it happen in?

jeanduplessis commented 1 year ago

@mjnovice, the provider version is quite old; the latest version is 0.30.0. Additionally, the latest provider contains improvements in managing the Terraform processes and resource utilization. I would recommend upgrading to the latest version to see if this is still a problem.

github-actions[bot] commented 3 months ago

This provider repo does not have enough maintainers to address every issue. Since there has been no activity in the last 90 days it is now marked as stale. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

github-actions[bot] commented 3 months ago

This issue is being closed since there has been no activity for 14 days since marking it as stale. If you still need help, feel free to comment or reopen the issue!