confidential-containers / cloud-api-adaptor

Ability to create Kata pods using cloud provider APIs aka the peer-pods approach
Apache License 2.0
47 stars 79 forks

Prevent creation of multiple unnecessary pod VMs on failure #1842

Closed kartikjoshi21 closed 2 months ago

kartikjoshi21 commented 4 months ago

The CreateVM API call currently retries continuously when VM creation fails. This leaves multiple unintended VMs in the resource group, even when a retry cannot succeed, for example when the image is unavailable. The retries continue until the available network addresses are exhausted, causing unnecessary resource creation and potential network conflicts. This issue is opened to discuss implementing a backoff mechanism to prevent this behavior and make VM creation more reliable.

kartikjoshi21 commented 4 months ago

cc: @mkulke @surajssd @bpradipt

mkulke commented 4 months ago

The busy loop in CAA for "create vm" should include an exponential backoff mechanism. Beyond that we don't need any heuristics, IMO. This should apply to all cloud providers: busy loops calling CSP APIs might trigger rate limits and impact a user's account in unfortunate ways.