flow-hydraulics / flow-wallet-api

Service for custodial wallets on Flow blockchain. This repository is currently not maintained.
https://flow-hydraulics.github.io/flow-wallet-api
Apache License 2.0
47 stars, 36 forks

Error creating accounts when cloud kms returns PENDING_GENERATION #267

Open seitau opened 2 years ago

seitau commented 2 years ago

I was trying to create many accounts asynchronously, and once the load reached a certain level I saw the following error from the jobs.

cloudkms: failed to fetch public key from KMS API: rpc error: code = FailedPrecondition desc = projects/.../locations/global/keyRings/.../cryptoKeys/flow-wallet-account-key-dde4689c-9ce1-4e96-b4ab-f5e227b1d622/cryptoKeyVersions/1 is not enabled, current state is: PENDING_GENERATION.error details: name = PreconditionFailure type = KEY_PENDING_GENERATION subj = projects/.../locations/global/keyRings/.../cryptoKeys/flow-wallet-account-key-dde4689c-9ce1-4e96-b4ab-f5e227b1d622/cryptoKeyVersions/1 desc =

This seems to be caused by the latency of Cloud KMS when it generates many asymmetric keys. The current key creation logic does not handle this error, so each retry of a failed job tries to generate a new key in KMS, which further increases the load on KMS.
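One way to handle the error without adding load would be to wait for the already-created key version to leave PENDING_GENERATION instead of creating another key. Below is a minimal sketch of that idea, using only the Go standard library. `KeyState`, `waitUntilEnabled`, and the `getState` callback are hypothetical names for illustration; they are not part of flow-wallet-api or the Cloud KMS client, and `getState` stands in for whatever call looks up the key version's state.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// KeyState is a hypothetical stand-in for the key version state
// reported by KMS (not the real client type).
type KeyState string

const (
	StatePendingGeneration KeyState = "PENDING_GENERATION"
	StateEnabled           KeyState = "ENABLED"
)

// ErrStillPending is returned when the key never became enabled
// within the allowed number of attempts.
var ErrStillPending = errors.New("key version still pending generation")

// waitUntilEnabled polls getState until it reports StateEnabled.
// It sleeps between attempts, doubling the delay each time, and
// gives up after maxAttempts. It never creates a new key.
func waitUntilEnabled(getState func() (KeyState, error), maxAttempts int, baseDelay time.Duration) error {
	delay := baseDelay
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		state, err := getState()
		if err != nil {
			return err
		}
		if state == StateEnabled {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return ErrStillPending
}

func main() {
	calls := 0
	// Fake lookup: pending on the first two calls, enabled on the third.
	getState := func() (KeyState, error) {
		calls++
		if calls < 3 {
			return StatePendingGeneration, nil
		}
		return StateEnabled, nil
	}

	err := waitUntilEnabled(getState, 5, 10*time.Millisecond)
	fmt.Println(calls, err) // prints "3 <nil>"
}
```

With this shape, a retried job would reuse the key it already requested and only poll its state, so retries would not trigger more key generation.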

nanuuki commented 2 years ago

Thanks for reporting this @seita-uc! Have you observed this with AWS KMS too, or just Google KMS?

seitau commented 2 years ago

@nanuuki I only use Cloud KMS, so I'm not sure about AWS KMS.

seitau commented 2 years ago

According to the documentation:

Due to the CPU cost of generating key material, creation of an asymmetric signing or asymmetric encryption key version may take a few minutes. https://cloud.google.com/kms/docs/faq#pending_generation

nanuuki commented 2 years ago

@seita-uc I managed to reproduce this. I'll let you know once a fix has been applied :)

seitau commented 2 years ago

@nanuuki FYI, I have been measuring the response time of key creation and the time it takes for a key to become enabled. I found that keeping the request rate at 1 rps solves the problem. If the CreateCryptoKey request rate to Cloud KMS stays above 1 rps, the Cloud KMS client eventually returns a timeout (default 60s) and the wait for keys to become enabled gets longer.
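The 1 rps cap described above could be enforced on the caller side by spacing out the key creation calls. This is a rough sketch using only the Go standard library. `createThrottled` and the `create` callback are hypothetical; `create` stands in for whatever function issues the CreateCryptoKey request.

```go
package main

import (
	"fmt"
	"time"
)

// createThrottled calls create once per id, in order, and makes sure
// consecutive calls start at least interval apart. It stops at the
// first error.
func createThrottled(ids []string, interval time.Duration, create func(id string) error) error {
	var last time.Time // start time of the previous call
	for _, id := range ids {
		if !last.IsZero() {
			if wait := interval - time.Since(last); wait > 0 {
				time.Sleep(wait)
			}
		}
		last = time.Now()
		if err := create(id); err != nil {
			return fmt.Errorf("create key %q: %w", id, err)
		}
	}
	return nil
}

func main() {
	start := time.Now()
	// The interval is shortened for the demo; time.Second would give 1 rps.
	err := createThrottled([]string{"a", "b", "c"}, 50*time.Millisecond, func(id string) error {
		fmt.Printf("creating key for %s at +%v\n", id, time.Since(start).Round(time.Millisecond))
		return nil
	})
	fmt.Println("error:", err)
}
```

This only limits a single process. If several workers create keys concurrently, they would need to share one limiter to keep the combined rate at or below 1 rps.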

seitau commented 2 years ago

Update: I got better response times after changing the Cloud KMS key ring's location from global to asia. This does not fix the root cause, but using a closer KMS location instead of global helps avoid the error.
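For reference, the location is part of the key ring's resource name, as seen in the error message above (`projects/.../locations/global/keyRings/...`). The sketch below only builds that name to show where the location goes. `keyRingName`, `my-project`, and `my-ring` are hypothetical placeholders, and how flow-wallet-api itself is configured with a location is not shown here.

```go
package main

import "fmt"

// keyRingName builds a Cloud KMS key ring resource name in the
// projects/<project>/locations/<location>/keyRings/<keyRing> form.
func keyRingName(project, location, keyRing string) string {
	return fmt.Sprintf("projects/%s/locations/%s/keyRings/%s", project, location, keyRing)
}

func main() {
	// Placeholder project and key ring names.
	fmt.Println(keyRingName("my-project", "global", "my-ring"))
	fmt.Println(keyRingName("my-project", "asia", "my-ring"))
}
```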