quadratic-funding / mpc-phase2-suite

The MPC suite of tools for conducting zkSNARK Phase 2 Trusted Setup ceremonies
MIT License

Cloud Functions deploy could partially fail #194

Open baumstern opened 1 year ago

baumstern commented 1 year ago

A deployment can end with only 1 of the 27 functions actually deployed. A post-deployment verification step would help catch this case.
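
A minimal sketch of what such a verification could look like, assuming the expected function names are known from the codebase and the deployed names are collected separately (for example from the Firebase CLI or the GCP console; that step is left out here). `verifyDeployment` and `DeploymentCheck` are hypothetical names, not part of this repository.

```typescript
// Result of comparing the functions we expect with the ones actually deployed.
interface DeploymentCheck {
  missing: string[];    // expected but not deployed
  unexpected: string[]; // deployed but not expected
  ok: boolean;          // true when nothing expected is missing
}

// Hypothetical helper: both lists are plain function names such as "startCeremony".
function verifyDeployment(
  expected: readonly string[],
  deployed: readonly string[]
): DeploymentCheck {
  const deployedSet = new Set(deployed);
  const expectedSet = new Set(expected);
  const missing = expected.filter((name) => !deployedSet.has(name));
  const unexpected = deployed.filter((name) => !expectedSet.has(name));
  return { missing, unexpected, ok: missing.length === 0 };
}
```

A deploy script could fail loudly whenever `ok` is false and print `missing`, rather than relying on the CLI output alone.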

Reproduce

$ firebase deploy --only functions

=== Deploying to 'redacted'...

i  deploying functions
Running command: yarn --prefix "$RESOURCE_DIR" build
yarn run v1.22.19
$ tsc
✨  Done in 2.08s.
✔  functions: Finished running predeploy script.
i  functions: ensuring required API cloudfunctions.googleapis.com is enabled...
i  functions: ensuring required API cloudbuild.googleapis.com is enabled...
i  artifactregistry: ensuring required API artifactregistry.googleapis.com is enabled...
⚠  artifactregistry: missing required API artifactregistry.googleapis.com. Enabling now...
⚠  functions: missing required API cloudbuild.googleapis.com. Enabling now...
⚠  functions: missing required API cloudfunctions.googleapis.com. Enabling now...
✔  functions: required API cloudfunctions.googleapis.com is enabled
✔  artifactregistry: required API artifactregistry.googleapis.com is enabled
✔  functions: required API cloudbuild.googleapis.com is enabled
i  functions: preparing codebase default for deployment
i  functions: Loaded environment variables from .env.
i  functions: preparing . directory for uploading...
i  functions: packaged /mpc-phase2-suite/firebase (103.99 KB) for uploading
i  functions: packaged /mpc-phase2-suite/firebase (104.2 KB) for uploading
i  functions: ensuring required API cloudscheduler.googleapis.com is enabled...
⚠  functions: missing required API cloudscheduler.googleapis.com. Enabling now...
✔  functions: required API cloudscheduler.googleapis.com is enabled
i  functions: ensuring required API run.googleapis.com is enabled...
i  functions: ensuring required API eventarc.googleapis.com is enabled...
i  functions: ensuring required API pubsub.googleapis.com is enabled...
i  functions: ensuring required API storage.googleapis.com is enabled...
⚠  functions: missing required API eventarc.googleapis.com. Enabling now...
⚠  functions: missing required API run.googleapis.com. Enabling now...
✔  functions: required API pubsub.googleapis.com is enabled
✔  functions: required API storage.googleapis.com is enabled
✔  functions: required API eventarc.googleapis.com is enabled
✔  functions: required API run.googleapis.com is enabled
i  functions: generating the service identity for pubsub.googleapis.com...
i  functions: generating the service identity for eventarc.googleapis.com...
✔  functions: . folder uploaded successfully
i  functions: creating Node.js 16 function checkAndPrepareCoordinatorForFinalization(us-central1)...
i  functions: creating Node.js 16 function checkAndRemoveBlockingContributor(us-central1)...
i  functions: creating Node.js 16 function checkIfObjectExist(us-central1)...
i  functions: creating Node.js 16 function checkParticipantForCeremony(us-central1)...
i  functions: creating Node.js 16 function completeMultiPartUpload(us-central1)...
i  functions: creating Node.js 16 function coordinateContributors(us-central1)...
i  functions: creating Node.js 16 function createBucket(us-central1)...
i  functions: creating Node.js 16 function finalizeCeremony(us-central1)...
i  functions: creating Node.js 16 function finalizeLastContribution(us-central1)...
i  functions: creating Node.js 16 function generateGetObjectPreSignedUrl(us-central1)...
i  functions: creating Node.js 16 function generatePreSignedUrlsParts(us-central1)...
i  functions: creating Node.js 16 function initEmptyWaitingQueueForCircuit(us-central1)...
i  functions: creating Node.js 16 function makeProgressToNextContribution(us-central1)...
i  functions: creating Node.js 16 function permanentlyStoreCurrentContributionTimeAndHash(us-central1)...
i  functions: creating Node.js 16 function processSignUpWithCustomClaims(us-central1)...
i  functions: creating Node.js 16 function progressToNextContributionStep(us-central1)...
i  functions: creating Node.js 16 function refreshParticipantAfterContributionVerification(us-central1)...
i  functions: creating Node.js 16 function registerAuthUser(us-central1)...
i  functions: creating Node.js 16 function resumeContributionAfterTimeoutExpiration(us-central1)...
i  functions: creating Node.js 16 function setupCeremony(us-central1)...
i  functions: creating Node.js 16 function startCeremony(us-central1)...
i  functions: creating Node.js 16 function startMultiPartUpload(us-central1)...
i  functions: creating Node.js 16 function stopCeremony(us-central1)...
i  functions: creating Node.js 16 function temporaryStoreCurrentContributionComputationTime(us-central1)...
i  functions: creating Node.js 16 function temporaryStoreCurrentContributionMultiPartUploadId(us-central1)...
i  functions: creating Node.js 16 function temporaryStoreCurrentContributionUploadedChunkData(us-central1)...
i  functions: creating Node.js 16 function verifycontribution(us-central1)...
⚠  functions: failed to create function projects/redacted/locations/us-central1/functions/checkAndPrepareCoordinatorForFinalization
Failed to create function projects/redacted/locations/us-central1/functions/checkAndPrepareCoordinatorForFinalization
✔  functions[verifycontribution(us-central1)] Successful create operation.

0xjei commented 1 year ago

Do you happen to have any more information on why the deployment failed? BTW, yes it would be great to be able to have a sanity check on the deployment.

baumstern commented 1 year ago

> Do you happen to have any more information on why the deployment failed? BTW, yes it would be great to be able to have a sanity check on the deployment.

A GCP API rate limit is a possible cause. The Firebase docs recommend deploying functions in groups of 10 or fewer:

> When deploying large numbers of functions, you may exceed the standard quota and receive HTTP 429 or 500 error messages. To solve this, deploy functions in groups of 10 or fewer.

https://firebase.google.com/docs/functions/manage-functions
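
The batching recommended by the docs could be scripted roughly like this. It is only a sketch: `chunk` and `deployCommands` are hypothetical helpers, and it relies on the Firebase CLI's `--only functions:<name>` filter, which accepts a comma-separated list of functions.

```typescript
// Split a list into batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build one `firebase deploy` command per batch of function names.
function deployCommands(
  functionNames: readonly string[],
  batchSize: number = 10
): string[] {
  return chunk(functionNames, batchSize).map(
    (batch) =>
      `firebase deploy --only ${batch.map((name) => `functions:${name}`).join(",")}`
  );
}
```

For the 27 functions in this project, this would produce three commands (10, 10, and 7 functions) to run one after another.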

baumstern commented 1 year ago

I investigated further, and it turns out the deployment failed because of a GCP-side bug. A permission required to deploy Cloud Functions had not been provisioned in time:

"Unable to retrieve the repository metadata for projects/redacted/locations/us-central1/repositories/gcf-artifacts. Ensure that the Cloud Functions service account has 'artifactregistry.repositories.list' and 'artifactregistry.repositories.get' permissions. You can add the permissions by granting the role 'roles/artifactregistry.reader'."

Re-running firebase deploy --only functions passed the permission check, and the deployment succeeded. I filed an issue about this problem in the Firebase tools repo: https://github.com/firebase/firebase-tools/issues/5244
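
Since a plain retry was enough here, a deploy script could automate it. The sketch below is an assumption about how that might look, not something the repository does: `retryUntilSuccess` is a hypothetical helper, `attempt` stands in for whatever runs the deploy and reports success, and `wait` stands in for a pause that gives the permission grant time to propagate.

```typescript
// Run `attempt` up to `maxAttempts` times, calling `wait` after each failure
// except the last. Returns the 1-based number of the attempt that succeeded.
function retryUntilSuccess(
  attempt: () => boolean,
  maxAttempts: number,
  wait: (failedAttempts: number) => void
): number {
  for (let n = 1; n <= maxAttempts; n++) {
    if (attempt()) {
      return n;
    }
    if (n < maxAttempts) {
      wait(n); // no point waiting after the final failure
    }
  }
  throw new Error(`still failing after ${maxAttempts} attempts`);
}
```

Passing the number of failed attempts to `wait` lets the caller lengthen the pause on each retry if needed.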

0xjei commented 1 year ago

> A GCP API rate limit is a possible cause. The Firebase docs recommend deploying functions in groups of 10 or fewer:

I really like this one. Actually, we are "grouping" related functions in the same file, but I wasn't aware of this issue or of this grouping feature.

> I investigated further, and it turns out the deployment failed because of a GCP-side bug. A permission required to deploy Cloud Functions had not been provisioned in time:

Okay, got it. This is evident from the initial deployment log: it enables the services, but not in time. Thank you for opening an issue on that. I'll keep an eye on it :)

0xjei commented 1 year ago

Did we manage to resolve the issue? @gurrpi I noticed that the Firebase team closed the issue you had opened.

https://github.com/firebase/firebase-tools/issues/5244

baumstern commented 1 year ago

> Did we manage to resolve the issue? @gurrpi I noticed that the Firebase team closed the issue you had opened.
>
> firebase/firebase-tools#5244

I think it is possible that it could still happen. What do you think would be the best way to resolve this issue?

0xjei commented 1 year ago

> > Did we manage to resolve the issue? @gurrpi I noticed that the Firebase team closed the issue you had opened. firebase/firebase-tools#5244
>
> I think it is possible that it could still happen. What do you think would be the best way to resolve this issue?

From what the Firebase team says, the main problem seems to be that "it takes time for GCP IAM permissions grants to propagate." In that case, I don't think there is a real fix on our side other than documenting this limitation in the Coordinator Guide and asking coordinators to retry the deployment if it occurs.

An alternative would be to investigate whether these services can be enabled before deployment, so that the permissions have already propagated by the time we run it. It is a minor problem right now, so let's leave this issue open and iterate on it later.
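
A rough sketch of the "enable first" idea, assuming the gcloud CLI is available to the coordinator. `REQUIRED_APIS` is taken from the APIs that the deployment log above reports as missing or required, and `enableServicesCommand` is a hypothetical helper. Whether enabling the APIs ahead of time is enough for the IAM grants to propagate is exactly what would need to be investigated.

```typescript
// APIs that the deploy log shows firebase-tools checking or enabling.
const REQUIRED_APIS: readonly string[] = [
  "cloudfunctions.googleapis.com",
  "cloudbuild.googleapis.com",
  "artifactregistry.googleapis.com",
  "cloudscheduler.googleapis.com",
  "run.googleapis.com",
  "eventarc.googleapis.com",
  "pubsub.googleapis.com",
  "storage.googleapis.com",
];

// Build a single `gcloud services enable` command for the given project.
function enableServicesCommand(
  projectId: string,
  apis: readonly string[] = REQUIRED_APIS
): string {
  if (projectId.trim() === "" || apis.length === 0) {
    throw new Error("a project id and at least one API are required");
  }
  return `gcloud services enable ${apis.join(" ")} --project ${projectId}`;
}
```

The coordinator would run the resulting command some time before firebase deploy --only functions, and still retry the deploy if the permission error appears.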

@gurrpi Any further thoughts on this?