firebase / firebase-tools

The Firebase Command Line Tools
MIT License

Error Adjusting Traffic to Previous Cloud Run Revisions Deployed via Firebase CLI (Cloud Functions) #6759

Open maylorsan opened 4 months ago

maylorsan commented 4 months ago

Environment info

firebase-tools: 13.2.1

Platform: macOS 14.0 (23A344)

Test case

A Cloud Run service deployed via Cloud Functions, where deployments are triggered through the Firebase CLI. The service experiences failures when attempting to route traffic back to previous revisions, receiving errors related to fetching container image metadata from Google Artifact Registry.

Steps to reproduce

  1. Deploy Cloud Functions that trigger updates to a Cloud Run service using the Firebase CLI with the following command:
    firebase deploy
  2. Confirm the deployment is successful and the new revision is serving traffic.
  3. Attempt to adjust traffic back to a previous revision using either the Google Cloud Console or the gcloud CLI:
    gcloud run services update-traffic "SERVICE_NAME" --region "europe-central2" --to-revisions "PREVIOUS_REVISION=100"
  4. Observe the error message indicating the failure due to the inability to fetch metadata for the container image from the Artifact Registry.

Expected behavior

It should be possible to adjust traffic to any revision of the Cloud Run service deployed via Firebase CLI without encountering errors related to container image imports. Adjusting traffic should be seamless across revisions, allowing for flexible traffic management and rollback capabilities.

Actual behavior

When trying to adjust traffic to a previous revision of a Cloud Run service deployed via Firebase CLI, the operation fails with an error related to fetching metadata for the container image from Google Artifact Registry. The error specifically states that the container image cannot be imported because the metadata cannot be fetched, implying that the image or its manifest could not be found.

Error Message: Container import failed: APPLICATION_ERROR;riptide/Riptide.PullPod;failed to fetch metadata: generic::not_found: failed to fetch metadata from the registry for image "europe-central2-docker.pkg.dev/PROJECT_ID/gcf-artifacts/firebase_cli_function@sha256:1452c3ab09cf77841c40711356672cb3425409b7b95e2d46a850554e4f1ef28f", with reason: generic::not_found: fetchImageMetadata from europe-central2-docker.pkg.dev failed for image europe-central2-docker.pkg.dev/PROJECT_ID/gcf-artifacts/firebase_cli_function@sha256:1452c3ab09cf77841c40711356672cb3425409b7b95e2d46a850554e4f1ef28f, reason: generic::not_found: failed to fetch manifest: generic::not_found: failed to fetch manifest "PROJECT_ID/gcf-artifacts/firebase_cli_function/manifests/sha256:1452c3ab09cf77841c40711356672cb3425409b7b95e2d46a850554e4f1ef28f", error: generic::not_found: got HTTP/404 response for URI https://europe-central2-docker.pkg.dev/v2/PROJECT_ID/gcf-artifacts/firebase_cli_function/manifests/sha256:1452c3ab09cf77841c40711356672cb3425409b7b95e2d46a850554e4f1ef28f: (allowRedirect=false);AppErrorCode=5;StartTimeMs=1707486177875;unknown;ResFormat=uncompressed;ServerTimeSec=0.016901819;LogBytes=256;Non-FailFast;EffSecLevel=none;ReqFormat=uncompressed;ReqID=8f9f4ed05985aa9c;GlobalID=0;Server=[2002:a17:93e:349:b0:41:5547:163f]:4001
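The URI that returned HTTP/404 can be rebuilt from the image reference alone, which makes it easier to check the same manifest for other revisions. This is a minimal sketch: the function name `manifest_url` is hypothetical, and it assumes the reference has the `HOST/PATH@sha256:DIGEST` form shown in the error above.

```shell
# Hypothetical helper (not part of firebase-tools or gcloud):
# convert HOST/PATH@DIGEST into the registry's v2 manifest URL.
manifest_url() {
  local image="$1"
  local host="${image%%/*}"    # registry host, e.g. europe-central2-docker.pkg.dev
  local rest="${image#*/}"     # PROJECT_ID/gcf-artifacts/NAME@sha256:...
  local repo="${rest%@*}"      # everything before the "@"
  local digest="${rest##*@}"   # sha256:... after the "@"
  printf 'https://%s/v2/%s/manifests/%s\n' "$host" "$repo" "$digest"
}

# Example with the image reference from the error message (digest shortened here).
manifest_url "europe-central2-docker.pkg.dev/PROJECT_ID/gcf-artifacts/firebase_cli_function@sha256:1452c3ab"
```

Requesting the resulting URL with an authenticated client should show whether the manifest is really gone from the registry, independent of Cloud Run.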

This happens consistently for every Cloud Run service whose revisions are created by Cloud Functions deployments through the Firebase CLI. It does not happen for services deployed directly with the gcloud CLI: deploying similar Cloud Run services with gcloud and shifting traffic between revisions works as expected, with no errors. This suggests that the underlying Cloud Run and Artifact Registry configuration is correct and that the problem is specific to Firebase CLI deployments, which do not appear to keep the container images for previous revisions available in Artifact Registry.

maylorsan commented 4 months ago

Upon further investigation, I've noticed a potentially significant difference in how firebase deploy and gcloud deploy manage Docker container images in the Artifact Registry. When deploying Cloud Functions via firebase deploy, the associated Docker container image appears to be removed from the Artifact Registry after the deployment completes. In contrast, deployments carried out using gcloud deploy do not exhibit this behavior; the Docker container images remain intact in the Artifact Registry after deployment.
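One way to check this per revision is to look up the image a revision points at and ask Artifact Registry whether it still exists. The sketch below is illustrative only: the function name `revision_image_exists` is hypothetical, and it assumes that `spec.containers[0].image` on the revision holds the digest-qualified image reference and that the caller has read access to the repository.

```shell
# Hypothetical helper: report whether the image behind a Cloud Run revision
# is still present in Artifact Registry.
# Returns 0 if present, 1 if missing, 2 if the revision lookup itself fails.
revision_image_exists() {
  local revision="$1" region="$2" image
  # Assumption: this field path yields the image reference for the revision.
  image="$(gcloud run revisions describe "$revision" \
    --region "$region" \
    --format 'value(spec.containers[0].image)')" || return 2
  if gcloud artifacts docker images describe "$image" >/dev/null 2>&1; then
    echo "present: $image"
  else
    echo "missing: $image"
    return 1
  fi
}
```

For example, `revision_image_exists "PREVIOUS_REVISION" "europe-central2"` would be expected to print a `missing:` line for a revision whose image was deleted after deployment.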