DataBiosphere / terra-notebook-utils

Utilities for the Terra notebook environment.
MIT License

Only enable requester pays when necessary #400

Open mbaumann-broad opened 1 year ago

mbaumann-broad commented 1 year ago

Objective

Only perform the operation to enable requester pays when it is currently necessary.
The only case where it is currently necessary is when accessing AnVIL data hosted by Gen3.
This case can be identified by DRS URIs starting with "drs://dg.ANV0".
In all other cases, enabling requester pays adds unnecessary delay and can cause spurious errors, for example when a Terra user has an external identity linked to Kids First.
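The check described above could be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the names `needs_requester_pays` and `maybe_enable_requester_pays` are hypothetical, and the `enable` callable stands in for whatever code currently performs the enable operation. Only the "drs://dg.ANV0" prefix comes from this issue, and the sketch assumes an exact, case-sensitive prefix match is sufficient.

```python
from typing import Callable

# Prefix taken from the issue text: DRS URIs for AnVIL data hosted by Gen3.
ANVIL_GEN3_DRS_PREFIX = "drs://dg.ANV0"


def needs_requester_pays(drs_uri: str) -> bool:
    """Return True only for DRS URIs that start with the AnVIL Gen3 prefix.

    Hypothetical helper; assumes a case-sensitive prefix match.
    """
    return drs_uri.startswith(ANVIL_GEN3_DRS_PREFIX)


def maybe_enable_requester_pays(drs_uri: str, enable: Callable[[], None]) -> bool:
    """Call `enable` only when the URI requires it.

    `enable` is a placeholder for the existing enable operation.
    Returns True if `enable` was called, False if it was skipped.
    """
    if needs_requester_pays(drs_uri):
        enable()
        return True
    return False
```

With a gate like this, URIs that do not match the prefix would skip the enable call entirely, avoiding both the added delay and the spurious errors described above.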

Background

Accessing DRS data hosted by Gen3 in requester pays buckets using Google cloud-native access APIs required that the Gen3 user client service account be granted the serviceusage.services.use permission on the Terra user's Google project.

This privileged operation is performed by the Rawls API enableRequesterPaysForLinkedServiceAccounts.

Originally, this was needed for Terra DRS access to Gen3-hosted data for both BioData Catalyst and AnVIL. Terra DRS access to Gen3-hosted BioData Catalyst data has since changed to using signed URLs, so enabling requester pays is no longer necessary for BioData Catalyst. Enabling requester pays has never been necessary for Terra DRS access to Gen3-hosted data for NCI CRDC and Kids First.

Once AnVIL data is fully hosted in Terra Data Repo (TDR), enabling requester pays will no longer be needed at all, and the operation should be disabled completely.

Acceptance Criteria