Current Behavior:
When TNU is run to copy or download DRS data in a project-per-workspace (PPW) workspace (any workspace created after 9/27/2021), the operation produces the warning shown below and often fails with the error that follows it:
drs.copy("drs://dg.4503/93f98458-e816-4e56-9bea-013dc6c0ea4b", ".")
2021-11-01 02:53:26::INFO Enabling requester pays for your workspace. This will only take a few seconds...
2021-11-01 02:53:26::WARNING Failed to init requester pays for workspace terra-f20dfb56/mbaumann_tmp_test_tnu_v0_8_2 20211031: Expected '204', got '404' for 'https://rawls.dsde-prod.broadinstitute.org/api/workspaces/terra-f20dfb56/mbaumann_tmp_test_tnu_v0_8_2%2020211031/enableRequesterPaysForLinkedServiceAccounts'. You will not be able to access DRS URIs that interact with requester pays buckets.
2021-11-01 02:53:27::ERROR copy failed: 'gs://nih-nhlbi-biodata-catalyst-tutorial-genome-data/GWAS/1kg-genotypes/gds_maf001/ALL.chr8.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.bi_maf001.vcf.bgz.gds' to '/home/jupyter/notebooks/mbaumann_tmp_test_tnu_v0_8_2 20211031/edit/ALL.chr8.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.bi_maf001.vcf.bgz.gds'
Traceback (most recent call last):
  File "/home/jupyter/notebooks/packages/terra_notebook_utils/blobstore/copy_client.py", line 69, in _do_copy
    _download(src_blob, dst_blob, indicator_type)
  File "/home/jupyter/notebooks/packages/terra_notebook_utils/blobstore/copy_client.py", line 27, in _download
    with Indicator.get(indicator_type, dst_blob.url, src_blob.size()) as indicator:
  File "/home/jupyter/notebooks/packages/terra_notebook_utils/blobstore/gs.py", line 162, in size
    return self._get_native_blob().size
  File "/home/jupyter/notebooks/packages/terra_notebook_utils/blobstore/gs.py", line 90, in _get_native_blob
    return _get_native_blob(self._gs_bucket, self.key, self.credentials, self.billing_project)
  File "/home/jupyter/notebooks/packages/terra_notebook_utils/blobstore/gs.py", line 48, in _get_native_blob
    blob = bucket.get_blob(key)
  File "/home/jupyter/notebooks/packages/google/cloud/storage/bucket.py", line 1214, in get_blob
    retry=retry,
  File "/home/jupyter/notebooks/packages/google/cloud/storage/_helpers.py", line 225, in reload
    retry=retry,
  File "/home/jupyter/notebooks/packages/google/cloud/storage/_http.py", line 78, in api_request
    return call()
  File "/home/jupyter/notebooks/packages/google/api_core/retry.py", line 291, in retry_wrapped_func
    on_error=on_error,
  File "/home/jupyter/notebooks/packages/google/api_core/retry.py", line 189, in retry_target
    return target()
  File "/home/jupyter/notebooks/packages/google/cloud/_http.py", line 484, in api_request
    raise exceptions.from_http_response(response)
google.api_core.exceptions.Forbidden: 403 GET https://storage.googleapis.com/storage/v1/b/nih-nhlbi-biodata-catalyst-tutorial-genome-data/o/GWAS%2F1kg-genotypes%2Fgds_maf001%2FALL.chr8.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.bi_maf001.vcf.bgz.gds?userProject=terra-f20dfb56&projection=noAcl&prettyPrint=false: jcjfvchhjamh0xv8f62efl4pn-989@dcpstage-210518.iam.gserviceaccount.com does not have serviceusage.services.use access to the Google Cloud project.
This happens with both the TNU DRS API and CLI.
The problem exists in TNU v0.8.2 and all prior versions.
Expected Behavior
TNU DRS operations should work in PPW workspaces just as they do in non-PPW workspaces (workspaces created before 9/27/2021).
Root Cause
The TNU DRS subcommands use the GOOGLE_PROJECT environment variable to determine the "workspace namespace", which is then used to call the Rawls method enableRequesterPaysForLinkedServiceAccounts. The Rawls workspace operations are written to take the Terra billing project as the workspace namespace. Before PPW, the workspace namespace and the GOOGLE_PROJECT value were the same; in PPW workspaces they differ. The error output above shows terra-f20dfb56 being passed to Rawls as the workspace namespace, whereas in this PPW workspace the Terra billing project name is anvil-stage-demo, and that is the value that should be passed to Rawls.
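To make the mismatch concrete, here is a minimal sketch that rebuilds the Rawls URL from the warning above with each candidate namespace. The helper name requester_pays_url is hypothetical and is not part of TNU; the host and path are copied from the log output, and the percent-encoding is an assumption that happens to reproduce the logged URL.

```python
from urllib.parse import quote

# Host taken from the warning in the log output above.
RAWLS_API = "https://rawls.dsde-prod.broadinstitute.org/api"


def requester_pays_url(namespace: str, workspace: str) -> str:
    """Hypothetical helper: build the enableRequesterPaysForLinkedServiceAccounts URL."""
    return (
        f"{RAWLS_API}/workspaces/{quote(namespace, safe='')}/{quote(workspace, safe='')}"
        "/enableRequesterPaysForLinkedServiceAccounts"
    )


workspace = "mbaumann_tmp_test_tnu_v0_8_2 20211031"

# Namespace taken from GOOGLE_PROJECT: this is the URL that returned 404.
failing_url = requester_pays_url("terra-f20dfb56", workspace)

# Namespace set to the Terra billing project: the URL Rawls expects.
expected_url = requester_pays_url("anvil-stage-demo", workspace)
```

The two URLs differ only in the namespace path segment, which is why the call returns 404 in PPW workspaces while continuing to work in older ones.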
Partial Workaround
For TNU CLI use, this can be worked around by using tnu config set-workspace-namespace to set the value to the name of the Terra billing project.
For TNU API use, I have not yet found a workaround in my tests to date, though I have not tried all possibilities. A review of the code may reveal a way to work around this from the API as well.
How to Fix
TNU should use the WORKSPACE_NAMESPACE environment variable instead of the GOOGLE_PROJECT environment variable to work properly with PPW workspaces.
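A rough sketch of what that lookup could look like is below. The function name resolve_workspace_namespace is hypothetical and this is not TNU's actual code; falling back to GOOGLE_PROJECT when WORKSPACE_NAMESPACE is unset is my assumption, intended to keep environments that only define GOOGLE_PROJECT working.

```python
import os
from typing import Mapping, Optional


def resolve_workspace_namespace(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Hypothetical helper: pick the namespace to pass to Rawls.

    Prefers WORKSPACE_NAMESPACE (the Terra billing project). The fallback to
    GOOGLE_PROJECT is an assumption, not something the issue prescribes.
    """
    env = os.environ if env is None else env
    return env.get("WORKSPACE_NAMESPACE") or env.get("GOOGLE_PROJECT")


# PPW workspace: the two variables differ, and the billing project wins.
ppw_env = {"WORKSPACE_NAMESPACE": "anvil-stage-demo", "GOOGLE_PROJECT": "terra-f20dfb56"}
ppw_namespace = resolve_workspace_namespace(ppw_env)

# Environment with only GOOGLE_PROJECT set: the fallback applies.
legacy_namespace = resolve_workspace_namespace({"GOOGLE_PROJECT": "terra-f20dfb56"})
```

Passing the environment in as a mapping keeps the lookup easy to test without modifying os.environ.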
How to Reproduce
Run a tnu drs copy operation in a PPW workspace.