Closed hannes-ucsc closed 2 years ago
@hannes-ucsc: "I pinged the TDR team on Slack 12/1/2021. Waiting for a response."
It is unclear whether this is planned and, if so, for when. Making this a blocker of DataBiosphere/Azul#3572.
Broad informed us that they have fixed the issue and deployed it to dev:
Confirmed that the element is present on dev, although it is pluralized as "dataProjects" instead of the expected "dataProject".
$ curl -s "https://jade.datarepo-dev.broadinstitute.org/api/repository/v1/snapshots?direction=asc&limit=10&offset=0&sort=created_date" -H "authorization: Bearer $auth_token" | jq '.items[].dataProjects'
"broad-jade-dev-data"
"broad-jade-dev-data"
"broad-jade-dev-data"
"broad-jade-dev-data"
"broad-jade-dev-data"
"broad-jade-dev-data"
"broad-jade-dev-data"
"broad-jade-dev-data"
"broad-jade-dev-data"
"broad-jade-dev-data"
From Nicolas Malfroy-Camine: "Changes just went live on Prod (data.terra.bio)".
https://data.terra.bio is used by Azul's prod2 instance, but the changes are not observable on https://jade-terra.datarepo-prod.broadinstitute.org, which is used by Azul's prod instance.
@hannes-ucsc: "Assuming that the Broad is not going to deploy this to TDR old prod, #3572 needs to wait until we switch to TDR new prod. We created #3782 and made it a blocker of #3572."
The response to a request to the `enumerateSnapshots` endpoint does not include the name of the Google project that hosts the BQ tables. The `retrieveSnapshot` response does (note the `dataProject` property).
Azul needs the Google project name to compose BQ queries against the tables in a snapshot. We also prefer to use the `enumerateSnapshots` endpoint to efficiently get information about multiple snapshots at once, but the lack of the Google project name in the `enumerateSnapshots` response forces us to also hit the `retrieveSnapshot` endpoint for each snapshot individually. So we currently need to make N + 1 requests (one enumeration plus one retrieval per snapshot) instead of one. This is aggravated by the fact that N is now large (>100) since we intend to create one snapshot per HCA project.
It seems that it should be relatively easy to add the `dataProject` property to the `enumerateSnapshots` response. Doing so would greatly reduce the latency of certain Azul requests, enhancing the overall user experience and reducing complexity in the Azul code base.
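The request-count argument can be illustrated with a minimal Python sketch. It is not Azul's actual code and makes no real HTTP calls: `FakeTDR`, `data_projects` and `count_requests` are hypothetical names, the `id` field is an assumption about the snapshot summary, and the sketch assumes a single enumeration request returns all snapshots (no paging). Only the `items` list and the `dataProject` property come from the discussion above.

```python
from typing import Dict, List

Snapshot = Dict[str, str]


class FakeTDR:
    """In-memory stand-in for the TDR API that counts requests (hypothetical)."""

    def __init__(self, snapshots: List[Snapshot], enumerate_has_project: bool):
        self.snapshots = snapshots
        self.enumerate_has_project = enumerate_has_project
        self.requests = 0

    def enumerate_snapshots(self) -> Dict[str, List[Snapshot]]:
        # One request returns a summary of every snapshot (paging is ignored).
        self.requests += 1
        keys = ('id', 'dataProject') if self.enumerate_has_project else ('id',)
        return {'items': [{k: s[k] for k in keys} for s in self.snapshots]}

    def retrieve_snapshot(self, snapshot_id: str) -> Snapshot:
        # One request returns the full record of a single snapshot.
        self.requests += 1
        return next(s for s in self.snapshots if s['id'] == snapshot_id)


def data_projects(tdr: FakeTDR) -> Dict[str, str]:
    """Map snapshot ID to Google project, retrieving each snapshot
    individually only if the enumeration response lacks the property."""
    result = {}
    for item in tdr.enumerate_snapshots()['items']:
        if 'dataProject' not in item:
            item = tdr.retrieve_snapshot(item['id'])
        result[item['id']] = item['dataProject']
    return result


def count_requests(n: int, enumerate_has_project: bool) -> int:
    """Return the number of requests needed to resolve n snapshots."""
    snapshots = [{'id': f'snapshot-{i}', 'dataProject': 'broad-jade-dev-data'}
                 for i in range(n)]
    tdr = FakeTDR(snapshots, enumerate_has_project)
    assert len(data_projects(tdr)) == n
    return tdr.requests
```

Under these assumptions, `count_requests(100, False)` issues 101 requests, while `count_requests(100, True)` issues a single one, which is the saving the proposed change is meant to deliver.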