ga4gh / cloud-interop-testing

Interoperable execution of workflows using GA4GH APIs
Apache License 2.0

Add WES and Discovery Search to FASP Python examples #108

Open ianfore opened 4 years ago

ianfore commented 4 years ago

https://github.com/ianfore/FASPclient has example Python code to

  1. query data
  2. obtain URLs via DRS
  3. run a compute

It currently queries BigQuery directly and submits a pipeline directly to GCP Life Sciences. We want to convert the script so that steps 1 and 3 use the equivalent GA4GH APIs.
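As a rough sketch of what steps 1 and 2 could look like against GA4GH endpoints, the helpers below only build the requests and read a DRS response; they make no network calls. The base URLs are placeholders, the function names are made up for illustration and are not part of FASPclient, and the `/search` path and `{"query": ...}` body assume the Discovery Search API as drafted at the time.

```python
import json
from urllib.parse import quote


def search_request(base_url, sql):
    """Build the POST for a Discovery Search query (step 1).

    Assumes the server exposes POST /search taking {"query": <sql>}.
    """
    return {
        "method": "POST",
        "url": base_url.rstrip("/") + "/search",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"query": sql}),
    }


def drs_object_url(base_url, drs_id):
    """URL of the DRS v1 object record for one id (step 2)."""
    return base_url.rstrip("/") + "/ga4gh/drs/v1/objects/" + quote(drs_id, safe="")


def pick_access(drs_object, preferred_type="https"):
    """Pick an access method of the preferred type from a DRS object.

    Returns ("url", <url>) when the object carries a direct access_url,
    ("access_id", <id>) when a second call to .../access/<id> is needed,
    or None when no method of that type is present.
    """
    for method in drs_object.get("access_methods", []):
        if method.get("type") == preferred_type:
            if "access_url" in method:
                return ("url", method["access_url"]["url"])
            if "access_id" in method:
                return ("access_id", method["access_id"])
    return None
```

The URL returned by `pick_access` (or by the follow-up access call) is what would be handed to the compute step.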

Creating issue here to track the following

ianfore commented 4 years ago

8/11 session

WES
Worked through submitting a WES job (MD5 checksum) on a file specified as a URL obtained from DRS. See checksum.wdl in FASPclient. For debugging purposes we ended up doing this with a relatively small file. Submitted everything via Postman.

Discovery
Identified 1000 Genomes views in BigQuery which link subject and specimen data with ids for .bam files. These ids work as DRS ids for the BioDataCatalyst DRS server. The views were created as queries on a table which is an import of PFB (Avro) from BDC. Because of how Presto handles views, we created a table to be used for Discovery Search.
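For anyone repeating the WES submission from a script rather than Postman, the form fields could be assembled roughly as follows. This is a sketch under assumptions: the field names follow the WES v1 `RunWorkflow` request, but the WDL input name `md5.inputFile` and the version string `1.0` are placeholders, since the actual inputs of checksum.wdl are not shown in this thread.

```python
import json


def wes_runs_url(base_url):
    """Endpoint that accepts new workflow runs in WES v1."""
    return base_url.rstrip("/") + "/ga4gh/wes/v1/runs"


def wes_run_form(workflow_url, file_url, input_name="md5.inputFile"):
    """Form fields for a WES run that checksums one file.

    input_name is hypothetical; use the input declared in the WDL.
    workflow_params must be a JSON string, not a nested object.
    """
    return {
        "workflow_url": workflow_url,
        "workflow_type": "WDL",
        "workflow_type_version": "1.0",  # assumed; match the WDL actually used
        "workflow_params": json.dumps({input_name: file_url}),
    }
```

The resulting dictionary would be sent as multipart form data to `wes_runs_url(...)`, with `file_url` being the URL obtained from DRS.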

Additional tasks based on 8/11 session

ianfore commented 4 years ago

Created the onek_genomes dataset with a lower-cased name per Jonathan's request; Presto needs lower-case names. The ssd_drs table is also lower-cased. Granted the BigQuery Viewer role on the dataset to the DNAstack service accounts used for Presto.
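A small helper like the one below could guard against mixed-case names slipping into queries sent through Presto. It is only an illustration: the catalog name `bigquery` in the usage note is a guess, not taken from the actual Presto configuration.

```python
def presto_table_name(catalog, schema, table):
    """Return a fully qualified, double-quoted, lower-cased Presto table name."""
    parts = [catalog, schema, table]
    return ".".join('"' + part.lower() + '"' for part in parts)
```

For example, `presto_table_name("bigquery", "onek_genomes", "ssd_drs")` yields `"bigquery"."onek_genomes"."ssd_drs"`.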