NCI-DCEG / Flow-IQ

A workflow cloud migration toolkit

Import and run Sarek on the curated dataset on Biowulf and AWS HealthOmics #2

Open shukwong opened 1 month ago

shukwong commented 1 month ago

Import and run Sarek on the curated dataset on Biowulf and AWS HealthOmics. Test for performance on both platforms.

jaamarks commented 1 week ago

The benchmarking section of the nf-core/sarek documentation says:

On each release, the pipeline is run on 3 full size tests:

  • test_full runs tumor-normal data for one patient from the SEQC2 consortium
  • test_full_germline runs a WGS 30X Genome-in-a-Bottle (NA12878) dataset
  • test_full_germline_ncbench_agilent runs two WES samples with 75M and 200M reads (data available here). The results are uploaded to Zenodo, evaluated against a truth dataset, and results are made available via the NCBench dashboard.

So we can likely utilize these datasets for testing.
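
For reference, a minimal sketch of how the three full-size test runs could be launched with plain Nextflow (e.g. on Biowulf). The profile names come from the list above; the `singularity` container profile, the `results/` output layout, and the `build_command` helper are assumptions for illustration, not something taken from the Sarek docs or this repo. Running on HealthOmics would still require importing the workflow there first, which this does not cover.

```python
import shlex

# Test profile names quoted from the nf-core/sarek benchmarking section.
FULL_SIZE_PROFILES = [
    "test_full",
    "test_full_germline",
    "test_full_germline_ncbench_agilent",
]


def build_command(test_profile, container_profile="singularity", outdir_root="results"):
    """Return the argument list for one full-size Sarek test run.

    container_profile and outdir_root are assumed defaults (hypothetical),
    chosen only to make the sketch concrete.
    """
    return [
        "nextflow", "run", "nf-core/sarek",
        "-profile", f"{test_profile},{container_profile}",
        "--outdir", f"{outdir_root}/{test_profile}",
    ]


# One command per full-size test; printed rather than executed.
commands = [build_command(p) for p in FULL_SIZE_PROFILES]
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping one output directory per profile should make it easier to compare run times and reports between platforms later.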