icgc-argo / workflow-roadmap

Roadmap and management for genomic data processing
GNU Affero General Public License v3.0

DNA-Seq pipeline Dev - Alignment workflow in nf-core framework #376

Open edsu7 opened 1 year ago

edsu7 commented 1 year ago

Draft copy in: https://github.com/icgc-argo-workflows/dnaaln

Testing commands:

  1. Full pipeline walk-through when the sequencing files and payload are local
    nextflow run main.nf -profile debug_qa,docker --api_token NNNNNNNN-NNNN-NNNN-NNNN-NNNNNNNNNNNN --local_sequencing_json test/data/local_sequencing.json --local_data_directory test/data --local true --reference_fasta test/reference/GRCh38_Verily_v1.fasta --tools index,aln,oxo_qc,aln_qc,rg_qc,up_qc,up_aln,cleanup
  2. After copying the generated .fai, .dict, .0123, .amb, .ann, .bwt.2bit.64 and .pac files from step 1 to test/data, the pipeline should be able to run without the indexing step
    nextflow run main.nf -profile debug_qa,docker --api_token NNNNNNNN-NNNN-NNNN-NNNN-NNNNNNNNNNNN --local_sequencing_json test/data/local_sequencing.json --local_data_directory test/data --local true --reference_fasta test/reference/GRCh38_Verily_v1.fasta --tools aln,oxo_qc,aln_qc,rg_qc,up_qc,up_aln,cleanup
  3. Uploading alignment payload locally
    nextflow run main.nf -profile debug_qa,docker --api_token NNNNNNNN-NNNN-NNNN-NNNN-NNNNNNNNNNNN --local_alignment_json test/data/local_alignment.json --local_data_directory test/data --local true --reference_fasta test/reference/GRCh38_Verily_v1.fasta --tools up_aln
  4. Uploading QC payload locally
    nextflow run main.nf -profile debug_qa,docker --api_token NNNNNNNN-NNNN-NNNN-NNNN-NNNNNNNNNNNN --local_qc_json test/data/local_qc.json --local_data_directory test/data --local true --reference_fasta test/reference/GRCh38_Verily_v1.fasta --tools up_qc
  5. Alignment via downloading sequencing files from SONG/SCORE
    nextflow run main.nf -profile debug_qa,docker --api_token NNNNNNNN-NNNN-NNNN-NNNN-NNNNNNNNNNNN --analysis_id 026e7dbd-8a7b-4ee1-ae7d-bd8a7b0ee120 --study_id TEST-QA --tools aln,oxo_qc,aln_qc,rg_qc,up_qc,up_aln,cleanup --reference_fasta test/reference/GRCh38_Verily_v1.fasta
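
The copy in step 2 could be scripted roughly as follows. This is only a sketch: the function name `copy_reference_indices` and both directory arguments are hypothetical, since the thread does not say where step 1 writes its index files. Only the suffix list comes from step 2.

```shell
# Hypothetical helper (not part of the pipeline): copy every file in a source
# directory whose name ends in one of the index suffixes listed in step 2.
# Usage: copy_reference_indices <dir_with_step1_indices> <destination_dir>
copy_reference_indices() {
  local src_dir="$1" dest_dir="$2"
  local suffix file found missing=0
  mkdir -p "$dest_dir" || return 1
  for suffix in .fai .dict .0123 .amb .ann .bwt.2bit.64 .pac; do
    found=0
    for file in "$src_dir"/*"$suffix"; do
      # An unmatched glob stays literal, so skip anything that does not exist.
      [ -e "$file" ] || continue
      cp "$file" "$dest_dir"/ && found=1
    done
    if [ "$found" -eq 0 ]; then
      echo "warning: no *$suffix file found in $src_dir" >&2
      missing=1
    fi
  done
  # Non-zero when at least one suffix had no matching file.
  return "$missing"
}
```

For example, `copy_reference_indices path/to/step1/output test/data`, where the first argument is whatever directory holds the step 1 indices. A non-zero return means at least one of the seven suffixes was not found, in which case `index` should probably stay in `--tools`.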

Testing notes:

edsu7 commented 1 year ago

Re-running alignments to benchmark results. Analyses will be saved to QA.

| study_id | sampleId | submission analysisId | qc_metrics analysisId | sequencing_alignment analysisId |
| --- | --- | --- | --- | --- |
| MUTO-INTL | SA626533 | 4fe73219-8af7-4d61-a732-198af7dd6151 | d267c1a5-2862-464c-a7c1-a52862464c69 | 28e10fbf-d687-45e7-a10f-bfd68775e7b3 |
| MUTO-INTL | SA626532 | fee8e2e5-b646-4da3-a8e2-e5b6462da3a7 | 77ce5181-8395-49bb-8e51-81839519bb15 | 3f93bafc-affd-455f-93ba-fcaffd055f48 |
| OCCAMS-GB | SA607699 | 003c2b59-3ff3-4214-bc2b-593ff3121409 | 626caf7d-d5a3-4f37-acaf-7dd5a3ef3708 | b8231347-2568-436f-a313-472568f36ff6 |
| OCCAMS-GB | SA607698 | 220fe31f-c59d-4773-8fe3-1fc59d577359 | d7eefe63-9768-402d-aefe-639768c02dc4 | 67f53e8f-f666-4e49-b53e-8ff666be4973 |
| APGI-AU | SA410795 | 2c755f62-1929-41a6-b55f-62192991a6bb | 6eed7544-c292-4560-ad75-44c292e56012 | 2aa2c0ce-bf88-47fc-a2c0-cebf8817fc62 |
| APGI-AU | SA410803 | e2131db5-cf23-468f-931d-b5cf23568fe5 | 2a4682ff-ab3b-4844-8682-ffab3b884480 | d4cc4774-061a-45a6-8c47-74061ad5a669 |
| PACA-CA | SA600959 | 2529fa5a-efd8-46ec-a9fa-5aefd856ec9c | 3f2c443e-c890-4b8d-ac44-3ec890ab8dd1 | 7fe39382-28db-4680-a393-8228db26802d |
| PACA-CA | SA600960 | 2f0df339-1182-4135-8df3-391182b135a0 | 8615330f-fb18-45ec-9533-0ffb1835ecf9 | 69d5e304-0a3e-45de-95e3-040a3e95de0c |
| LUCA-KR | SA520548 | 02839a76-4c01-478e-839a-764c01f78e02 | 169e6343-b85d-4386-9e63-43b85d738611 | f2e593e9-8bdb-4f0c-a593-e98bdb2f0c19 |
| LUCA-KR | SA520549 | 04729a68-0ad6-47f5-b29a-680ad627f53d | 94707a01-9291-450e-b07a-019291c50e8c | 5db628a5-4e2d-4be5-b628-a54e2d6be5ef |

| study_id | sampleId | BWA qc_metrics analysisId | BWA sequencing_alignment analysisId | BWAmem2 qc_metrics analysisId | BWAmem2 sequencing_alignment analysisId |
| --- | --- | --- | --- | --- | --- |
| MUTO-INTL | SA626532 | 23cc87b1-a9a3-4353-8c87-b1a9a3b35378 | 05ef738e-c1bd-47f0-af73-8ec1bd67f053 | e565c6bf-1ea5-42d9-a5c6-bf1ea502d975 | 66af2bce-3c69-4c8d-af2b-ce3c693c8d0d |
| MUTO-INTL | SA626533 | 20d7bf3c-4d4b-4a71-97bf-3c4d4b0a7167 | 4ea2f5e8-0405-42aa-a2f5-e8040572aabc | be06b16f-f0e2-4536-86b1-6ff0e2c536d6 | e9d43ba2-bcff-484f-943b-a2bcffd84fad |
| OCCAMS-GB | SA607699 | cb5d4a49-81de-4122-9d4a-4981deb12259 | 8bf37cb6-b6fe-4008-b37c-b6b6fe6008ad | bfd4a626-9de2-490b-94a6-269de2990b98 | affbf0b9-9499-428e-bbf0-b99499128e44 |
| OCCAMS-GB | SA607698 | b09b2280-e8fc-47d4-9b22-80e8fc27d485 | 76ab9223-63e8-4252-ab92-2363e8125206 | a145c809-74d4-406f-85c8-0974d4c06f56 | f10d4a9c-94da-452a-8d4a-9c94da052a23 |
| APGI-AU | SA410795 | 1beb3c43-5d4e-43a6-ab3c-435d4ea3a694 | 0df8773f-b56d-4eeb-b877-3fb56dbeebfe | 7d8c4e94-6069-4a1f-8c4e-946069fa1f9a | 9e61db2d-c6c3-40a4-a1db-2dc6c3e0a496 |
| APGI-AU | SA410803 | 2da01616-c406-4d2c-a016-16c4062d2cca | af100481-8119-4b8a-9004-818119fb8a43 | f8fae855-ed46-47a1-bae8-55ed4667a1cd | d20b0507-52ee-42c0-8b05-0752eea2c0a3 |
| PACA-CA | SA600959 | 5721b452-5caa-4dca-a1b4-525caacdca7b | 3c23bc30-b2f4-4841-a3bc-30b2f43841e2 | 8529bdb7-148f-444c-a9bd-b7148f144c69 | 6141718f-488c-4940-8171-8f488c194069 |
| PACA-CA | SA600960 | f48d238a-235a-4d95-8d23-8a235a1d9550 | 479cbb74-6392-472e-9cbb-746392472e11 | f66e5338-03ae-40fc-ae53-3803ae70fc05 | 0e2ca59a-40ee-48e7-aca5-9a40ee68e7f6 |
| LUCA-KR | SA520548 | 8d5c5160-63a3-488c-9c51-6063a3b88c0a | decdb673-60c8-4ff6-8db6-7360c86ff671 | fdfa5730-4c53-4ac8-ba57-304c538ac8b0 | b016acff-7874-405a-96ac-ff7874405abc |
| LUCA-KR | SA520549 | 53413615-8218-4448-8136-158218f44849 | f95e8d71-5244-4f7e-9e8d-715244af7e52 | 5a4cd61e-b112-402e-8cd6-1eb112f02e00 | 318e6eca-b1e3-4489-8e6e-cab1e34489e7 |

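
For reference, re-running the alignments over many samples could be driven by a small loop like the sketch below. It only prints one command per sample, in the shape of testing command 5, and runs nothing. The function name `print_rerun_commands` and the `API_TOKEN` variable are hypothetical. It also assumes the submission analysisId from the first table is the value `--analysis_id` expects, which the thread does not state outright.

```shell
# Hypothetical dry-run helper: read "study_id analysis_id" pairs from stdin
# and print one nextflow command per pair, modelled on testing command 5.
# The token is left as a literal "$API_TOKEN" reference so no secret is
# embedded in the printed commands.
print_rerun_commands() {
  local tools="$1" reference_fasta="$2"
  local study_id analysis_id rest
  while read -r study_id analysis_id rest; do
    # Skip blank or incomplete lines.
    [ -n "$study_id" ] && [ -n "$analysis_id" ] || continue
    printf 'nextflow run main.nf -profile debug_qa,docker --api_token "$API_TOKEN" --analysis_id %s --study_id %s --tools %s --reference_fasta %s\n' \
      "$analysis_id" "$study_id" "$tools" "$reference_fasta"
  done
}

# Example with the first MUTO-INTL row (study_id, submission analysisId).
print_rerun_commands aln,oxo_qc,aln_qc,rg_qc,up_qc,up_aln,cleanup \
  test/reference/GRCh38_Verily_v1.fasta <<'EOF'
MUTO-INTL 4fe73219-8af7-4d61-a732-198af7dd6151
EOF
```

Printing the commands first makes it easy to check them by eye before running anything against QA.
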
edsu7 commented 11 months ago

@edsu7 to perform follow up comparison and analyses

edsu7 commented 9 months ago

Comparison blocked because of a Score issue: https://github.com/overture-stack/score/issues/385

justincorrigible commented 9 months ago

Blocker resolved with release of Score v5.10.0; ticket may proceed

edsu7 commented 2 months ago

Rerunning benchmarking on 3 MUTO-INTL for BWAMEM2

edsu7 commented 4 weeks ago

Benchmarking PowerPoint ready.

Testing in QA delayed due to: https://github.com/icgc-argo/workflow-roadmap/issues/452