Background
Cell Ranger writes many metrics files that are used for quality control. I have written scripts that scrape these files into files that MultiQC can read.
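To illustrate what "scraping" means here, below is a minimal sketch, not the code in this PR. It assumes a one-row `metrics_summary.csv` (a header line plus one line of values, with thousands separators and percent signs) and wraps the parsed values in a MultiQC custom-content table. The function names, the section id `cellranger_count_metrics`, and the output file name are hypothetical.

```python
import csv
import json
from pathlib import Path


def clean_value(raw: str):
    """Turn '1,234' into 1234 and '95.3%' into 95.3; leave other text as-is."""
    text = raw.strip().replace(",", "")
    if text.endswith("%"):
        text = text[:-1]
    try:
        number = float(text)
    except ValueError:
        return raw.strip()
    return int(number) if number.is_integer() else number


def read_metrics_summary(path: Path) -> dict:
    """Read a one-row metrics CSV (header line plus one line of values)."""
    with path.open(newline="") as handle:
        row = next(csv.DictReader(handle))
    return {name: clean_value(value) for name, value in row.items()}


def build_mqc_table(metrics_by_sample: dict) -> dict:
    """Wrap per-sample metrics in a MultiQC custom-content table section."""
    return {
        "id": "cellranger_count_metrics",  # hypothetical section id
        "section_name": "Cell Ranger count metrics",
        "plot_type": "table",
        "data": metrics_by_sample,
    }


def write_mqc_file(section: dict, out_path: Path) -> None:
    """Write the section as JSON; MultiQC picks up files ending in _mqc.json."""
    out_path.write_text(json.dumps(section, indent=2))
```

A call such as `write_mqc_file(build_mqc_table({"sample-1": read_metrics_summary(path)}), Path("cellranger_mqc.json"))` would then produce one file per project for MultiQC to pick up.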
Problem/Issue
My scraping code is hard to read and has a lot of bugs.
What have I done
Introduced stub data so that the pipeline can be tested more thoroughly locally. The stub runs exposed many bugs. I also rewrote part of count2multiqc in Python for readability.
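As a rough illustration of the stub-data idea, the sketch below writes a placeholder `metrics_summary.csv` into the `<sample>/outs/` layout that `cellranger count` produces, so that downstream scraping steps have something to read during a local test. This is an assumption about how stub data could be generated, not a description of the stub files in this PR; the helper name and all metric values are made-up placeholders.

```python
import csv
from pathlib import Path

# Placeholder values only; they do not come from a real run.
STUB_METRICS = {
    "Estimated Number of Cells": "1,000",
    "Mean Reads per Cell": "50,000",
    "Valid Barcodes": "97.5%",
}


def write_stub_count_output(root: Path, sample_id: str) -> Path:
    """Create <root>/<sample_id>/outs/metrics_summary.csv with stub values."""
    outs = root / sample_id / "outs"
    outs.mkdir(parents=True, exist_ok=True)
    metrics_file = outs / "metrics_summary.csv"
    with metrics_file.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(STUB_METRICS))
        writer.writeheader()
        # Values containing commas are quoted automatically by the csv module.
        writer.writerow(STUB_METRICS)
    return metrics_file
```

Keeping the thousands separators and percent signs in the stub values matters, because those are the formatting details a scraper has to handle.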
Checklist
I have:
[x] Tested with Stub run
[x] Tested on our HPC
Command + output
(base) ➜ singleCellWorkflows git:(fix-flex-mqc-yaml) nextflow run main.nf --samplesheet /Users/jacobkarlstrom/Projects/singleCellWorkflows/examples/CTG_SampleSheet.csv \
--outdir /Users/jacobkarlstrom/Projects/singleCellWorkflows/examples/output \
-profile local_dev \
-stub-run --nextflow_log $(readlink -f nextflow.log)
[55/d79cb8] process > GET_ANALYSISES [100%] 1 of 1 ✔
[3e/79423b] process > SCRNASEQ:SPLITSHEET [100%] 1 of 1 ✔
[ca/e5640a] process > SCRNASEQ:FASTQC (sample-3) [100%] 3 of 3 ✔
[8c/616ab7] process > SCRNASEQ:COUNT (sample-2) [100%] 3 of 3 ✔
[2b/0b67c5] process > SCRNASEQ:FINISH_PROJECTS:MULTIQC (project1) [100%] 2 of 2 ✔
[af/5ea095] process > SCRNASEQ:FINISH_PROJECTS:SYNC_MULTIQC (project1) [100%] 2 of 2 ✔
[e0/e6cdb6] process > SCRNASEQ:FINISH_PROJECTS:PACK_WEBSUMMARIES (project1) [100%] 2 of 2 ✔
[b5/330a07] process > SCRNASEQ:FINISH_PROJECTS:PUBLISH_MANIFEST (project1) [100%] 2 of 2 ✔
[ba/41637a] process > SCRNASEQ:FINISH_PROJECTS:MD5SUM (project1) [100%] 2 of 2 ✔
[47/197cac] process > SCRNASEQ:FINISH_PROJECTS:DELIVER_PROJ (project1) [100%] 2 of 2 ✔
[b4/24e74d] process > FLEX_SCRNASEQ:SPLITSHEET [100%] 1 of 1 ✔
[ae/ff2053] process > FLEX_SCRNASEQ:FASTQC (sample-7) [100%] 3 of 3 ✔
[e5/73f1b1] process > FLEX_SCRNASEQ:SPLIT_MULTIPLEX_SHEET (3) [100%] 3 of 3 ✔
[3a/212b4d] process > FLEX_SCRNASEQ:GEN_FLEX_CONFIG (2) [100%] 3 of 3 ✔
[2f/821faa] process > FLEX_SCRNASEQ:MULTI (sample-7) [100%] 3 of 3 ✔
[71/e85aff] process > FLEX_SCRNASEQ:CELLRANGER_MULTI_TO_MULTIQC (2) [100%] 2 of 2 ✔
[39/831ee2] process > FLEX_SCRNASEQ:FINISH_PROJECTS:MULTIQC (project4) [100%] 2 of 2 ✔
[9b/90a5dc] process > FLEX_SCRNASEQ:FINISH_PROJECTS:SYNC_MULTIQC (project4) [100%] 2 of 2 ✔
[a0/a42000] process > FLEX_SCRNASEQ:FINISH_PROJECTS:PACK_WEBSUMMARIES (project4) [100%] 2 of 2 ✔
[59/6284db] process > FLEX_SCRNASEQ:FINISH_PROJECTS:PUBLISH_MANIFEST (project4) [100%] 2 of 2 ✔
[3d/c3e865] process > FLEX_SCRNASEQ:FINISH_PROJECTS:MD5SUM (project4) [100%] 2 of 2 ✔
[90/512862] process > FLEX_SCRNASEQ:FINISH_PROJECTS:DELIVER_PROJ (project5) [100%] 2 of 2 ✔
[db/62a278] process > SCCITESEQ:SPLITSHEET [100%] 1 of 1 ✔
[01/49539a] process > SCCITESEQ:FASTQC (sample-6) [100%] 3 of 3 ✔
[de/a9bc98] process > SCCITESEQ:GENERATE_LIB_CSV (1) [100%] 1 of 1 ✔
[c7/0e9c3d] process > SCCITESEQ:FILTER_FEATURE_REFERENCE (sample-4) [100%] 1 of 1 ✔
[30/b63b8f] process > SCCITESEQ:COUNT (sample-4) [100%] 1 of 1 ✔
[b3/172856] process > SCCITESEQ:CELLRANGER_COUNT_TO_MULTIQC [100%] 1 of 1 ✔
[29/75bc74] process > SCCITESEQ:FINISH_PROJECTS:MULTIQC (project3) [100%] 1 of 1 ✔
[04/41a044] process > SCCITESEQ:FINISH_PROJECTS:SYNC_MULTIQC (project3) [100%] 1 of 1 ✔
[9b/fe6b9f] process > SCCITESEQ:FINISH_PROJECTS:PACK_WEBSUMMARIES (project3) [100%] 1 of 1 ✔
[6a/a91922] process > SCCITESEQ:FINISH_PROJECTS:PUBLISH_MANIFEST (project3) [100%] 1 of 1 ✔
[79/b18a6b] process > SCCITESEQ:FINISH_PROJECTS:MD5SUM (project3) [100%] 1 of 1 ✔
[63/ad5748] process > SCCITESEQ:FINISH_PROJECTS:DELIVER_PROJ (project3) [100%] 1 of 1 ✔
[5b/604dc4] process > SC_ATAC:SPLITSHEET [100%] 1 of 1 ✔
[35/471a33] process > SC_ATAC:FASTQC (sample-15) [100%] 1 of 1 ✔
[1c/f3d0a0] process > SC_ATAC:COUNT_ATAC (sample-15) [100%] 1 of 1 ✔
[3e/16a4b6] process > SC_ATAC:FINISH_PROJECTS:MULTIQC (project8) [100%] 1 of 1 ✔
[d6/d1b428] process > SC_ATAC:FINISH_PROJECTS:SYNC_MULTIQC (project8) [100%] 1 of 1 ✔
[98/92b662] process > SC_ATAC:FINISH_PROJECTS:PACK_WEBSUMMARIES (project8) [100%] 1 of 1 ✔
[79/cf2675] process > SC_ATAC:FINISH_PROJECTS:PUBLISH_MANIFEST (project8) [100%] 1 of 1 ✔
[e0/ffbaff] process > SC_ATAC:FINISH_PROJECTS:MD5SUM (project8) [100%] 1 of 1 ✔
[81/5e825c] process > SC_ATAC:FINISH_PROJECTS:DELIVER_PROJ (project8) [100%] 1 of 1 ✔
[c4/065f66] process > SC_ARC:SPLITSHEET [100%] 1 of 1 ✔
[13/903d92] process > SC_ARC:FASTQC (sample-22) [100%] 6 of 6 ✔
[ad/a8c87f] process > SC_ARC:GENERATE_LIB_CSV (2) [100%] 3 of 3 ✔
[33/001125] process > SC_ARC:COUNT_ARC (sample-17) [100%] 3 of 3 ✔
[3d/8636b4] process > SC_ARC:CELLRANGER_COUNT_TO_MULTIQC [100%] 1 of 1 ✔
[18/9638d9] process > SC_ARC:FINISH_PROJECTS:MULTIQC (project6) [100%] 2 of 2 ✔
[29/0b5826] process > SC_ARC:FINISH_PROJECTS:SYNC_MULTIQC (project6) [100%] 2 of 2 ✔
[91/06e891] process > SC_ARC:FINISH_PROJECTS:PACK_WEBSUMMARIES (project6) [100%] 2 of 2 ✔
[c0/770de1] process > SC_ARC:FINISH_PROJECTS:PUBLISH_MANIFEST (project6) [100%] 2 of 2 ✔
[03/db78c5] process > SC_ARC:FINISH_PROJECTS:MD5SUM (project6) [100%] 2 of 2 ✔
[be/378fcc] process > SC_ARC:FINISH_PROJECTS:DELIVER_PROJ (project6) [100%] 2 of 2 ✔
[c1/b0e21d] process > SCMULTI:SPLITSHEET [100%] 1 of 1 ✔
[2f/4d79d8] process > SCMULTI:FASTQC (sample-21) [100%] 6 of 6 ✔
[be/a0500f] process > SCMULTI:GENERATE_MULTI_CONFIG (1) [100%] 2 of 2 ✔
[f1/b1e0be] process > SCMULTI:MULTI (sample-19) [100%] 2 of 2 ✔
[95/c8cd6d] process > SCMULTI:CELLRANGER_MULTI_TO_MULTIQC (1) [100%] 1 of 1 ✔
[1d/b78b53] process > SCMULTI:FINISH_PROJECTS:MULTIQC (project7) [100%] 1 of 1 ✔
[d5/2b7015] process > SCMULTI:FINISH_PROJECTS:SYNC_MULTIQC (project7) [100%] 1 of 1 ✔
[4d/edb297] process > SCMULTI:FINISH_PROJECTS:PACK_WEBSUMMARIES (project7) [100%] 1 of 1 ✔
[57/451484] process > SCMULTI:FINISH_PROJECTS:PUBLISH_MANIFEST (project7) [100%] 1 of 1 ✔
[13/f84ad7] process > SCMULTI:FINISH_PROJECTS:MD5SUM (project7) [100%] 1 of 1 ✔
[a4/1de1fe] process > SCMULTI:FINISH_PROJECTS:DELIVER_PROJ (project7) [100%] 1 of 1 ✔
[16/e4c292] process > VISIUM:SPLITSHEET [100%] 1 of 1 ✔
[4e/3eabe5] process > VISIUM:FASTQC (sample-18) [100%] 1 of 1 ✔
[e2/255cb0] process > VISIUM:SPACECOUNT (1) [100%] 1 of 1 ✔
[a2/7deeff] process > VISIUM:CELLRANGER_COUNT_TO_MULTIQC [100%] 1 of 1 ✔
[53/51bf45] process > VISIUM:FINISH_PROJECTS:MULTIQC (project10) [100%] 1 of 1 ✔
[38/adce6f] process > VISIUM:FINISH_PROJECTS:SYNC_MULTIQC (project10) [100%] 1 of 1 ✔
[2b/c3ca94] process > VISIUM:FINISH_PROJECTS:PACK_WEBSUMMARIES (project10) [100%] 1 of 1 ✔
[58/00ced8] process > VISIUM:FINISH_PROJECTS:PUBLISH_MANIFEST (project10) [100%] 1 of 1 ✔
[24/6a2ab0] process > VISIUM:FINISH_PROJECTS:MD5SUM (project10) [100%] 1 of 1 ✔
[95/9ce69c] process > VISIUM:FINISH_PROJECTS:DELIVER_PROJ (project10) [100%] 1 of 1 ✔
Multiqc output