ammaraziz commented 8 months ago

[x] add second scrubby step to keep chlamydia reads only
[ ] consensus sequences:
- [x] use ragtag for scaffolding
- [ ] separate genome from plasmid
- [ ] collate genomes in single output folder
- [ ] collate plasmids in single output folder
- [ ] add sample name to fasta headers
[x] add mlst typing for two existing schemes
[x] remove bbnorm as shovill can downsample
[x] collate blast genotyping results
[x] collate coverage
[x] use config to specify cpus used
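
For the open consensus-sequence items (collating FASTA files into one output folder and adding the sample name to the headers), a minimal Python sketch could look like the one below. The function name `collate_fastas`, the `{sample}.fasta` output naming, and the `sample_` header prefix are illustrative assumptions, not what the pipeline currently does.

```python
from pathlib import Path


def collate_fastas(assemblies, out_dir):
    """Copy each FASTA into out_dir, prefixing headers with the sample name.

    assemblies: dict mapping sample name -> path to that sample's FASTA.
    Returns the list of files written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for sample, src in assemblies.items():
        dest = out_dir / f"{sample}.fasta"  # hypothetical naming scheme
        with open(src) as fin, open(dest, "w") as fout:
            for line in fin:
                if line.startswith(">"):
                    # ">contig1 len=100" becomes ">sampleA_contig1 len=100"
                    line = f">{sample}_{line[1:]}"
                fout.write(line)
        written.append(dest)
    return written
```

The same function could be called twice, once with the genome FASTA files and once with the plasmid FASTA files, to produce the two separate output folders.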

ammaraziz commented 8 months ago

@gokeson for extracting chlamydia reads, which taxonomic level should we use? https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=813

Chlamydiota -> Chlamydiia -> Chlamydiales -> Chlamydiaceae -> Chlamydia/Chlamydophila group -> Chlamydia

I usually select a rank a few levels above species to make sure we capture as many isolate reads as possible.

gokeson commented 8 months ago

Chlamydiales is two ranks above the Chlamydia genus. Here is the flow: Chlamydiales -> Chlamydiaceae -> Chlamydia trachomatis
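
To make the "N ranks above" choice explicit, here is a small Python sketch that walks up the lineage quoted earlier in this thread. The rank labels are an assumption taken from how NCBI Taxonomy usually labels these nodes, and `ranked_ancestor` is a hypothetical helper, not part of scrubby or kraken. Unranked nodes such as the "Chlamydia/Chlamydophila group" are skipped when counting.

```python
# Lineage from the NCBI Taxonomy page linked above; rank labels are assumed.
LINEAGE = [
    ("Chlamydiota", "phylum"),
    ("Chlamydiia", "class"),
    ("Chlamydiales", "order"),
    ("Chlamydiaceae", "family"),
    ("Chlamydia/Chlamydophila group", "no rank"),
    ("Chlamydia", "genus"),
]


def ranked_ancestor(lineage, taxon, ranks_up):
    """Return the taxon `ranks_up` named ranks above `taxon`, ignoring 'no rank' nodes."""
    ranked = [name for name, rank in lineage if rank != "no rank"]
    idx = ranked.index(taxon)  # raises ValueError if taxon is not in the lineage
    if ranks_up > idx:
        raise ValueError(f"lineage has only {idx} ranks above {taxon}")
    return ranked[idx - ranks_up]


print(ranked_ancestor(LINEAGE, "Chlamydia", 2))
```

With these labels, two ranks above the genus gives Chlamydiales, which matches the suggestion above.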

gokeson commented 8 months ago

A suggestion for phylogeny: we can collate all the de novo fasta files into one folder and then run ksnp from there. I am currently testing ksnp here

gokeson commented 8 months ago

some quick fixes:

rule index: prefix = "resources/ctReferences" (changed from "ctReference" to "ctReferences")

scaffold.yaml, under dependencies: ragtag (changed from "rag-tag" to "ragtag")

gokeson commented 8 months ago

error:

RuleException in rule scrub in file /home/oolago2/assembly_pipeline/fullTest_Dec23/CtGAP_test/ctgap/workflow/rules/2-scrub.smk, line 1: AttributeError: 'OutputFiles' object has no attribute 'r1tmp', when formatting the following:

    scrubby scrub-reads     -i {input.r1} {input.r2}        -o {output.r1tmp} {output.r2tmp}        --kraken-db {params.db}         --kraken-taxa "Archaea Eukaryota Holozoa Nucletmycea"   --min-len {params.minlen}       --minimap2-index {params.human}         --kraken-threads {threads}      --workdir {params.workdir:q} 2> {log}

    echo -e "

Scrubby Kraken Extract " >> {log}

    scrubby scrub-kraken    -i {output.r1tmp} {output.r2tmp}        -o {output.r1} {output.r2}      --extract       --kraken-taxa {params.kraken_taxa_extract}      --kraken-reads {params.workdir}/0-standardDB.kraken     --kraken-report {params.workdir}/0-standardDB.report    --kraken-threads {threads} 2>> {log}

    touch {output.status}

gokeson commented 8 months ago

separate genome from plasmid

I think this is not a top priority for now. Unless you've done it already.

gokeson commented 8 months ago

error:

RuleException in rule scrub in file /home/oolago2/assembly_pipeline/fullTest_Dec23/CtGAP_test/ctgap/workflow/rules/2-scrub.smk, line 1: AttributeError: 'OutputFiles' object has no attribute 'r1tmp', when formatting the following:

    scrubby scrub-reads     -i {input.r1} {input.r2}        -o {output.r1tmp} {output.r2tmp}        --kraken-db {params.db}         --kraken-taxa "Archaea Eukaryota Holozoa Nucletmycea"   --min-len {params.minlen}       --minimap2-index {params.human}         --kraken-threads {threads}      --workdir {params.workdir:q} 2> {log}

    echo -e "

Scrubby Kraken Extract " >> {log}

    scrubby scrub-kraken    -i {output.r1tmp} {output.r2tmp}        -o {output.r1} {output.r2}      --extract       --kraken-taxa {params.kraken_taxa_extract}      --kraken-reads {params.workdir}/0-standardDB.kraken     --kraken-report {params.workdir}/0-standardDB.report    --kraken-threads {threads} 2>> {log}

    touch {output.status}

this error persists:

RuleException in rule scrub in file /home/oolago2/assembly_pipeline/fullTest_Dec23/CtGAP_test/ctgap/workflow/rules/2-scrub.smk, line 1: AttributeError: 'OutputFiles' object has no attribute 'r1tmp', when formatting the following:

    scrubby scrub-reads     -i {input.r1} {input.r2}        -o {params.r1tmp} {params.r2tmp}     --kraken-db {params.db}         --kraken-taxa "Archaea Eukaryota Holozoa Nucletmycea"        --min-len {params.minlen}       --minimap2-index {params.human}      --kraken-threads {threads}      --workdir {params.workdir:q} 2> {log}

    echo -e "

Scrubby Kraken Extract " >> {log}

    scrubby scrub-kraken    -i {output.r1tmp} {output.r2tmp}        -o {output.r1} {output.r2}   --extract       --kraken-taxa {params.kraken_taxa_extract}      --kraken-reads {params.workdir}/0-standardDB.kraken  --kraken-report {params.workdir}/0-standardDB.report         --kraken-threads {threads} 2>> {log}

    touch {output.status}
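
A possible reason the error persists: in the last trace the first command reads `{params.r1tmp}`, but the `scrub-kraken` line still reads `{output.r1tmp}` and `{output.r2tmp}`, which the rule's output does not define. The Python sketch below imitates the placeholder lookup with plain objects to list such unresolved names. `missing_placeholders` is a hypothetical helper, the stand-in `output` and `params` objects and their values are made up for illustration, and the template is a shortened copy of the command above.

```python
from string import Formatter
from types import SimpleNamespace


def missing_placeholders(template, **namespaces):
    """List placeholders such as 'output.r1tmp' that cannot be resolved."""
    missing = []
    for _, field, _, _ in Formatter().parse(template):
        if not field:
            continue  # literal text with no placeholder
        root, _, attr = field.partition(".")
        obj = namespaces.get(root)
        if obj is None or (attr and not hasattr(obj, attr)):
            missing.append(field)
    return missing


# Shortened copy of the shell template from the trace above.
shell_template = (
    "scrubby scrub-reads -i {input.r1} {input.r2} -o {params.r1tmp} {params.r2tmp} "
    "--kraken-threads {threads} --workdir {params.workdir:q} 2> {log}\n"
    "scrubby scrub-kraken -i {output.r1tmp} {output.r2tmp} -o {output.r1} {output.r2} "
    "--kraken-taxa {params.kraken_taxa_extract} 2>> {log}\n"
    "touch {output.status}"
)

# Stand-ins for the rule's namespaces; the values are invented.
input_ = SimpleNamespace(r1="in_1.fq.gz", r2="in_2.fq.gz")
output = SimpleNamespace(r1="out_1.fq.gz", r2="out_2.fq.gz", status="scrub.done")
params = SimpleNamespace(
    r1tmp="tmp_1.fq.gz", r2tmp="tmp_2.fq.gz",
    workdir="work", kraken_taxa_extract="Chlamydiales",
)

unresolved = missing_placeholders(
    shell_template, input=input_, output=output, params=params, threads=4, log="scrub.log"
)
print(unresolved)
```

If that is the cause, either declaring `r1tmp` and `r2tmp` under the rule's `output` or switching the second command to `{params.r1tmp}` and `{params.r2tmp}` should make the names resolve.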

ammaraziz / ctgap

v0.3.0 #2
