ammaraziz / ctgap

Chlamydia trachomatis Genome Assembly Pipeline

create plurality consensus from 24 reference genomes #12

Closed ammaraziz closed 6 months ago

ammaraziz commented 6 months ago

To create the plurality consensus:

  1. Reference genomes were reoriented with dnaapler:
    for f in *.fasta; do echo "$f"; dnaapler all -i "$f" -o "plurality/${f/.fasta/.reorient}" -t 6; done
  2. The dnaapler output was aligned all-to-all with mugsy, using default settings:
    mugsy -p mugsyout *.fasta
  3. Each aligned region (regions are separated by = lines) was extracted manually
  4. goalign consensus was run on each region:
    goalign consensus --ignore-gaps -i input.fasta -o output.cons.fa
  5. All cons.fa sequences were concatenated and renamed:
    cat *.cons.fa | seqkit replace -p "$" -r "_{nr}" > plurality_all.fasta
  6. All plurality consensus sequences were scaffolded against the Ct genotype D reference using ragtag:
    ragtag.py scaffold -u scaffoldReference.fasta plurality_all.fasta -o plurality_1_scaffold
  7. The scaffolded output was renamed again:
    seqkit replace -p "(.*)" -r "plurality_{nr}" plurality_1_scaffold/ragtag.scaffold.fasta > plurality_final.fasta
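
Step 3 was done by hand. If it ever needs repeating, it could probably be scripted along these lines. This is only a sketch, assuming the alignment has already been written out as FASTA blocks with a line starting with `=` between regions. The function name `split_regions`, the `region_NNN.fasta` output names and the input file name in the usage comment are all made up for illustration and are not part of the original workflow.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Split a block-wise alignment into one FASTA file per region.
# Assumption: regions are separated by lines that start with "=".
# Output files are named region_001.fasta, region_002.fasta, ... (hypothetical names).
split_regions() {
  local input="$1" outdir="$2"
  mkdir -p "$outdir"
  awk -v outdir="$outdir" '
    # Separator line: close the current region file, if one is open.
    /^=/ { if (out != "") close(out); out = ""; next }
    # Non-empty line: open the next region file on first use, then write the line.
    NF {
      if (out == "") { n++; out = sprintf("%s/region_%03d.fasta", outdir, n) }
      print > out
    }
  ' "$input"
}

# Example call (input file name is hypothetical):
#   split_regions mugsyout.blocks.fasta regions
```

Each resulting region file could then be passed to the goalign consensus command from step 4. Empty regions and repeated separator lines are skipped, so the numbering stays contiguous.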

plurality_final.txt

ammaraziz commented 6 months ago

all.reoriented.txt