Arcadia-Science / metagenomics

A Nextflow workflow for QC, evaluation, and profiling of metagenomic samples using short- and long-read technologies
MIT License

Rename contigs after assembly #8

Closed elizabethmcd closed 1 year ago

elizabethmcd commented 1 year ago

This will become a bug down the road if the contig header names (the text after >) are not renamed to simplified names such as contig_00001 and instead keep the names that SPAdes gives them.

elizabethmcd commented 1 year ago

This only needs to be done for metaSPAdes, since Flye outputs sensible contig names. The exception is if we want to use a name like assemblyname_contig_01, or pad the number with enough zeros that the contigs are actually listed in order.

elizabethmcd commented 1 year ago

For each assembly rename the contig with this structure:

For example: comm_1_metaspades_contig000001, so that all relevant information is propagated into the contig names of the assembly.
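
A minimal sketch of how such a renaming step could look, in Python. This is not the implementation from the workflow itself. The function names, the `sample` and `assembler` parameters, and the six-digit padding are assumptions read off the example name above; the `NODE_..._length_..._cov_...` headers in the usage note are typical SPAdes output.

```python
from typing import Iterable, Iterator


def rename_contigs(
    lines: Iterable[str], sample: str, assembler: str, width: int = 6
) -> Iterator[str]:
    """Yield FASTA lines, replacing each header with a simplified name.

    Assumed naming layout (inferred from the example in this thread):
    <sample>_<assembler>_contig<zero-padded running number>.
    """
    count = 0
    for line in lines:
        if line.startswith(">"):
            # A header line starts a new contig: number it in input order.
            count += 1
            yield f">{sample}_{assembler}_contig{count:0{width}d}\n"
        else:
            # Sequence lines pass through unchanged.
            yield line


def rename_fasta(in_path: str, out_path: str, sample: str, assembler: str) -> None:
    """Hypothetical file-level wrapper: stream one FASTA file into another."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(rename_contigs(src, sample, assembler))
```

Called as `rename_contigs(fasta_lines, "comm_1", "metaspades")`, the first header (for example `>NODE_1_length_1000_cov_12.5`) would become `>comm_1_metaspades_contig000001`. The zero padding keeps the names in numeric order when sorted as text, as long as an assembly has fewer than one million contigs at the default width.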

elizabethmcd commented 1 year ago

Addressed in #45.