Closed oliverdrechsel closed 3 years ago
I think having a multi-FASTA at 2.Genomes/all-consensus-sequences.fasta
that combines all the single FASTAs in that folder into one file should do the job? Of course, this might then still include reconstructed consensus sequences that fail a later QC, but this can be checked in the report.
I personally would object to multi-FASTA files, as they assume that all sequencing data are delivered to the same target folder. They are much harder to split (recreating meaningful names) than single files are to merge.
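To illustrate why fusing is the easy direction, here is a minimal Python sketch that concatenates single-sample FASTA files into one multi-FASTA. The function name `merge_fastas` and the file names in the usage note are hypothetical and not part of the pipeline.

```python
from pathlib import Path


def merge_fastas(fasta_paths, merged_path):
    """Concatenate single-sample FASTA files into one multi-FASTA file.

    Hypothetical helper for illustration only; not part of the pipeline.
    """
    with open(merged_path, "w") as merged:
        # Sort by path so the record order in the merged file is reproducible.
        for path in sorted(Path(p) for p in fasta_paths):
            text = path.read_text()
            # Ensure each file ends with a newline so the next header
            # is never glued onto the previous sequence line.
            if text and not text.endswith("\n"):
                text += "\n"
            merged.write(text)
```

Calling `merge_fastas(["barcode01_consensus.fasta", "barcode02_consensus.fasta"], "all.fasta")` would write both records into `all.fasta`. Going the other way requires parsing headers and inventing file names, which is the asymmetry described above.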
hi @oliverdrechsel
Each sample's consensus FASTA file is located in this folder:
./<outputdirectory>/2.Genomes/<sample_name>/<samplename>_consensus.fasta
<outputdirectory> can be changed via the --output flag. <samplename> is usually "barcode01" etc. if you start from basecalling.
Multi-FASTA files (with QC-passing sequences) are only collected via the optional --rki flag. Maybe I misunderstood the question?
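As a rough sketch of what that layout means for anyone collecting results by hand, the following Python snippet lists every per-sample consensus FASTA under the output directory. The helper name `find_consensus_fastas` is hypothetical; the only things taken from the description above are the `2.Genomes` folder and the `_consensus.fasta` suffix.

```python
from pathlib import Path


def find_consensus_fastas(output_dir):
    """Map each sample folder name to its consensus FASTA path.

    Assumes the layout <output_dir>/2.Genomes/<sample>/<sample>_consensus.fasta
    described in the thread; the helper itself is illustrative only.
    """
    genomes = Path(output_dir) / "2.Genomes"
    return {
        fasta.parent.name: fasta
        for fasta in sorted(genomes.glob("*/*_consensus.fasta"))
    }
```

The returned dictionary is keyed by sample folder name (e.g. "barcode01"), so one loop is needed per run to reach every file.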
Ah, okay, now I get what you mean, @oliverdrechsel. You just want to have all the single FASTAs (one per sample) in one single output folder, right? Instead of sub-folders as described by @replikation:
./<outputdirectory>/2.Genomes/<sample_name>/<samplename>_consensus.fasta
so something like:
./<outputdirectory>/2.Genomes/all/<samplename>_consensus.fasta
?
It's a minor thing, but maybe we can simply also publish the FASTAs to
./<outputdirectory>/2.Genomes/all_consensus/<samplename>_consensus.fasta
Thus, we would have the per-sample folder structure to check for details (VCF, BAM, FASTA, PDF, ...) and another folder that just has all the FASTAs.
Or do you want an additional output folder for all the consensus sequences that is outside of the
./<outputdirectory>/
structure? This would need an additional parameter, e.g.
--output_consensus /some/other/path/to/write/all/consensus/fasta
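Until something like this exists in the pipeline, the in-tree variant (one flat folder next to the per-sample folders) could be approximated as a post-processing step. The Python sketch below is only an illustration under that assumption: `collect_consensus` and its `target_name` and `link` arguments are hypothetical names, not pipeline options.

```python
import shutil
from pathlib import Path


def collect_consensus(output_dir, target_name="all_consensus", link=False):
    """Copy (or symlink) every per-sample consensus FASTA into one folder.

    Hypothetical post-processing helper; the pipeline itself is not involved.
    Returns the list of paths written into the flat folder.
    """
    genomes = Path(output_dir) / "2.Genomes"
    target = genomes / target_name
    target.mkdir(parents=True, exist_ok=True)

    collected = []
    for fasta in sorted(genomes.glob("*/*_consensus.fasta")):
        if fasta.parent == target:
            continue  # skip files collected by a previous run
        destination = target / fasta.name
        if link:
            # Replace a stale link or file so reruns stay idempotent.
            if destination.is_symlink() or destination.exists():
                destination.unlink()
            destination.symlink_to(fasta.resolve())
        else:
            shutil.copy2(fasta, destination)
        collected.append(destination)
    return collected
```

The per-sample folders stay untouched, so the VCF, BAM, and PDF files remain where they are, and the flat folder holds only the FASTAs. Copying is the default here because symlinks break when the folder is moved to another machine.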
Hi @hoelzer
something like ./<outputdirectory>/2.Genomes/all/<samplename>_consensus.fasta
would be fine, I think.
It would be easier to distribute the data elsewhere if one only has to visit one folder rather than iterate through various folders to get all output data.
Okay, I think this should be doable with, e.g., an optional --collect <outputdirectory> flag.
Hi,
would it be possible to have the final consensus sequences put/linked into an output folder? Something like 'consensus_sequences'? This would facilitate copying out all consensus sequences for further use outside of the pipeline.