marbl / parsnp

Parsnp was designed to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours. Input can be both draft assemblies and finished genomes, and output includes variant (SNP) calls, a core genome phylogeny and multi-alignments. Parsnp leverages contextual information provided by multi-alignments surrounding SNP sites for filtration/cleaning, in addition to existing tools for recombination detection/filtration and phylogenetic reconstruction.

Parsnp - Error: problem reading files from ./genomes (multifasta input) #92

Closed by rmartischang 3 years ago

rmartischang commented 3 years ago

Hi,

I am running parsnp v1.2 on Ubuntu (using WSL2) to align one annotated isolate (.gbk) with 84 strains in multifasta format. The resulting command is:

parsnp -g Genomes_Ecoli_LOEX.FASTA/reference/contigs_MR1_annotated.gbk -m Genomes_Ecoli_LOEX.FASTA/genomes/*.fasta -c

-g because it is an annotated strain
-m because of the multifasta format
-c to force each strain of the folder as an input

I got an error as a result: "problem reading files from ./genomes..". Do you know if there is an error in the command, or could it be due to the version used?

Warning: Cannot determine OS, defaulting to linux
|--Parsnp v1.2--|
For detailed documentation please see --> http://harvest.readthedocs.org/en/latest
SETTINGS:
|-refgenome: Genomes_Ecoli_LOEX.FASTA/reference/contigs_MR1_annotated.gbk.fna
|-aligner: libMUSCLE
|-seqdir: ./genomes
|-outdir: /home/rmartischang/bioinformatic/P_2021_04_08_164000507664
|-OS: linux
|-threads: 32
<>
-->Reading Genome (asm, fasta) files from ./genomes..
ERROR: problem reading files from ./genomes

Many thanks

bkille commented 3 years ago

Hi @rmartischang, thanks for using parsnp!

The wildcard (glob) syntax you are using to pass the input FASTA files is a feature of parsnp v1.5+ and is not available in 1.2. You can instead try running

parsnp -g Genomes_Ecoli_LOEX.FASTA/reference/contigs_MR1_annotated.gbk -m Genomes_Ecoli_LOEX.FASTA/genomes/ -c

if the Genomes_Ecoli_LOEX.FASTA/genomes/ directory only contains fasta files. However, we'd recommend you upgrade to version 1.5 as there are some additional bugfixes and 1.2 is no longer supported.
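One way to check the "only contains fasta files" condition before running the command is a small shell helper. This is a minimal sketch and not part of parsnp: the function name `only_fasta_files` and the list of accepted extensions (`.fasta`, `.fa`, `.fna`) are assumptions made for illustration, and it judges files by name only, without inspecting their contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of parsnp): succeed only if every regular
# file directly inside the given directory has a FASTA-like extension.
only_fasta_files() {
    local dir="$1" f found=0
    [ -d "$dir" ] || return 2          # argument is not a directory
    for f in "$dir"/*; do
        [ -f "$f" ] || continue        # skip subdirectories and an unmatched glob
        found=1
        case "$f" in
            *.fasta|*.fa|*.fna) ;;     # assumed extension list; adjust as needed
            *) echo "non-FASTA file: $f" >&2; return 1 ;;
        esac
    done
    [ "$found" -eq 1 ]                 # an empty directory also counts as a failure
}

# Usage sketch with the paths from this thread (parsnp itself is not run here):
#   only_fasta_files Genomes_Ecoli_LOEX.FASTA/genomes/ && \
#     parsnp -g Genomes_Ecoli_LOEX.FASTA/reference/contigs_MR1_annotated.gbk -m Genomes_Ecoli_LOEX.FASTA/genomes/ -c
```

Hidden files (names starting with a dot) are not matched by `*` and are therefore ignored by this check.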

Thanks again and hope this helps!

-Bryce