sanger-pathogens / Roary

Rapid large-scale prokaryote pan genome analysis
http://sanger-pathogens.github.io/Roary

BLAST Database error #321

Closed aweimann closed 7 years ago

aweimann commented 7 years ago

Hi Andrew,

I'm running Roary on three genomes that I annotated with Prokka. The BLAST step apparently fails, although the program still finishes, and the combined genes from all three genomes end up as the core genome. Do you have any idea what's going on there?

Cheers, Aaron

roary -f roary_out -e -n -v prokkaout/gff

2017/04/03 15:14:46 Output directory name exists already so adding a timestamp to the end
2017/04/03 15:14:46 Output directory created: roary_out_1491225286
2017/04/03 15:14:46 Fixing input GFF files
2017/04/03 15:14:49 Extracting proteins from GFF files
Extracting proteins from /home/aaron/prokka_out/PROKKA_04032017.gff
Extracting proteins from /home/aaron/prokka_PA14_out/PROKKA_04032017.gff
Extracting proteins from /home/aaron/prokka_PAO_out/PROKKA_04032017.gff
Combine proteins into a single file
Iteratively run cd-hit
Parallel all against all blast
BLAST Database error: No alias or index file found for protein database [/home/aaron/roary_out_1491225286/J3RyMjOrGu/output_contigs] in search path [/home/aaron/roary_out_1491225286::]
Cluster with MCL
2017/04/03 15:15:06 Running command: pan_genome_post_analysis -o clustered_proteins -p pan_genome.fa -s gene_presence_absence.csv -c _clustered.clstr --output_multifasta_files -i /home/aaron/roary_out_1491225286/vhFjmCR9Ey//_gff_files -f /home/aaron/roary_out_1491225286/vhFjmCR9Ey//_fasta_files -t 11 --dont_create_rplots -v --mafft -j Local --processors 1 --group_limit 50000 -cd 99
Use of uninitialized value in require at /usr/lib/perl/5.18/Encode.pm line 60.
2017/04/03 15:15:06 Reinflate clusters
2017/04/03 15:15:06 Split groups with paralogs
2017/04/03 15:15:07 Labelling the groups
2017/04/03 15:15:07 Transfering the annotation to the groups
2017/04/03 15:15:11 Creating accessory binary gene presence and absence fasta
2017/04/03 15:15:11 Creating accessory binary gene presence and absence tree
2017/04/03 15:15:11 The input file is too small so not creating a tree
2017/04/03 15:15:11 Creating accessory gene presence and absence clusters
2017/04/03 15:15:11 Theres no accessory binary file so skipping accessory binary clustering
2017/04/03 15:15:11 Creating the spreadsheet with gene presence and absence
2017/04/03 15:15:19 Creating summary statistics of the spreadsheet
2017/04/03 15:15:24 Creating tab files for R
2017/04/03 15:15:25 Create EMBL files
2017/04/03 15:15:26 Creating files with the nucleotide sequences for every cluster
2017/04/03 15:15:34 Cleaning up files
Aligning each cluster
Use of uninitialized value in require at (eval 2091) line 1.
2017/04/03 15:15:34 Running command: pan_genome_core_alignment -cd 99
2017/04/03 15:15:34 pan_genome_core_alignment -cd 99

--------------------- WARNING ---------------------
MSG: Got a sequence without letters. Could not guess alphabet

--------------------- WARNING ---------------------
MSG: Got a sequence without letters. Could not guess alphabet

--------------------- WARNING ---------------------
MSG: Got a sequence without letters. Could not guess alphabet

Output of roary -a is

2017/04/03 15:07:41 Looking for 'Rscript' - found /usr/bin/Rscript
2017/04/03 15:07:41 Determined Rscript version is 3.0
2017/04/03 15:07:41 Looking for 'awk' - found /usr/bin/awk
2017/04/03 15:07:41 Looking for 'bedtools' - found /usr/bin/bedtools
2017/04/03 15:07:41 Determined bedtools version is 2.17
2017/04/03 15:07:41 Looking for 'blastp' - found /usr/bin/blastp
2017/04/03 15:07:41 Determined blastp version is 2.2.28
2017/04/03 15:07:41 Looking for 'grep' - found /bin/grep
2017/04/03 15:07:41 Optional tool 'kraken' not found in your $PATH
2017/04/03 15:07:41 Optional tool 'kraken-report' not found in your $PATH
2017/04/03 15:07:41 Looking for 'mafft' - found /usr/bin/mafft
Use of uninitialized value in concatenation (.) or string at /usr/local/share/perl/5.18.2/Bio/Roary/External/CheckTools.pm line 129.
2017/04/03 15:07:42 Determined mafft version is
2017/04/03 15:07:42 Looking for 'makeblastdb' - found /usr/bin/makeblastdb
2017/04/03 15:07:42 Determined makeblastdb version is 2.2.28
2017/04/03 15:07:42 Looking for 'mcl' - found /usr/bin/mcl
2017/04/03 15:07:42 Determined mcl version is 12-135
2017/04/03 15:07:42 Looking for 'parallel' - found /usr/bin/parallel
2017/04/03 15:07:42 Determined parallel version is 20130922
2017/04/03 15:07:42 Looking for 'prank' - found /usr/bin/prank
2017/04/03 15:07:42 Looking for 'sed' - found /bin/sed
2017/04/03 15:07:42 Looking for 'cdhit' - found /usr/bin/cdhit
2017/04/03 15:07:42 Determined cdhit version is 4.6
2017/04/03 15:07:42 Looking for 'fasttree' - found /usr/bin/fasttree
2017/04/03 15:07:42 Determined fasttree version is 2.1
2017/04/03 15:07:42 Roary version 3.8.0
2017/04/03 15:07:42 Error: You need to provide at least 2 files to build a pan genome

aweimann commented 7 years ago

It appears to work now. Maybe the path to the BLAST database wasn't set properly before.
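
For anyone who runs into the same "No alias or index file found for protein database" message: BLAST+ prints it when it cannot find either an index file (`.pin`) or an alias file (`.pal`) for the protein database prefix it was given. The thread never established the actual cause here, so the following is only a minimal sketch for checking whether those files exist. The function name `blast_protein_db_exists` is hypothetical and is not part of Roary or BLAST+.

```shell
# Hypothetical helper (not part of Roary or BLAST+).
# Succeeds if a protein BLAST database with the given path prefix
# appears to exist on disk, fails otherwise.
blast_protein_db_exists() {
    prefix=$1
    # A single-volume protein database has <prefix>.pin;
    # a multi-volume one is referenced through <prefix>.pal.
    [ -f "${prefix}.pin" ] || [ -f "${prefix}.pal" ]
}
```

Called as `blast_protein_db_exists /path/to/output_contigs`, it returns success only when one of the two files is present. If it fails, checking whether `makeblastdb` ran and where it wrote its output would be a reasonable next step.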