Closed hongsamL closed 6 years ago
It's likely an issue with your input dataset. Have you QC'd your input samples to ensure that they are what you think they are, and removed contaminants? Have you QC'd the assemblies to exclude highly fragmented genomes, very large or very small assemblies, assemblies with too many or too few proteins, etc.? There's a lot of garbage out there. I've run it on a few thousand S. Typhi without any issue.
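As a rough illustration of the assembly QC suggested above, here is a minimal Python sketch that summarises each GFF file (contig count, total length, CDS count) and flags outliers relative to the dataset median. It assumes Prokka-style GFF3 files with `##sequence-region` headers and a trailing `##FASTA` section; the function names, the `gff` directory, and all thresholds are illustrative assumptions, not values recommended by Roary or by anyone in this thread.

```python
import statistics
from pathlib import Path


def summarize_gff(path):
    """Return (contigs, total_bp, cds_count) for one Prokka-style GFF3 file."""
    contigs = total_bp = cds = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # annotation section is over; the rest is sequence
            if line.startswith("##sequence-region"):
                # Header format: ##sequence-region <name> <start> <end>
                fields = line.split()
                contigs += 1
                total_bp += int(fields[3])
            elif not line.startswith("#"):
                cols = line.split("\t")
                if len(cols) > 2 and cols[2] == "CDS":
                    cds += 1
    return contigs, total_bp, cds


def flag_outliers(stats, max_contigs=500, size_tol=0.2, cds_tol=0.2):
    """Map sample name -> list of reasons it looks suspicious.

    All thresholds are hypothetical defaults; tune them for your organism.
    """
    median_bp = statistics.median(s[1] for s in stats.values())
    median_cds = statistics.median(s[2] for s in stats.values())
    flagged = {}
    for name, (contigs, bp, cds) in stats.items():
        reasons = []
        if contigs > max_contigs:
            reasons.append("fragmented")
        if abs(bp - median_bp) > size_tol * median_bp:
            reasons.append("unusual size")
        if abs(cds - median_cds) > cds_tol * median_cds:
            reasons.append("unusual CDS count")
        if reasons:
            flagged[name] = reasons
    return flagged


if __name__ == "__main__":
    # "gff" is an assumed directory name, matching the command in this thread.
    stats = {p.name: summarize_gff(p) for p in sorted(Path("gff").glob("*.gff"))}
    if stats:
        for name, reasons in flag_outliers(stats).items():
            print(name, ", ".join(reasons), sep="\t")
```

This only catches gross outliers; it does not detect contamination or mislabelled species, which need a separate check.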
On 19 January 2018 at 12:09, S. Leandro H notifications@github.com wrote:
Hello, I am trying to run Roary to get a core genome alignment of ~2500 Salmonella genomes annotated with Prokka, and I am unable to get the final outputs even though the cluster (PBS) reports successful job completion. The command I am using is: roary -e -n -v -p 16 -f final2 gff/*.gff
I have run the same command on a test set of 11 genomes and obtained all output files, but when I use all the .gff files, the analysis stops after outputting:
6eK4xz49Hx accessory_binary_genes.fa accessory_binary_genes.fa.newick _accessory_clusters _accessory_clusters.clstr accessory_graph.dot blast_identity_frequency.Rtab _clustered _clustered.clstr clustered_proteins _combined_files _combined_files.groups core_accessory_graph.dot fxKSZdO5aQ gene_presence_absence.csv gene_presence_absence.Rtab _inflated_mcl_groups _inflated_unsplit_mcl_groups _labeled_mcl_groups _uninflated_mcl_groups
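Because the scheduler reports success even when the run is incomplete, it can help to check the output directory explicitly. The sketch below is a minimal example of that check in Python. The list of expected file names is an assumption about what a finished `roary -e -n` run leaves behind and may differ between versions, and `missing_outputs` is a hypothetical helper, not part of Roary.

```python
from pathlib import Path

# Assumed names of files a completed run produces; adjust for your Roary version.
EXPECTED_OUTPUTS = (
    "gene_presence_absence.csv",
    "summary_statistics.txt",
    "pan_genome_reference.fa",
    "core_gene_alignment.aln",
)


def missing_outputs(output_dir, expected=EXPECTED_OUTPUTS):
    """Return the expected files that are absent or empty in output_dir."""
    out = Path(output_dir)
    missing = []
    for name in expected:
        target = out / name
        if not target.is_file() or target.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "final2" is the output directory used in the command above.
    for name in missing_outputs("final2"):
        print("missing or empty:", name)
```

Running this at the end of the PBS script makes an incomplete run visible without reading the logs by hand.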
When I check the log files, it looks like the analysis stopped logging anything after running FastTree:
2018/01/18 14:31:10 Output directory created: final2
2018/01/18 14:31:10 Fixing input GFF files
2018/01/18 15:29:21 Extracting proteins from GFF files
2018/01/18 18:56:04 Running command: pan_genome_post_analysis -o clustered_proteins -p pan_genome.fa -s gene_presence_absence.csv -c _clustered.clstr --output_multifasta_files -i /panfs/roc/groups/2/alvarez0/slhong/TEST_ROARY_BARD/FINAL/final2/zF44nvB4fd//_gff_files -f /panfs/roc/groups/2/alvarez0/slhong/TEST_ROARY_BARD/FINAL/final2/zF44nvB4fd//_fasta_files -t 11 --dont_create_rplots -v --mafft -j Local --processors 16 --group_limit 50000 -cd 99
2018/01/18 18:56:07 Reinflate clusters
2018/01/18 18:56:35 Split groups with paralogs
2018/01/18 19:20:36 Labelling the groups
2018/01/18 19:20:38 Transfering the annotation to the groups
2018/01/18 20:29:46 Creating accessory binary gene presence and absence fasta
2018/01/18 23:18:39 Creating accessory binary gene presence and absence tree
2018/01/18 23:18:39 Running command: /home/alvarez0/slhong/miniconda3/envs/roary/bin/FastTree -fastest -nt accessory_binary_genes.fa >
accessory_binary_genes.fa.newick FastTree Version 2.1.9 Double precision (No SSE3) Alignment: accessory_binary_genes.fa Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000 Search: Fastest+2nd +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1 TopHits: 1.00*sqrtN close=default refresh=0.50 ML Model: Jukes-Cantor, CAT approximation with 20 rate categories 0.33 seconds: Top hits for 520 of 2499 seqs (at seed 100) 0.47 seconds: Top hits for 819 of 2499 seqs (at seed 300) 0.64 seconds: Top hits for 1157 of 2499 seqs (at seed 600) 0.75 seconds: Top hits for 1264 of 2499 seqs (at seed 900) 0.89 seconds: Top hits for 1434 of 2499 seqs (at seed 1100) 1.04 seconds: Top hits for 1827 of 2499 seqs (at seed 1300) 1.19 seconds: Top hits for 1933 of 2499 seqs (at seed 1500) 1.30 seconds: Top hits for 2034 of 2499 seqs (at seed 1700) 1.40 seconds: Top hits for 2203 of 2499 seqs (at seed 1800) 1.53 seconds: Top hits for 2406 of 2499 seqs (at seed 2000) 1.63 seconds: Top hits for 2482 of 2499 seqs (at seed 2300) 1.93 seconds: Joined 100 of 2496 2.28 seconds: Joined 200 of 2496 2.70 seconds: Joined 300 of 2496 3.13 seconds: Joined 400 of 2496 3.56 seconds: Joined 500 of 2496 3.92 seconds: Joined 600 of 2496 4.20 seconds: Joined 700 of 2496 4.59 seconds: Joined 800 of 2496 4.87 seconds: Joined 900 of 2496 5.26 seconds: Joined 1000 of 2496 5.57 seconds: Joined 1100 of 2496 5.80 seconds: Joined 1200 of 2496 6.15 seconds: Joined 1300 of 2496 6.33 seconds: Joined 1400 of 2496 6.63 seconds: Joined 1500 of 2496 6.98 seconds: Joined 1600 of 2496 7.25 seconds: Joined 1700 of 2496 7.59 seconds: Joined 1800 of 2496 7.77 seconds: Joined 1900 of 2496 8.06 seconds: Joined 2000 of 2496 8.26 seconds: Joined 2100 of 2496 8.41 seconds: Joined 2200 of 2496 8.62 seconds: Joined 2300 of 2496 8.77 seconds: Joined 2400 of 2496 Initial topology in 8.96 seconds Refining topology: 45 rounds ME-NNIs, 2 rounds ME-SPRs, 23 rounds ML-NNIs 8.95 seconds: ME NNI round 1 of 45, 1 of 2497 splits 
9.06 seconds: ME NNI round 1 of 45, 601 of 2497 splits, 152 changes (max delta 0.036) 9.17 seconds: ME NNI round 1 of 45, 1201 of 2497 splits, 331 changes (max delta 0.036) 9.28 seconds: ME NNI round 1 of 45, 1801 of 2497 splits, 482 changes (max delta 0.049) 9.38 seconds: ME NNI round 1 of 45, 2401 of 2497 splits, 637 changes (max delta 0.049) 9.49 seconds: ME NNI round 2 of 45, 501 of 2497 splits, 74 changes (max delta 0.017) 9.59 seconds: ME NNI round 2 of 45, 1101 of 2497 splits, 183 changes (max delta 0.048) 9.71 seconds: ME NNI round 2 of 45, 1701 of 2497 splits, 296 changes (max delta 0.048) 9.81 seconds: ME NNI round 2 of 45, 2301 of 2497 splits, 396 changes (max delta 0.048) 9.91 seconds: ME NNI round 3 of 45, 401 of 2497 splits, 40 changes (max delta 0.020) 10.02 seconds: ME NNI round 3 of 45, 1001 of 2497 splits, 126 changes (max delta 0.023) 10.13 seconds: ME NNI round 3 of 45, 1601 of 2497 splits, 201 changes (max delta 0.023) 10.25 seconds: ME NNI round 4 of 45, 201 of 2497 splits, 19 changes (max delta 0.004) 10.35 seconds: ME NNI round 4 of 45, 801 of 2497 splits, 84 changes (max delta 0.026) 10.46 seconds: ME NNI round 4 of 45, 1401 of 2497 splits, 134 changes (max delta 0.026) 10.57 seconds: ME NNI round 5 of 45, 401 of 2497 splits, 25 changes (max delta 0.007) 10.68 seconds: ME NNI round 5 of 45, 1001 of 2497 splits, 74 changes (max delta 0.016) 10.78 seconds: ME NNI round 6 of 45, 301 of 2497 splits, 17 changes (max delta 0.004) 10.89 seconds: ME NNI round 6 of 45, 901 of 2497 splits, 54 changes (max delta 0.017) 11.00 seconds: ME NNI round 8 of 45, 1 of 2497 splits 11.12 seconds: ME NNI round 9 of 45, 201 of 2497 splits, 10 changes (max delta 0.007) 11.23 seconds: ME NNI round 13 of 45, 1 of 2497 splits 11.57 seconds: SPR round 1 of 2, 101 of 4996 nodes 11.88 seconds: SPR round 1 of 2, 201 of 4996 nodes 12.23 seconds: SPR round 1 of 2, 301 of 4996 nodes 12.54 seconds: SPR 
round 1 of 2, 401 of 4996 nodes 12.86 seconds: SPR round 1 of 2, 501 of 4996 nodes 13.17 seconds: SPR round 1 of 2, 601 of 4996 nodes 13.48 seconds: SPR round 1 of 2, 701 of 4996 nodes 13.81 seconds: SPR round 1 of 2, 801 of 4996 nodes 14.14 seconds: SPR round 1 of 2, 901 of 4996 nodes 14.42 seconds: SPR round 1 of 2, 1001 of 4996 nodes 14.75 seconds: SPR round 1 of 2, 1101 of 4996 nodes 15.06 seconds: SPR round 1 of 2, 1201 of 4996 nodes 15.36 seconds: SPR round 1 of 2, 1301 of 4996 nodes 15.72 seconds: SPR round 1 of 2, 1401 of 4996 nodes 16.11 seconds: SPR round 1 of 2, 1501 of 4996 nodes 16.41 seconds: SPR round 1 of 2, 1601 of 4996 nodes 16.74 seconds: SPR round 1 of 2, 1701 of 4996 nodes 17.12 seconds: SPR round 1 of 2, 1801 of 4996 nodes 17.43 seconds: SPR round 1 of 2, 1901 of 4996 nodes 17.75 seconds: SPR round 1 of 2, 2001 of 4996 nodes 18.10 seconds: SPR round 1 of 2, 2101 of 4996 nodes 18.45 seconds: SPR round 1 of 2, 2201 of 4996 nodes 18.79 seconds: SPR round 1 of 2, 2301 of 4996 nodes 19.11 seconds: SPR round 1 of 2, 2401 of 4996 nodes 19.46 seconds: SPR round 1 of 2, 2501 of 4996 nodes 19.80 seconds: SPR round 1 of 2, 2601 of 4996 nodes 20.11 seconds: SPR round 1 of 2, 2701 of 4996 nodes 20.46 seconds: SPR round 1 of 2, 2801 of 4996 nodes 20.76 seconds: SPR round 1 of 2, 2901 of 4996 nodes 21.06 seconds: SPR round 1 of 2, 3001 of 4996 nodes 21.36 seconds: SPR round 1 of 2, 3101 of 4996 nodes 21.66 seconds: SPR round 1 of 2, 3201 of 4996 nodes 21.98 seconds: SPR round 1 of 2, 3301 of 4996 nodes 22.34 seconds: SPR round 1 of 2, 3401 of 4996 nodes 22.69 seconds: SPR round 1 of 2, 3501 of 4996 nodes 23.02 seconds: SPR round 1 of 2, 3601 of 4996 nodes 23.37 seconds: SPR round 1 of 2, 3701 of 4996 nodes 23.71 seconds: SPR round 1 of 2, 3801 of 4996 nodes 24.00 seconds: SPR round 1 of 2, 3901 of 4996 nodes 24.29 seconds: SPR round 1 of 2, 4001 of 4996 nodes 24.63 seconds: SPR round 1 of 2, 4101 of 4996 nodes 24.92 seconds: SPR round 1 of 2, 4201 of 4996 
nodes 25.30 seconds: SPR round 1 of 2, 4301 of 4996 nodes 25.60 seconds: SPR round 1 of 2, 4401 of 4996 nodes 25.91 seconds: SPR round 1 of 2, 4501 of 4996 nodes 26.19 seconds: SPR round 1 of 2, 4601 of 4996 nodes 26.55 seconds: SPR round 1 of 2, 4701 of 4996 nodes 26.94 seconds: SPR round 1 of 2, 4801 of 4996 nodes 27.29 seconds: SPR round 1 of 2, 4901 of 4996 nodes 27.59 seconds: ME NNI round 16 of 45, 1 of 2497 splits 27.70 seconds: ME NNI round 16 of 45, 701 of 2497 splits, 19 changes (max delta 0.005) 27.81 seconds: ME NNI round 16 of 45, 1301 of 2497 splits, 42 changes (max delta 0.069) 27.93 seconds: ME NNI round 16 of 45, 2001 of 2497 splits, 71 changes (max delta 0.069) 28.04 seconds: ME NNI round 17 of 45, 201 of 2497 splits, 2 changes (max delta 0.002) 28.15 seconds: ME NNI round 17 of 45, 901 of 2497 splits, 16 changes (max delta 0.012) 28.26 seconds: ME NNI round 17 of 45, 1501 of 2497 splits, 28 changes (max delta 0.022) 28.37 seconds: ME NNI round 17 of 45, 2201 of 2497 splits, 54 changes (max delta 0.022) 28.48 seconds: ME NNI round 18 of 45, 301 of 2497 splits, 6 changes (max delta 0.004) 28.59 seconds: ME NNI round 19 of 45, 201 of 2497 splits, 3 changes (max delta 0.001) 28.70 seconds: ME NNI round 21 of 45, 101 of 2497 splits, 0 changes 29.00 seconds: SPR round 2 of 2, 101 of 4996 nodes 29.31 seconds: SPR round 2 of 2, 201 of 4996 nodes 29.61 seconds: SPR round 2 of 2, 301 of 4996 nodes 29.91 seconds: SPR round 2 of 2, 401 of 4996 nodes 30.21 seconds: SPR round 2 of 2, 501 of 4996 nodes 30.54 seconds: SPR round 2 of 2, 601 of 4996 nodes 30.86 seconds: SPR round 2 of 2, 701 of 4996 nodes 31.13 seconds: SPR round 2 of 2, 801 of 4996 nodes 31.44 seconds: SPR round 2 of 2, 901 of 4996 nodes 31.80 seconds: SPR round 2 of 2, 1001 of 4996 nodes 32.10 seconds: SPR round 2 of 2, 1101 of 4996 nodes 32.39 seconds: SPR round 2 of 2, 1201 of 4996 nodes 32.71 seconds: SPR round 2 of 2, 1301 of 4996 nodes 33.01 seconds: SPR round 2 of 2, 1401 of 4996 nodes 
33.36 seconds: SPR round 2 of 2, 1501 of 4996 nodes 33.64 seconds: SPR round 2 of 2, 1601 of 4996 nodes 34.02 seconds: SPR round 2 of 2, 1701 of 4996 nodes 34.33 seconds: SPR round 2 of 2, 1801 of 4996 nodes 34.68 seconds: SPR round 2 of 2, 1901 of 4996 nodes 35.01 seconds: SPR round 2 of 2, 2001 of 4996 nodes 35.31 seconds: SPR round 2 of 2, 2101 of 4996 nodes 35.59 seconds: SPR round 2 of 2, 2201 of 4996 nodes 35.92 seconds: SPR round 2 of 2, 2301 of 4996 nodes 36.25 seconds: SPR round 2 of 2, 2401 of 4996 nodes 36.63 seconds: SPR round 2 of 2, 2501 of 4996 nodes 36.93 seconds: SPR round 2 of 2, 2601 of 4996 nodes 37.28 seconds: SPR round 2 of 2, 2701 of 4996 nodes 37.61 seconds: SPR round 2 of 2, 2801 of 4996 nodes 37.90 seconds: SPR round 2 of 2, 2901 of 4996 nodes 38.21 seconds: SPR round 2 of 2, 3001 of 4996 nodes 38.49 seconds: SPR round 2 of 2, 3101 of 4996 nodes 38.78 seconds: SPR round 2 of 2, 3201 of 4996 nodes 39.09 seconds: SPR round 2 of 2, 3301 of 4996 nodes 39.37 seconds: SPR round 2 of 2, 3401 of 4996 nodes 39.66 seconds: SPR round 2 of 2, 3501 of 4996 nodes 40.03 seconds: SPR round 2 of 2, 3601 of 4996 nodes 40.30 seconds: SPR round 2 of 2, 3701 of 4996 nodes 40.65 seconds: SPR round 2 of 2, 3801 of 4996 nodes 41.03 seconds: SPR round 2 of 2, 3901 of 4996 nodes 41.32 seconds: SPR round 2 of 2, 4001 of 4996 nodes 41.62 seconds: SPR round 2 of 2, 4101 of 4996 nodes 41.90 seconds: SPR round 2 of 2, 4201 of 4996 nodes 42.19 seconds: SPR round 2 of 2, 4301 of 4996 nodes 42.47 seconds: SPR round 2 of 2, 4401 of 4996 nodes 42.77 seconds: SPR round 2 of 2, 4501 of 4996 nodes 43.07 seconds: SPR round 2 of 2, 4601 of 4996 nodes 43.40 seconds: SPR round 2 of 2, 4701 of 4996 nodes 43.73 seconds: SPR round 2 of 2, 4801 of 4996 nodes 44.14 seconds: SPR round 2 of 2, 4901 of 4996 nodes 44.45 seconds: ME NNI round 31 of 45, 1 of 2497 splits 44.56 seconds: ME NNI round 31 of 45, 701 of 2497 splits, 3 changes (max delta 0.001) 44.66 seconds: ME NNI round 31 of 45, 
1301 of 2497 splits, 8 changes (max delta 0.001) 44.78 seconds: ME NNI round 31 of 45, 2001 of 2497 splits, 11 changes (max delta 0.001) 44.89 seconds: ME NNI round 32 of 45, 201 of 2497 splits, 0 changes 45.00 seconds: ME NNI round 32 of 45, 901 of 2497 splits, 3 changes (max delta 0.001) 45.11 seconds: ME NNI round 32 of 45, 1501 of 2497 splits, 6 changes (max delta 0.003) 45.22 seconds: ME NNI round 32 of 45, 2201 of 2497 splits, 12 changes (max delta 0.003) 45.33 seconds: ME NNI round 34 of 45, 101 of 2497 splits, 6 changes (max delta 0.001) Total branch-length 59.427 after 45.84 sec 45.92 seconds: ML Lengths 1 of 2497 splits 46.06 seconds: ML Lengths 101 of 2497 splits 46.19 seconds: ML Lengths 201 of 2497 splits 46.33 seconds: ML Lengths 301 of 2497 splits 46.47 seconds: ML Lengths 401 of 2497 splits 46.61 seconds: ML Lengths 501 of 2497 splits 46.74 seconds: ML Lengths 601 of 2497 splits 46.87 seconds: ML Lengths 701 of 2497 splits 47.01 seconds: ML Lengths 801 of 2497 splits 47.14 seconds: ML Lengths 901 of 2497 splits 47.28 seconds: ML Lengths 1001 of 2497 splits 47.43 seconds: ML Lengths 1101 of 2497 splits 47.57 seconds: ML Lengths 1201 of 2497 splits 47.71 seconds: ML Lengths 1301 of 2497 splits 47.85 seconds: ML Lengths 1401 of 2497 splits 47.99 seconds: ML Lengths 1501 of 2497 splits 48.13 seconds: ML Lengths 1601 of 2497 splits 48.27 seconds: ML Lengths 1701 of 2497 splits 48.41 seconds: ML Lengths 1801 of 2497 splits 48.55 seconds: ML Lengths 1901 of 2497 splits 48.69 seconds: ML Lengths 2001 of 2497 splits 48.82 seconds: ML Lengths 2101 of 2497 splits 48.96 seconds: ML Lengths 2201 of 2497 splits 49.10 seconds: ML Lengths 2301 of 2497 splits 49.24 seconds: ML Lengths 2401 of 2497 splits 49.37 seconds: ML NNI round 1 of 23, 1 of 2497 splits 49.88 seconds: ML NNI round 1 of 23, 101 of 2497 splits, 21 changes (max delta 57.168) 50.37 seconds: ML NNI round 1 of 23, 201 of 2497 splits, 42 changes (max delta 57.168) 50.87 seconds: ML NNI round 1 of 23, 
301 of 2497 splits, 63 changes (max delta 57.168) 51.38 seconds: ML NNI round 1 of 23, 401 of 2497 splits, 85 changes (max delta 57.168) 51.88 seconds: ML NNI round 1 of 23, 501 of 2497 splits, 105 changes (max delta 57.168) 52.34 seconds: ML NNI round 1 of 23, 601 of 2497 splits, 121 changes (max delta 57.168) 52.82 seconds: ML NNI round 1 of 23, 701 of 2497 splits, 141 changes (max delta 57.168) 53.32 seconds: ML NNI round 1 of 23, 801 of 2497 splits, 155 changes (max delta 57.168) 53.81 seconds: ML NNI round 1 of 23, 901 of 2497 splits, 169 changes (max delta 57.168) 54.33 seconds: ML NNI round 1 of 23, 1001 of 2497 splits, 185 changes (max delta 130.486) 54.83 seconds: ML NNI round 1 of 23, 1101 of 2497 splits, 196 changes (max delta 130.486) 55.35 seconds: ML NNI round 1 of 23, 1201 of 2497 splits, 220 changes (max delta 130.486) 55.88 seconds: ML NNI round 1 of 23, 1301 of 2497 splits, 236 changes (max delta 130.486) 56.42 seconds: ML NNI round 1 of 23, 1401 of 2497 splits, 257 changes (max delta 130.486) 56.93 seconds: ML NNI round 1 of 23, 1501 of 2497 splits, 279 changes (max delta 130.486) 57.44 seconds: ML NNI round 1 of 23, 1601 of 2497 splits, 299 changes (max delta 130.486) 57.92 seconds: ML NNI round 1 of 23, 1701 of 2497 splits, 324 changes (max delta 130.486) 58.43 seconds: ML NNI round 1 of 23, 1801 of 2497 splits, 349 changes (max delta 130.486) 58.92 seconds: ML NNI round 1 of 23, 1901 of 2497 splits, 374 changes (max delta 130.486) 59.43 seconds: ML NNI round 1 of 23, 2001 of 2497 splits, 389 changes (max delta 130.486) 59.91 seconds: ML NNI round 1 of 23, 2101 of 2497 splits, 411 changes (max delta 130.486) 60.43 seconds: ML NNI round 1 of 23, 2201 of 2497 splits, 434 changes (max delta 130.486) 60.91 seconds: ML NNI round 1 of 23, 2301 of 2497 splits, 457 changes (max delta 130.486) 61.42 seconds: ML NNI round 1 of 23, 2401 of 2497 splits, 473 changes (max delta 130.486) ML-NNI round 1: LogLk = -753079.576 NNIs 497 max delta 130.49 Time 62.01 
62.21 seconds: Site likelihoods with rate category 1 of 20 62.41 seconds: Site likelihoods with rate category 2 of 20 62.61 seconds: Site likelihoods with rate category 3 of 20 62.81 seconds: Site likelihoods with rate category 4 of 20 63.01 seconds: Site likelihoods with rate category 5 of 20 63.20 seconds: Site likelihoods with rate category 6 of 20 63.40 seconds: Site likelihoods with rate category 7 of 20 63.60 seconds: Site likelihoods with rate category 8 of 20 63.80 seconds: Site likelihoods with rate category 9 of 20 64.00 seconds: Site likelihoods with rate category 10 of 20 64.20 seconds: Site likelihoods with rate category 11 of 20 64.40 seconds: Site likelihoods with rate category 12 of 20 64.60 seconds: Site likelihoods with rate category 13 of 20 64.80 seconds: Site likelihoods with rate category 14 of 20 65.00 seconds: Site likelihoods with rate category 15 of 20 65.20 seconds: Site likelihoods with rate category 16 of 20 65.40 seconds: Site likelihoods with rate category 17 of 20 65.60 seconds: Site likelihoods with rate category 18 of 20 65.80 seconds: Site likelihoods with rate category 19 of 20 66.00 seconds: Site likelihoods with rate category 20 of 20 Switched to using 20 rate categories (CAT approximation) Rate categories were divided by 1.021 so that average rate = 1.0 CAT-based log-likelihoods may not be comparable across runs Use -gamma for approximate but comparable Gamma(20) log-likelihoods 66.22 seconds: ML NNI round 2 of 23, 1 of 2497 splits 66.52 seconds: ML NNI round 2 of 23, 101 of 2497 splits, 12 changes (max delta 17.832) 66.87 seconds: ML NNI round 2 of 23, 201 of 2497 splits, 24 changes (max delta 17.832) 67.19 seconds: ML NNI round 2 of 23, 301 of 2497 splits, 34 changes (max delta 26.678) 67.54 seconds: ML NNI round 2 of 23, 401 of 2497 splits, 45 changes (max delta 26.678) 67.93 seconds: ML NNI round 2 of 23, 501 of 2497 splits, 59 changes 
(max delta 26.678) 68.19 seconds: ML NNI round 2 of 23, 601 of 2497 splits, 69 changes (max delta 26.678) 68.52 seconds: ML NNI round 2 of 23, 701 of 2497 splits, 83 changes (max delta 30.027) 68.77 seconds: ML NNI round 2 of 23, 801 of 2497 splits, 88 changes (max delta 30.027) 69.04 seconds: ML NNI round 2 of 23, 901 of 2497 splits, 96 changes (max delta 48.316) 69.46 seconds: ML NNI round 2 of 23, 1001 of 2497 splits, 109 changes (max delta 48.316) 69.80 seconds: ML NNI round 2 of 23, 1101 of 2497 splits, 120 changes (max delta 48.316) 70.11 seconds: ML NNI round 2 of 23, 1201 of 2497 splits, 131 changes (max delta 48.316) 70.41 seconds: ML NNI round 2 of 23, 1301 of 2497 splits, 139 changes (max delta 48.316) 70.74 seconds: ML NNI round 2 of 23, 1401 of 2497 splits, 148 changes (max delta 48.316) 71.06 seconds: ML NNI round 2 of 23, 1501 of 2497 splits, 158 changes (max delta 48.316) 71.42 seconds: ML NNI round 2 of 23, 1601 of 2497 splits, 166 changes (max delta 48.316) 71.78 seconds: ML NNI round 2 of 23, 1701 of 2497 splits, 179 changes (max delta 48.316) 72.17 seconds: ML NNI round 2 of 23, 1801 of 2497 splits, 193 changes (max delta 48.316) 72.55 seconds: ML NNI round 2 of 23, 1901 of 2497 splits, 206 changes (max delta 57.978) 72.87 seconds: ML NNI round 2 of 23, 2001 of 2497 splits, 211 changes (max delta 57.978) 73.15 seconds: ML NNI round 2 of 23, 2101 of 2497 splits, 216 changes (max delta 57.978) 73.56 seconds: ML NNI round 2 of 23, 2201 of 2497 splits, 233 changes (max delta 57.978) 73.83 seconds: ML NNI round 2 of 23, 2301 of 2497 splits, 238 changes (max delta 57.978) 74.13 seconds: ML NNI round 2 of 23, 2401 of 2497 splits, 247 changes (max delta 57.978) ML-NNI round 2: LogLk = -687521.759 NNIs 251 max delta 57.98 Time 74.54 74.54 seconds: ML NNI round 3 of 23, 1 of 2497 splits 74.82 seconds: ML NNI round 3 of 23, 101 of 2497 splits, 11 changes (max delta 17.886) 75.10 seconds: ML NNI round 3 of 23, 201 of 2497 splits, 19 changes (max delta 
19.082) 75.34 seconds: ML NNI round 3 of 23, 301 of 2497 splits, 20 changes (max delta 19.082) 75.59 seconds: ML NNI round 3 of 23, 401 of 2497 splits, 28 changes (max delta 19.082) 75.89 seconds: ML NNI round 3 of 23, 501 of 2497 splits, 36 changes (max delta 19.082) 76.13 seconds: ML NNI round 3 of 23, 601 of 2497 splits, 41 changes (max delta 19.388) 76.39 seconds: ML NNI round 3 of 23, 701 of 2497 splits, 45 changes (max delta 20.055) 76.67 seconds: ML NNI round 3 of 23, 801 of 2497 splits, 48 changes (max delta 20.055) 76.96 seconds: ML NNI round 3 of 23, 901 of 2497 splits, 51 changes (max delta 41.174) 77.21 seconds: ML NNI round 3 of 23, 1001 of 2497 splits, 57 changes (max delta 41.174) 77.55 seconds: ML NNI round 3 of 23, 1101 of 2497 splits, 66 changes (max delta 41.174) 77.79 seconds: ML NNI round 3 of 23, 1201 of 2497 splits, 70 changes (max delta 41.174) 78.07 seconds: ML NNI round 3 of 23, 1301 of 2497 splits, 74 changes (max delta 41.174) 78.35 seconds: ML NNI round 3 of 23, 1401 of 2497 splits, 81 changes (max delta 41.174) 78.58 seconds: ML NNI round 3 of 23, 1501 of 2497 splits, 85 changes (max delta 41.174) 78.85 seconds: ML NNI round 3 of 23, 1601 of 2497 splits, 94 changes (max delta 41.174) 79.13 seconds: ML NNI round 3 of 23, 1701 of 2497 splits, 101 changes (max delta 41.174) 79.46 seconds: ML NNI round 3 of 23, 1801 of 2497 splits, 111 changes (max delta 41.174) ML-NNI round 3: LogLk = -687049.069 NNIs 112 max delta 41.17 Time 79.71 79.71 seconds: ML NNI round 4 of 23, 1 of 2497 splits 79.96 seconds: ML NNI round 4 of 23, 101 of 2497 splits, 2 changes (max delta 7.313) 80.21 seconds: ML NNI round 4 of 23, 201 of 2497 splits, 5 changes (max delta 7.313) 80.45 seconds: ML NNI round 4 of 23, 301 of 2497 splits, 7 changes (max delta 8.331) 80.71 seconds: ML NNI round 4 of 23, 401 of 2497 splits, 12 changes (max delta 8.331) 80.89 seconds: ML NNI round 4 of 23, 501 of 2497 splits, 13 changes (max delta 8.331) 81.12 seconds: ML NNI round 4 of 
23, 601 of 2497 splits, 14 changes (max delta 8.331) 81.36 seconds: ML NNI round 4 of 23, 701 of 2497 splits, 20 changes (max delta 9.333) 81.58 seconds: ML NNI round 4 of 23, 801 of 2497 splits, 21 changes (max delta 9.333) 81.85 seconds: ML NNI round 4 of 23, 901 of 2497 splits, 23 changes (max delta 9.333) 82.07 seconds: ML NNI round 4 of 23, 1001 of 2497 splits, 24 changes (max delta 9.333) 82.40 seconds: ML NNI round 4 of 23, 1101 of 2497 splits, 35 changes (max delta 9.333) ML-NNI round 4: LogLk = -686921.587 NNIs 40 max delta 10.56 Time 82.76 82.75 seconds: ML NNI round 5 of 23, 1 of 2497 splits 82.96 seconds: ML NNI round 5 of 23, 101 of 2497 splits, 1 changes (max delta 0.245) 83.14 seconds: ML NNI round 5 of 23, 201 of 2497 splits, 2 changes (max delta 1.507) 83.36 seconds: ML NNI round 5 of 23, 301 of 2497 splits, 5 changes (max delta 2.481) 83.61 seconds: ML NNI round 5 of 23, 401 of 2497 splits, 9 changes (max delta 2.481) 83.89 seconds: ML NNI round 5 of 23, 501 of 2497 splits, 15 changes (max delta 7.746) 84.24 seconds: ML NNI round 5 of 23, 601 of 2497 splits, 27 changes (max delta 7.746) ML-NNI round 5: LogLk = -686867.766 NNIs 29 max delta 7.75 Time 84.41 84.41 seconds: ML NNI round 6 of 23, 1 of 2497 splits 84.61 seconds: ML NNI round 6 of 23, 101 of 2497 splits, 2 changes (max delta 2.293) 84.83 seconds: ML NNI round 6 of 23, 201 of 2497 splits, 3 changes (max delta 2.293) 85.17 seconds: ML NNI round 6 of 23, 301 of 2497 splits, 13 changes (max delta 4.604) ML-NNI round 6: LogLk = -686838.422 NNIs 14 max delta 4.60 Time 85.38 85.38 seconds: ML NNI round 7 of 23, 1 of 2497 splits 85.57 seconds: ML NNI round 7 of 23, 101 of 2497 splits, 0 changes 85.90 seconds: ML NNI round 7 of 23, 201 of 2497 splits, 4 changes (max delta 6.404) ML-NNI round 7: LogLk = -686829.512 NNIs 5 max delta 6.40 Time 86.06 86.05 seconds: ML NNI round 8 of 23, 1 of 2497 splits 86.35 seconds: ML NNI round 8 of 23, 101 of 2497 splits, 2 changes (max delta 1.301) ML-NNI round 
8: LogLk = -686827.281 NNIs 2 max delta 1.30 Time 86.50 86.50 seconds: ML NNI round 9 of 23, 1 of 2497 splits ML-NNI round 9: LogLk = -686827.167 NNIs 0 max delta 0.00 Time 86.77 Turning off heuristics for final round of ML NNIs (converged) 86.77 seconds: ML NNI round 10 of 23, 1 of 2497 splits 87.25 seconds: ML NNI round 10 of 23, 101 of 2497 splits, 1 changes (max delta 5.480) 87.78 seconds: ML NNI round 10 of 23, 201 of 2497 splits, 1 changes (max delta 5.480) 88.29 seconds: ML NNI round 10 of 23, 301 of 2497 splits, 1 changes (max delta 5.480) 88.81 seconds: ML NNI round 10 of 23, 401 of 2497 splits, 3 changes (max delta 5.480) 89.34 seconds: ML NNI round 10 of 23, 501 of 2497 splits, 5 changes (max delta 5.480) 89.88 seconds: ML NNI round 10 of 23, 601 of 2497 splits, 8 changes (max delta 12.860) 90.40 seconds: ML NNI round 10 of 23, 701 of 2497 splits, 12 changes (max delta 12.860) 90.92 seconds: ML NNI round 10 of 23, 801 of 2497 splits, 18 changes (max delta 12.860) 91.45 seconds: ML NNI round 10 of 23, 901 of 2497 splits, 20 changes (max delta 12.860) 91.98 seconds: ML NNI round 10 of 23, 1001 of 2497 splits, 21 changes (max delta 12.860) 92.53 seconds: ML NNI round 10 of 23, 1101 of 2497 splits, 26 changes (max delta 21.285) 93.06 seconds: ML NNI round 10 of 23, 1201 of 2497 splits, 28 changes (max delta 21.285) 93.55 seconds: ML NNI round 10 of 23, 1301 of 2497 splits, 34 changes (max delta 21.285) 94.08 seconds: ML NNI round 10 of 23, 1401 of 2497 splits, 35 changes (max delta 21.285) 94.60 seconds: ML NNI round 10 of 23, 1501 of 2497 splits, 42 changes (max delta 26.975) 95.10 seconds: ML NNI round 10 of 23, 1601 of 2497 splits, 43 changes (max delta 26.975) 95.61 seconds: ML NNI round 10 of 23, 1701 of 2497 splits, 44 changes (max delta 26.975) 96.10 seconds: ML NNI round 10 of 23, 1801 of 2497 splits, 50 changes (max delta 26.975) 96.64 seconds: ML NNI round 10 of 23, 1901 of 2497 splits, 55 changes (max delta 26.975) 97.16 seconds: ML NNI round 10 
of 23, 2001 of 2497 splits, 58 changes (max delta 26.975) 97.64 seconds: ML NNI round 10 of 23, 2101 of 2497 splits, 59 changes (max delta 26.975) 98.17 seconds: ML NNI round 10 of 23, 2201 of 2497 splits, 60 changes (max delta 26.975) 98.67 seconds: ML NNI round 10 of 23, 2301 of 2497 splits, 64 changes (max delta 26.975) 99.19 seconds: ML NNI round 10 of 23, 2401 of 2497 splits, 67 changes (max delta 26.975) ML-NNI round 10: LogLk = -686475.227 NNIs 72 max delta 26.97 Time 99.80 (final) 99.80 seconds: ML Lengths 1 of 2497 splits 99.91 seconds: ML Lengths 101 of 2497 splits 100.03 seconds: ML Lengths 201 of 2497 splits 100.15 seconds: ML Lengths 301 of 2497 splits 100.27 seconds: ML Lengths 401 of 2497 splits 100.38 seconds: ML Lengths 501 of 2497 splits 100.50 seconds: ML Lengths 601 of 2497 splits 100.62 seconds: ML Lengths 701 of 2497 splits 100.75 seconds: ML Lengths 801 of 2497 splits 100.87 seconds: ML Lengths 901 of 2497 splits 100.98 seconds: ML Lengths 1001 of 2497 splits 101.11 seconds: ML Lengths 1101 of 2497 splits 101.23 seconds: ML Lengths 1201 of 2497 splits 101.34 seconds: ML Lengths 1301 of 2497 splits 101.46 seconds: ML Lengths 1401 of 2497 splits 101.58 seconds: ML Lengths 1501 of 2497 splits 101.70 seconds: ML Lengths 1601 of 2497 splits 101.82 seconds: ML Lengths 1701 of 2497 splits 101.93 seconds: ML Lengths 1801 of 2497 splits 102.05 seconds: ML Lengths 1901 of 2497 splits 102.17 seconds: ML Lengths 2001 of 2497 splits 102.29 seconds: ML Lengths 2101 of 2497 splits 102.40 seconds: ML Lengths 2201 of 2497 splits 102.52 seconds: ML Lengths 2301 of 2497 splits 102.63 seconds: ML Lengths 2401 of 2497 splits Optimize all lengths: LogLk = -686443.280 Time 102.85 103.63 seconds: ML split tests for 100 of 2496 internal splits 104.44 seconds: ML split tests for 200 of 2496 internal splits 105.23 seconds: ML split tests for 300 of 2496 internal splits 106.01 seconds: ML split tests for 400 of 2496 internal splits 106.81 seconds: ML split tests for 500 
of 2496 internal splits 107.60 seconds: ML split tests for 600 of 2496 internal splits 108.40 seconds: ML split tests for 700 of 2496 internal splits 109.19 seconds: ML split tests for 800 of 2496 internal splits 109.98 seconds: ML split tests for 900 of 2496 internal splits 110.79 seconds: ML split tests for 1000 of 2496 internal splits 111.61 seconds: ML split tests for 1100 of 2496 internal splits 112.42 seconds: ML split tests for 1200 of 2496 internal splits 113.17 seconds: ML split tests for 1300 of 2496 internal splits 113.98 seconds: ML split tests for 1400 of 2496 internal splits 114.77 seconds: ML split tests for 1500 of 2496 internal splits 115.53 seconds: ML split tests for 1600 of 2496 internal splits 116.33 seconds: ML split tests for 1700 of 2496 internal splits 117.08 seconds: ML split tests for 1800 of 2496 internal splits 117.89 seconds: ML split tests for 1900 of 2496 internal splits 118.68 seconds: ML split tests for 2000 of 2496 internal splits 119.45 seconds: ML split tests for 2100 of 2496 internal splits 120.23 seconds: ML split tests for 2200 of 2496 internal splits 121.03 seconds: ML split tests for 2300 of 2496 internal splits 121.82 seconds: ML split tests for 2400 of 2496 internal splits Total time: 122.59 seconds Unique: 2499/2499 Bad splits: 19/2496 Worst delta-LogLk 16.592
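To see where a run spends its time, the timestamped Roary log entries above can be converted into per-stage durations. The following Python sketch assumes each entry starts with a `YYYY/MM/DD HH:MM:SS` stamp, as in the log pasted in this post, and it also works on text where the entries have been flattened onto one line. `stage_durations` is a hypothetical helper name.

```python
import re
from datetime import datetime

# A timestamp as it appears in the Roary log, followed by a space.
STAMP = re.compile(r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def stage_durations(log_text):
    """Return [(message, seconds_until_next_entry), ...] for a Roary log."""
    # re.split with one capture group yields [prefix, stamp, msg, stamp, msg, ...].
    parts = STAMP.split(log_text)
    events = [
        (datetime.strptime(parts[i], "%Y/%m/%d %H:%M:%S"), parts[i + 1].strip())
        for i in range(1, len(parts) - 1, 2)
    ]
    # The last entry has no successor, so it gets no duration.
    return [
        (msg, (nxt - start).total_seconds())
        for (start, msg), (nxt, _) in zip(events, events[1:])
    ]


if __name__ == "__main__":
    sample = (
        "2018/01/18 14:31:10 Fixing input GFF files "
        "2018/01/18 15:29:21 Extracting proteins from GFF files "
        "2018/01/18 18:56:04 Reinflate clusters"
    )
    for message, seconds in stage_durations(sample):
        print(f"{seconds:>8.0f} s  {message}")
```

The last logged stage never gets a duration, which is itself a hint: the step that follows the final log entry is where the run stopped.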
I installed Roary through Bioconda, and I have previously used it successfully to obtain core genomes for smaller numbers of genomes. My roary -a output is:
Please cite Roary if you use any of the results it produces: Andrew J. Page, Carla A. Cummins, Martin Hunt, Vanessa K. Wong, Sandra Reuter, Matthew T. G. Holden, Maria Fookes, Daniel Falush, Jacqueline A. Keane, Julian Parkhill, "Roary: Rapid large-scale prokaryote pan genome analysis", Bioinformatics, 2015 Nov 15;31(22):3691-3693 doi: http://doi.org/10.1093/bioinformatics/btv421 Pubmed: 26198102
2018/01/19 06:05:55 Looking for 'Rscript' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/Rscript
2018/01/19 06:05:55 Determined Rscript version is 3.3
2018/01/19 06:05:55 Looking for 'awk' - found /usr/bin/awk
2018/01/19 06:05:55 Looking for 'bedtools' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/bedtools
2018/01/19 06:05:55 Determined bedtools version is 2.26
2018/01/19 06:05:55 Looking for 'blastp' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/blastp
2018/01/19 06:05:56 Determined blastp version is 2.6.0
2018/01/19 06:05:56 Looking for 'grep' - found /bin/grep
2018/01/19 06:05:56 Optional tool 'kraken' not found in your $PATH
2018/01/19 06:05:56 Optional tool 'kraken-report' not found in your $PATH
2018/01/19 06:05:56 Looking for 'mafft' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/mafft
2018/01/19 06:05:57 Determined mafft version is 7.310
2018/01/19 06:05:57 Looking for 'makeblastdb' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/makeblastdb
2018/01/19 06:05:57 Determined makeblastdb version is 2.6.0
2018/01/19 06:05:57 Looking for 'mcl' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/mcl
2018/01/19 06:05:58 Determined mcl version is 14-137
2018/01/19 06:05:58 Looking for 'parallel' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/parallel
2018/01/19 06:06:01 Determined parallel version is 20170422
2018/01/19 06:06:01 Looking for 'prank' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/prank
2018/01/19 06:06:01 Looking for 'sed' - found /bin/sed
2018/01/19 06:06:01 Looking for 'cd-hit' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/cd-hit
2018/01/19 06:06:01 Determined cd-hit version is 4.7
2018/01/19 06:06:01 Looking for 'FastTree' - found /home/alvarez0/slhong/miniconda3/envs/roary/bin/FastTree
2018/01/19 06:06:01 Determined FastTree version is 2.1
2018/01/19 06:06:01 Roary version 3.9.1
2018/01/19 06:06:01 Error: You need to provide at least 2 files to build a pan genome
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/sanger-pathogens/Roary/issues/383, or mute the thread https://github.com/notifications/unsubscribe-auth/AABeV1oinfr2FjHId3Ebg5Oz1dj1VjJnks5tMIX3gaJpZM4RkYvG .
I will do a more in-depth QC assessment, rerun it, and let you know. Thanks!
Hello Andrew, I have QC'd my assemblies and I am still unable to run roary to completion. However, if I partition the dataset into two random subsets, I am able to run roary to completion in each of the reduced datasets (~1100 genomes). I have repeated this process with two other random partitions, and in every case I am able to run roary without any problems. Do you have any idea of where the issue might be?
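For anyone wanting to repeat the random-partition test described above, here is a minimal sketch in Python. It is not part of roary; the function name `split_in_two`, the seed values, and the `gff/*.gff` pattern (taken from the command earlier in the thread) are assumptions for illustration only.

```python
import glob
import random


def split_in_two(paths, seed=0):
    """Shuffle the paths reproducibly and split them into two halves."""
    shuffled = sorted(paths)               # fixed starting order, so the seed alone decides the split
    random.Random(seed).shuffle(shuffled)  # private RNG, leaves the global random state untouched
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


if __name__ == "__main__":
    # Hypothetical layout: all Prokka GFF files live in ./gff
    gffs = glob.glob("gff/*.gff")
    first, second = split_in_two(gffs, seed=1)
    print(len(first), len(second))
```

Changing `seed` gives a different partition, which makes it easy to check whether the failure follows particular genomes or only the total dataset size.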
Could you send me the gene presence and absence CSV file?
Hi Andrew,
I have tried running it again just to be sure, but I do not get any presence and absence file with this dataset. Here is the list of output files that I end up getting.
drwx------. 2 slhong 324K Jan 29 04:28 96k5RfmeX0/
-rw-------. 1 slhong 4.4M Jan 29 10:57 accessory_binary_genes.fa
-rw-------. 1 slhong 99K Jan 29 11:02 accessory_binary_genes.fa.newick
-rw-------. 1 slhong 295K Jan 29 11:07 _accessory_clusters
-rw-------. 1 slhong 86K Jan 29 11:07 _accessory_clusters.clstr
-rw-------. 1 slhong 56 Jan 29 04:03 blast_identity_frequency.Rtab
-rw-------. 1 slhong 12M Jan 29 03:53 _clustered
-rw-------. 1 slhong 346M Jan 29 03:53 _clustered.clstr
-rw-------. 1 slhong 181M Jan 29 06:43 clustered_proteins
-rw-------. 1 slhong 2.7G Jan 29 03:32 _combined_files
-rw-------. 1 slhong 45M Jan 29 03:32 _combined_files.groups
-rw-------. 1 slhong 181M Jan 29 04:39 _inflated_mcl_groups
-rw-------. 1 slhong 181M Jan 29 04:04 _inflated_unsplit_mcl_groups
-rw-------. 1 slhong 181M Jan 29 04:39 _labeled_mcl_groups
drwx------. 2 slhong 4.0K Jan 29 04:38 paMDQIjRCp/
-rw-------. 1 slhong 804K Jan 29 04:04 _uninflated_mcl_groups
Hello Andrew,
I was able to fix the problem by running roary with more RAM allocated. The issue was not with roary itself: the PBS scheduler was killing my job for exceeding its memory allocation without returning an out-of-memory error.
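Since the scheduler reported success even though the run was cut short, it may help to check the output directory explicitly at the end of the job. The sketch below is one way to do that in Python, not something roary provides. The helper `missing_outputs` is hypothetical, and the `EXPECTED` list is an assumption about which final files a run with `-e` should leave behind; adjust it to the options you use.

```python
from pathlib import Path

# Assumed final outputs of a complete run with -e (edit to match your options)
EXPECTED = (
    "gene_presence_absence.csv",
    "summary_statistics.txt",
    "core_gene_alignment.aln",
)


def missing_outputs(outdir, expected=EXPECTED):
    """Return the expected files that are absent or empty in outdir."""
    out = Path(outdir)
    return [
        name
        for name in expected
        if not (out / name).is_file() or (out / name).stat().st_size == 0
    ]


# Example use at the end of a job script (the directory name is from this thread):
#   missing = missing_outputs("final2")
#   if missing:
#       raise SystemExit("roary run looks incomplete, missing: " + ", ".join(missing))
```

A non-empty result would have flagged the run above, where `gene_presence_absence.csv` never appeared in the listing.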