Closed luciagrami closed 3 years ago
Hi Lucia,
PIRATE uses GNU parallel to run the individual alignments side by side, each on a single thread. This is a large speed improvement over running MAFFT sequentially with a large number of threads per alignment. The number of threads used is the same as the number provided to the main PIRATE script, or the number set manually if the align_feature_sequences.pl and create_pangenome_alignment.pl scripts are run outside of the PIRATE pipeline. Sometimes a problematic alignment, perhaps one including large numbers of truncated or duplicated ORFs, can take a long time to complete. This can give the impression that only one thread is being used for MAFFT when in fact multiple threads were in use earlier (i.e. that one alignment is the only job still waiting to complete). If this is the case, you might want to run align_feature_sequences.pl yourself and use more stringent cutoffs for the --threshold, --max-threshold or --dosage values. Note that align_feature_sequences.pl can be run multiple times with no conflict, so you could align your core and accessory genes separately, depending on your needs. Also, the alignment step runs after the other sections of the pipeline, so PIRATE has already finalised your other outputs and you can use or analyse these while waiting for the alignments to complete.
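To illustrate why a single slow alignment can make it look as if only one CPU is in use, here is a minimal Python sketch of a fixed pool of single-threaded jobs. It is a toy model, not PIRATE's code: the function name `simulate`, the job durations and the worker count are all made up for illustration, and it only assumes that each queued job starts as soon as a slot becomes free.

```python
import heapq


def simulate(durations, workers):
    """Toy model of a fixed pool of single-threaded jobs.

    Jobs are started in the given order, each as soon as a slot is free.
    Returns (makespan, tail): the total wall time, and the length of the
    final stretch during which only one job is still running.
    All values are hypothetical time units, not real MAFFT timings.
    """
    running = []   # min-heap holding the finish times of occupied slots
    finishes = []  # finish time of every job
    for duration in durations:
        # Use a free slot at time 0, otherwise wait for the earliest finisher.
        start = heapq.heappop(running) if len(running) >= workers else 0.0
        end = start + duration
        heapq.heappush(running, end)
        finishes.append(end)
    finishes.sort()
    makespan = finishes[-1]
    second_last = finishes[-2] if len(finishes) > 1 else 0.0
    return makespan, makespan - second_last


# One slow alignment (60 units) plus 99 quick ones (1 unit each), 8 slots.
makespan, tail = simulate([60] + [1] * 99, workers=8)
print(f"total wall time: {makespan}, time with one job left: {tail}")
```

In this toy case the 99 quick jobs are finished after 15 time units, so for the remaining 45 of the 60 units only the slow job is running, which looks like a single busy CPU even though all 8 slots were used earlier.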
All the best, Sion
Hello,
I want to know if it is possible to set the number of threads for MAFFT. I am working with ~1800 genomes, and the alignment step is taking too long and is using only one CPU, so I would like to specify --threads for MAFFT. Is that possible?
Thanks!