Closed by samlipworth 4 years ago
Hi @samlipworth
PIRATE runs some other scripts after those have completed. Were any error messages printed to STDOUT? How did you install it, and is it up to date with the most recent commits to master (--version)? Could you run PIRATE again and give it an output directory that is different from your input directory? If this doesn't resolve your issue, I am happy to run it on your GFFs and see if I can identify the problem.
S
Thanks yes will do
@SionBayliss thanks again for responding - this was my fault. The cluster kicked my job off before it had finished (I didn't request enough time); I just hadn't appreciated that from the output. All running fine now, and I have all the output.
@samlipworth Great!
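Since the job here was cut off by the cluster's time limit, one way to size the next request is to add up the per-step timings that appear in the log. The following is a minimal sketch, assuming the log keeps the "completed in N secs" wording seen in this thread; `sum_step_seconds` is a hypothetical helper name and not part of PIRATE.

```python
import re

# Matches per-step timings such as "completed in 16985 secs".
# It deliberately does not match the overall "completed in: 64346s" line.
STEP_TIME = re.compile(r"completed in (\d+) secs")


def sum_step_seconds(log_text: str) -> int:
    """Add up every 'completed in N secs' timing found in log_text."""
    return sum(int(m.group(1)) for m in STEP_TIME.finditer(log_text))


# Three timing lines copied from the log in this thread.
sample = """\
completed in 16985 secs
completed in 33254 secs
50379 clusters at 50 % - completed in 712 secs
"""

total = sum_step_seconds(sample)
print(f"{total} s is about {total / 3600:.1f} h")
```

The total only covers steps that actually finished, so it is a lower bound on the time to request, not an estimate of the full run.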
Hi Sion,
I have tried running PIRATE and it seems to have worked (in the log file there are no obvious errors) but then most of the output files you describe do not appear. Any idea why?
Cheers.
` - WARNING: R not found in system path, cannot use -r command.
PIRATE input options:
Standardising and checking input files:
Extracting pangenome sequences:
Constructing pangenome sequences:
Options:
Opening pan_sequences
/gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pan_sequences.fasta contains 22659607 sequences.
Passing 22659608 loci to cd-hit at 100%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.100 -aS 0.9 -c 1 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
Passing 22659608 loci to cd-hit at 99.5%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.99.5 -aS 0.9 -c 0.995 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
Passing 22659608 loci to cd-hit at 99%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.99 -aS 0.9 -c 0.99 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
Passing 22659608 loci to cd-hit at 98.5%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.98.5 -aS 0.9 -c 0.985 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
Passing 22659608 loci to cd-hit at 98%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.98 -aS 0.9 -c 0.98 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
completed in 16985 secs
0 core loci (0%)
22659608 non-core loci (100%)
433444 representative loci passed to blast.
running all-vs-all BLASTP on pan_sequences
completed in 33254 secs
running mcl on pan_sequences at 50
50379 clusters at 50 % - completed in 712 secs
running mcl on pan_sequences at 60
61760 clusters at 60 % - completed in 2838 secs
running mcl on pan_sequences at 70
74797 clusters at 70 % - completed in 2545 secs
running mcl on pan_sequences at 80
93099 clusters at 80 % - completed in 2300 secs
running mcl on pan_sequences at 90
133221 clusters at 90 % - completed in 2005 secs
running mcl on pan_sequences at 95
194809 clusters at 95 % - completed in 1744 secs
running mcl on pan_sequences at 98
368701 clusters at 98 % - completed in 1659 secs
reinflating clusters for pan_sequences
Finished
completed in: 64346s
Parsing pangenome files:
Processing 50% - 10689 paralogous gene clusters.
Processing 60% - 11560 paralogous gene clusters.
Processing 70% - 12329 paralogous gene clusters.
Processing 80% - 13154 paralogous gene clusters.
Processing 90% - 14112 paralogous gene clusters.
Processing 95% - 14136 paralogous gene clusters.
Processing 98% - 12281 paralogous gene clusters.
10689 paralog containing gene clusters detected. 4713 genomes processed.
Classifing paralogous clusters:
19508966 loci contained in 10689 clusters containing paralogs
(base) [lipworth@rescomp2 PIRATE]$ cat pangenome_log.txt
Options:
Opening pan_sequences
/gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pan_sequences.fasta contains 22659607 sequences.
Passing 22659608 loci to cd-hit at 100%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.100 -aS 0.9 -c 1 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
Passing 22659608 loci to cd-hit at 99.5%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.99.5 -aS 0.9 -c 0.995 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
Passing 22659608 loci to cd-hit at 99%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.99 -aS 0.9 -c 0.99 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
Passing 22659608 loci to cd-hit at 98.5%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.98.5 -aS 0.9 -c 0.985 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
Passing 22659608 loci to cd-hit at 98%
command: "cd-hit -i /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.temp.fasta -o /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.98 -aS 0.9 -c 0.98 -T 20 -g 1 -n 5 -M 40731 -d 256 >> /gpfs2/well/bag/users/lipworth/gram_neg/PIRATE/pangenome_iterations/pan_sequences.cdhit_log.txt"
completed in 16985 secs
0 core loci (0%)
22659608 non-core loci (100%)
433444 representative loci passed to blast.
running all-vs-all BLASTP on pan_sequences
completed in 33254 secs
running mcl on pan_sequences at 50
50379 clusters at 50 % - completed in 712 secs
running mcl on pan_sequences at 60
61760 clusters at 60 % - completed in 2838 secs
running mcl on pan_sequences at 70
74797 clusters at 70 % - completed in 2545 secs
running mcl on pan_sequences at 80
93099 clusters at 80 % - completed in 2300 secs
running mcl on pan_sequences at 90
133221 clusters at 90 % - completed in 2005 secs
running mcl on pan_sequences at 95
194809 clusters at 95 % - completed in 1744 secs
running mcl on pan_sequences at 98
368701 clusters at 98 % - completed in 1659 secs
reinflating clusters for pan_sequences
Finished `
The only files this gives me are:
PIRATE.log
./co-ords/
genome_list.txt
loci_list.tab
pan_sequences.fasta
pangenome_log.txt
paralog_working
cluster_alleles.tab
genome2loci.tab
./genome_multifastas/
./modified_gffs/
./pangenome_iterations/
paralog_clusters.tab
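A quick way to tell a truncated run from a finished one is to compare the output directory against the files the documentation says should be there. This is a minimal sketch under stated assumptions: `missing_outputs` is a hypothetical helper, the expected names have to be taken from the PIRATE README yourself, and `final_output.tsv` below is a placeholder rather than a real PIRATE file name.

```python
import tempfile
from pathlib import Path


def missing_outputs(out_dir, expected):
    """Return the expected file or directory names that are absent from out_dir."""
    out = Path(out_dir)
    return [name for name in expected if not (out / name).exists()]


# Demo on a throwaway directory holding two of the files listed above.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("PIRATE.log", "pangenome_log.txt"):
        (Path(tmp) / name).touch()
    # "final_output.tsv" is a placeholder, not a real PIRATE file name.
    demo_missing = missing_outputs(
        tmp, ["PIRATE.log", "pangenome_log.txt", "final_output.tsv"]
    )

print(demo_missing)
```

An empty list would suggest the run got as far as writing everything you expected; a non-empty one points to an interrupted job like the one in this thread.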