Closed tseemann closed 8 years ago
I've found this code: https://github.com/sanger-pathogens/Roary/blob/056512409fcb0e817cf16ae554792816b80b9356/lib/Bio/Roary/AccessoryBinaryFasta.pm
And it seems that, besides the 4000-gene limit, there is a 5% upper and lower bound, which I assume trims clusters whose membership numbers are too low or too high?
Is there a way to script or parameterise this from the command-line tools?
Hi Torsten,
I wanted to cap the size of the file sent into FastTree, since it can be memory hungry. After running a few tests, I think I may have been a bit too cautious here. My original thinking was to focus on getting the general high-level groupings in a reasonable order (hence getting rid of the top and bottom 5%). I'll remove this restriction and see how things go.
Andrew
Thanks!
Nullarbor now produces pan-genome trees, and they seem to have more resolution.
Hi @andrewjpage,
Is it already possible to generate the accessory_binary_genes.fa file from the gene_presence_absence.csv file using any script from the command line?
Thanks
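For anyone wanting to try this outside Roary, here is a minimal Python sketch of the conversion being asked about. It is not a Roary script. It assumes that the first 14 columns of gene_presence_absence.csv are per-gene metadata and that every later column is one sample, and that presence and absence are written as `A` and `C`. The function name and its parameters are hypothetical, and the sketch applies no filtering at all, so core genes end up as constant sites.

```python
import csv
import io

# Assumption: the first 14 columns are per-gene metadata and every
# later column is one sample. Adjust if your file differs.
N_METADATA_COLUMNS = 14


def presence_absence_to_fasta(csv_text, n_metadata=N_METADATA_COLUMNS,
                              present="A", absent="C"):
    """Return FASTA text with one presence/absence 'sequence' per sample.

    The present/absent characters are an assumed encoding, chosen so that
    tree builders expecting nucleotides will accept the file.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, gene_rows = rows[0], rows[1:]
    samples = header[n_metadata:]

    def cell(row, column):
        # Tolerate short rows: a missing cell counts as absent.
        return row[column].strip() if column < len(row) else ""

    records = []
    for offset, sample in enumerate(samples):
        column = n_metadata + offset
        # A non-empty cell means the sample has a member in this cluster.
        sites = "".join(present if cell(row, column) else absent
                        for row in gene_rows)
        records.append(f">{sample}\n{sites}\n")
    return "".join(records)
```

To use it on a real file, read the CSV with `open("gene_presence_absence.csv", newline="").read()`, pass the text to the function, and write the result to a `.fa` file.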
The manual says:
Based on some data we have tried, it seems that singleton clusters do NOT end up in the file?
E.g. with 10 samples that are mostly clonal, but one of which carries a plasmid, the tree comes out mostly flat, and the .fa file has no sites for the plasmid genes.
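One way a percentage bound could produce this behaviour is sketched below in Python. This is a hypothetical filter, not Roary's code: the function name, the rounding up to whole samples, and the strictness of the comparisons are all assumptions made for illustration. Only the 5% figures come from this thread.

```python
import math


def keep_cluster(n_present, n_samples, lower_pct=5, upper_pct=5):
    """Decide whether a cluster found in n_present of n_samples is kept.

    Assumption: each percentage is rounded UP to a whole number of
    samples, and a cluster sitting exactly on the lower bound is dropped.
    """
    lower = math.ceil(n_samples * lower_pct / 100)
    upper = n_samples - math.ceil(n_samples * upper_pct / 100)
    return lower < n_present <= upper
```

Under these assumptions, 10 samples give a lower bound of 1, so `keep_cluster(1, 10)` is false and a plasmid carried by a single sample contributes no sites. Setting both percentages to 0 keeps it.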