sanger-pathogens / Roary

Rapid large-scale prokaryote pan genome analysis
http://sanger-pathogens.github.io/Roary

Length threshold for gene blastp #316

Closed cjdoyler closed 6 years ago

cjdoyler commented 7 years ago

Hi, this is not an issue, more of a query. I was wondering whether it would be possible to set a length threshold for the blastp step, similar to the existing identity threshold.

I am using Roary to look at the core genes of a genus. I know this is not the intended purpose of Roary, but it would be useful to be able to establish how many genes are core to my genus of interest.

Any help or suggestions that you give would be greatly appreciated. Thanks.

Conor

zinque commented 7 years ago

I agree with this suggestion.

I would also like to be able to separate genes based on size. In the organisms I work on, many pseudogenes are present (as fragments of the original gene), and I would like to distinguish the original genes from pseudogene fragments by applying a length or coverage cutoff value.

Is this a feature you are considering?

-Martin
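
For readers wondering what a length or coverage cutoff might look like in practice, here is a minimal sketch of a post-processing filter over BLAST tabular output. It is not part of Roary and does not reflect how Roary calls blastp internally. It assumes the hits were produced by running BLAST+ separately with `-outfmt "6 std qlen slen"`, so that query and subject lengths are available as the last two columns. The function name `filter_hits` and the cutoff values are hypothetical and chosen only for illustration.

```python
import csv
import io

# Assumed column layout: BLAST+ tabular output requested with
# -outfmt "6 std qlen slen" (12 standard columns plus query and subject length).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore",
           "qlen", "slen"]


def filter_hits(handle, min_identity=95.0, min_coverage=0.8):
    """Yield (query, subject) pairs passing identity and coverage cutoffs.

    Hypothetical helper: coverage is the aligned span divided by the full
    sequence length, and the smaller of the query and subject coverage must
    reach min_coverage, so a short fragment hitting a long gene is rejected.
    """
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        q_span = abs(int(row["qend"]) - int(row["qstart"])) + 1
        s_span = abs(int(row["send"]) - int(row["sstart"])) + 1
        q_cov = q_span / int(row["qlen"])
        s_cov = s_span / int(row["slen"])
        if float(row["pident"]) >= min_identity and min(q_cov, s_cov) >= min_coverage:
            yield row["qseqid"], row["sseqid"]


# Illustrative input: one near full-length hit and one fragment-like hit.
sample = (
    "geneA\tgeneB\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-100\t500\t300\t310\n"
    "geneA\tfragC\t99.0\t90\t1\t0\t1\t90\t1\t90\t1e-30\t180\t300\t95\n"
)
kept = list(filter_hits(io.StringIO(sample)))
print(kept)
```

In this made-up input the second hit has high identity but covers only 90 of the 300 query residues, so it is dropped by the coverage cutoff, which is the kind of pseudogene-fragment case described above. Taking the minimum of the two coverages is one design choice among several; requiring only query coverage or only subject coverage would behave differently.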

andrewjpage commented 7 years ago

Thank you for the suggestion. I believe another group is developing an enhancement with this functionality, so to avoid duplication of effort, we will not add this. You can of course edit the code and submit a pull request, which would be greatly appreciated.

gabuali commented 6 years ago

Hi, I use Roary quite a bit for species-level pangenome analysis and am also interested in this feature. Has there been an update, i.e., has the other group developed the enhancement, and is it publicly available at this point?

Many thanks,

Galeb