sib-swiss / pftools3

A suite of tools to build and search generalized profiles
GNU General Public License v2.0

using ps_scan.pl to scan prosite patterns #16

Closed: gsn7 closed this issue 3 years ago

gsn7 commented 4 years ago

What is the best way to scan PROSITE patterns with pftools3? Looking at how ps_scan.pl deals with PROSITE patterns, and without knowing much Perl, it seems to me that it creates a file for each profile and a file for each input sequence, then scans the sequences against the profile, generating a third file for the results. That is a lot of I/O. We run PROSITE patterns against UniParc, so we would be happy to hear whether there is a different way of scanning against PROSITE patterns with less I/O overhead.

smoretti commented 3 years ago

We will investigate this. @euphemizm, any idea? I think a sub-command is run and a file is created for each input sequence, not for each profile.

To avoid a lot of I/O, use pfscanV3 directly.

smoretti commented 3 years ago

More details (thanks @beatrice79)

E.g. to query 10 sequences against the whole of PROSITE:

For cases with many sequences vs many motifs:

ps_scan.pl uses pfscan by default. To query against UniParc, it may be better to switch to pfsearch with ps_scan.pl -w pfsearch

But as I said before, the pftools3 programs are better optimized for such cases, and they are now available in ps_scan.pl. If I remember correctly (@euphemizm, correct me if I'm wrong): ps_scan.pl -w pfsearchV3 or ps_scan.pl --pfscan $PATH/pfscanV3
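
For anyone scripting this, the two invocations above could be assembled along these lines. This is only a rough sketch: the helper name `build_ps_scan_command` is made up for illustration, and passing the sequence file as the last positional argument is an assumption that should be checked against the ps_scan.pl usage message. Only the `-w` and `--pfscan` options come from this thread.

```python
import shlex


def build_ps_scan_command(sequence_file, pfsearch=None, pfscan=None):
    """Build an argument list for ps_scan.pl (hypothetical helper).

    Only the -w and --pfscan options mentioned in this thread are used.
    Assumption: the sequence file is given as the last positional argument.
    """
    cmd = ["ps_scan.pl"]
    if pfsearch is not None:
        # e.g. "pfsearchV3", as suggested in the comment above
        cmd += ["-w", pfsearch]
    if pfscan is not None:
        # e.g. a full path to the pfscanV3 binary
        cmd += ["--pfscan", pfscan]
    cmd.append(sequence_file)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    # "sequences.fasta" is a placeholder file name.
    print(shlex.join(build_ps_scan_command("sequences.fasta", pfsearch="pfsearchV3")))
```

Once the printed command looks right, the returned list could be handed to `subprocess.run`.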