gsn7 closed this 3 years ago
We will investigate this. @euphemizm, any ideas? I think a sub-command is run and a file is created for each input sequence, not for each profile.
To avoid a lot of I/O, use pfscanV3 directly.
More details (thanks @beatrice79)
e.g. to query 10 sequences against the whole of PROSITE:
For cases with many sequences vs many motifs:
ps_scan.pl uses pfscan by default. To query against UniParc, it may be better to switch to pfsearch with
ps_scan.pl -w pfsearch
But as I said before, the pftoolsv3 programs are better optimized for such cases, and they are now usable from ps_scan.pl (if I remember correctly).
@euphemizm correct me if I'm wrong:
ps_scan.pl -w pfsearchV3
or
ps_scan.pl --pfscan $PATH/pfscanV3
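To make the two alternatives above easier to compare, here is a minimal sketch of a shell helper that only prints the ps_scan.pl command line it would run. The function name build_ps_scan_cmd, the file name seqs.fasta and the install path are hypothetical; the sketch also assumes that ps_scan.pl takes the sequence file as its last argument. Only the -w and --pfscan options quoted in this thread are used. The binary is passed as an explicit path because the shell's own $PATH variable holds a colon-separated list of directories rather than a single directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of ps_scan.pl): print, without running it,
# the ps_scan.pl command line for one sequence file.
# Assumption: ps_scan.pl accepts the sequence file as its last argument.
build_ps_scan_cmd() {
    seqs=$1             # FASTA file with the query sequences
    pfscan_bin=${2:-}   # optional: full path to a pfscanV3 binary
    if [ -n "$pfscan_bin" ]; then
        # Second alternative from the thread: point --pfscan at pfscanV3.
        printf 'ps_scan.pl --pfscan %s %s\n' "$pfscan_bin" "$seqs"
    else
        # First alternative from the thread: select pfsearchV3 with -w.
        printf 'ps_scan.pl -w pfsearchV3 %s\n' "$seqs"
    fi
}

# Example: show both variants for a hypothetical input file.
build_ps_scan_cmd seqs.fasta
build_ps_scan_cmd seqs.fasta /opt/pftools/bin/pfscanV3
```

Printing the command first makes it easy to check which binary would be used before starting a long run against UniParc.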
What is the best way to scan PROSITE patterns with pftools3? Looking at the way ps_scan.pl deals with PROSITE patterns, and without knowing much Perl, it seems to me that it creates a file for each profile and a file for each input sequence, and then scans the sequences against the profile, generating a third file for the results. That is a lot of I/O. We run PROSITE patterns against UniParc, so we would be happy to hear whether there is a different way of scanning against PROSITE patterns with less I/O overhead.