Closed dudimarcus closed 8 years ago
456ad68 should address this
Great! Thanks Ian.
It seems some threshold still limits the number of FFs retrieved: even with a high number of hits, only a few FFs come back, and sometimes only one or two domains have many FFs. Could it be the significance score?
Could you add a test case?
e.g. a file containing the query sequence, the command-line usage, what you expected, and what you got
I guess the question is whether something other than the hit limit, such as the e-value, limits the number of FFs. For example, for UniProt ID Q8E9X9 the scan finds 10 FFs with 1 domain:
HIT 3.40.50.1580/FF/5979 4.6e-143 Uridine phosphorylase
HIT 3.40.50.1580/FF/2436 1.4e-44 Uridine phosphorylase
HIT 3.40.50.1580/FF/4594 1.9e-34 Phosphorylase superfamily protein
HIT 3.40.50.1580/FF/6092 2.1e-30 Purine nucleoside phosphorylase DeoD-type
HIT 3.40.50.1580/FF/4309 4.3e-30 Purine or other phosphorylase family 1
HIT 3.40.50.1580/FF/3469 1.1e-23 Uridine phosphorylase
HIT 3.40.50.1580/FF/6091 1.3e-15 Uridine phosphorylase 1, isoform CRA_a
HIT 3.40.50.1580/FF/2937 4.7e-10 Uridine phosphorylase
HIT 3.40.50.1580/FF/6120 3.1e-07 MTA/SAH nucleosidase
HIT 3.40.50.1580/FF/5988 1.9e-04 Putative AMP nucleosidase
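For anyone who wants to check how many hits survive a given e-value cutoff, here is a minimal Python sketch. It is not part of cath-tools-seqscan: the names `Hit`, `parse_hits` and `filter_by_evalue` are hypothetical, the field layout (`HIT`, FunFam ID, e-value, description) is assumed from the output above, and the `1e-10` cutoff is an arbitrary illustration rather than a recommended value.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    funfam_id: str    # e.g. "3.40.50.1580/FF/5979"
    evalue: float     # e-value reported for the match
    description: str  # free-text FunFam name


def parse_hits(text: str) -> list[Hit]:
    """Parse 'HIT <funfam_id> <evalue> <description>' records.

    Splitting on the HIT keyword (an assumption about the format) lets this
    handle one record per line as well as records run together on one line.
    """
    hits = []
    for chunk in re.split(r"\bHIT\s+", text):
        fields = chunk.strip().split(None, 2)
        if len(fields) < 2:
            continue  # skip empty chunks and anything without an e-value
        description = fields[2] if len(fields) == 3 else ""
        hits.append(Hit(fields[0], float(fields[1]), description))
    return hits


def filter_by_evalue(hits: list[Hit], max_evalue: float) -> list[Hit]:
    """Keep only hits whose e-value is at or below the cutoff."""
    return [h for h in hits if h.evalue <= max_evalue]


sample = """\
HIT 3.40.50.1580/FF/5979 4.6e-143 Uridine phosphorylase
HIT 3.40.50.1580/FF/2436 1.4e-44 Uridine phosphorylase
HIT 3.40.50.1580/FF/5988 1.9e-04 Putative AMP nucleosidase
"""

hits = parse_hits(sample)
significant = filter_by_evalue(hits, 1e-10)  # illustrative cutoff only
print(len(hits), "hits parsed;", len(significant), "pass the cutoff")
```

Comparing the parsed count against the requested hit limit is a quick way to tell whether the limit or the matches themselves determine how many FFs come back.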
No, not aware of any other limit - that's just how many FunFams match. You wouldn't really want to go with much higher e-values than that anyway.
Bear in mind the FunFam HMMs are deliberately designed to be specific. When we are interested in increasing coverage, then we use HMMs built from jackhmmer to catch more general matches. The inferences between the two types of matches are different though.
I might not fully understand your question.
What makes you think there might be a limit? Were you expecting more than those 10 FunFam hits?
No, so far everything works as expected. I just wanted to make sure there is no other limitation on the number of hits.
Okay, cool. I'll close the ticket.
By the way, did Roman manage to sort out the Perl issues he was having? If not, it would be great if you could help him to get stuff working on whatever setup he was using (or let me know more details).
Currently the top 50 FunFams are retrieved. Is there any way to control or increase this number?