Closed jolespin closed 7 months ago
Hi @jolespin, included and reported correspond to the HMMER thresholds (-E specifies the reporting threshold by E-value, --incE specifies the inclusion threshold by E-value).
Usually in a hmmsearch run all hits that you get pass the reporting thresholds (Hit.reported == True) and you can ignore the inclusion thresholds.
Where this is actually useful is for jackhmmer, where inclusion thresholds control which hits get included to build the HMM for the next iteration.
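To make the jackhmmer point concrete, here is a rough sketch of how an inclusion threshold would pick the hits that feed the next iteration. This is a toy model, not HMMER or pyhmmer code: `next_round_members`, the `(name, evalue)` tuples, and the sequence names are all made up for illustration, and the 0.01 cutoff is assumed from HMMER's documented `--incE` default.

```python
# Toy model of the inclusion step in an iterative search.
# Hits are plain (name, evalue) tuples, not pyhmmer.plan7.Hit objects.

def next_round_members(hits, incE=0.01):
    """Return the names of hits whose E-value passes the inclusion threshold.

    In an iterative search, only these hits would be used to build the
    model for the next iteration; the others may still be reported.
    """
    return [name for name, evalue in hits if evalue <= incE]


# Hypothetical results from one iteration.
round_hits = [("seqA", 1e-40), ("seqB", 3e-5), ("seqC", 0.2), ("seqD", 7.0)]

selected = next_round_members(round_hits)
print(selected)
```

With these made-up values, seqC and seqD would still show up in the output of the round, but only seqA and seqB would contribute to the next model.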
By default when you run hmmsearch, there is always a threshold, but it's -E 10.0 for reporting and --incE 0.01 for inclusion, so you're virtually reporting all relevant hits + up to 10 false positives.
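As a minimal sketch of how the two flags relate, the snippet below classifies a few made-up hits by E-value only. It is a simplified model, not the pyhmmer implementation: `ToyHit` and `flags` are hypothetical names, the defaults are assumed from the HMMER documentation (-E 10.0, --incE 0.01), bit-score thresholds are ignored, and the sketch assumes a hit is only marked as included if it is also reported.

```python
from dataclasses import dataclass

# Assumed defaults, taken from the HMMER documentation.
DEFAULT_E = 10.0      # reporting threshold (-E)
DEFAULT_INC_E = 0.01  # inclusion threshold (--incE)


@dataclass
class ToyHit:
    """Hypothetical stand-in for a search hit (not pyhmmer.plan7.Hit)."""
    name: str
    evalue: float


def flags(hit, E=DEFAULT_E, incE=DEFAULT_INC_E):
    """Return (reported, included) for a hit under E-value thresholds."""
    reported = hit.evalue <= E
    # Assumption of this sketch: inclusion is only considered for reported hits.
    included = reported and hit.evalue <= incE
    return reported, included


hits = [ToyHit("strong", 1e-30), ToyHit("weak", 0.5), ToyHit("noise", 50.0)]
table = {h.name: flags(h) for h in hits}

for name, (reported, included) in table.items():
    print(f"{name}: reported={reported} included={included}")
```

In this model the "weak" hit is reported but not included, which is the case where the two flags differ.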
Thank you, this definitely answers my question. Rewriting all my pipelines to use this. It's so much faster and I have way more control than before.
I'm looking through the documentation and trying to understand what included and reported mean: https://pyhmmer.readthedocs.io/en/stable/api/plan7.html#pyhmmer.plan7.Hit.included
The docs say "Whether this hit is marked as included." and "Whether this hit is marked as reported." but I'm not sure what this means.

Does it mean that hit was determined based on an e-value (reported) and if it passed some threshold (e.g., gathering) then it would be marked as included? If so, how does this work when there are no thresholds specified?