spectrum_q_value is probably cheating: it likely leaks some latent information about target/decoy class (#1).
Instead, we compute other features on the candidate peptide-spectrum level that are agnostic to target/decoy class:
matched_intensity_pct: percentage of total MS2 signal that the matched peaks account for
delta_matched: difference between the # of matched peaks of the reported PSM and the average # of matched peaks per candidate
scored_candidates: # of candidates actually scored
spectrum_z_score: z-score of candidate hyperscore vs median hyperscore of all candidates
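A rough sketch of how these four features could be derived from the per-spectrum candidate list. The `Candidate` and `SpectrumFeatures` types and the `spectrum_features` function are hypothetical names for illustration, not the crate's actual API. The description above does not say which spread the z-score is divided by, so the population standard deviation of the candidate hyperscores is an assumption here.

```rust
/// Hypothetical per-candidate summary; field names are illustrative only.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    hyperscore: f64,
    matched_peaks: u32,
}

/// Hypothetical container for the target/decoy-agnostic features.
#[derive(Debug, PartialEq)]
struct SpectrumFeatures {
    matched_intensity_pct: f64,
    delta_matched: f64,
    scored_candidates: usize,
    spectrum_z_score: f64,
}

/// Median of a slice; sorts in place, so callers pass a scratch copy.
fn median(values: &mut [f64]) -> f64 {
    values.sort_unstable_by(|a, b| a.total_cmp(b));
    let n = values.len();
    if n % 2 == 1 {
        values[n / 2]
    } else {
        (values[n / 2 - 1] + values[n / 2]) / 2.0
    }
}

/// Computes the features for the reported PSM (`best`) against all scored
/// candidates of the same spectrum. Returns None when nothing was scored
/// or the spectrum carries no signal.
fn spectrum_features(
    best: Candidate,
    candidates: &[Candidate],
    matched_intensity: f64,
    total_intensity: f64,
) -> Option<SpectrumFeatures> {
    if candidates.is_empty() || total_intensity <= 0.0 {
        return None;
    }
    let n = candidates.len() as f64;

    // Average number of matched peaks per candidate.
    let mean_matched = candidates.iter().map(|c| c.matched_peaks as f64).sum::<f64>() / n;

    // Spread of hyperscores (assumption: population standard deviation).
    let mut scores: Vec<f64> = candidates.iter().map(|c| c.hyperscore).collect();
    let mean = scores.iter().sum::<f64>() / n;
    let variance = scores.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    let std_dev = variance.sqrt();
    let med = median(&mut scores);

    // Guard against a zero spread (e.g. a single candidate).
    let z = if std_dev > 0.0 {
        (best.hyperscore - med) / std_dev
    } else {
        0.0
    };

    Some(SpectrumFeatures {
        matched_intensity_pct: 100.0 * matched_intensity / total_intensity,
        delta_matched: best.matched_peaks as f64 - mean_matched,
        scored_candidates: candidates.len(),
        spectrum_z_score: z,
    })
}
```

None of these inputs depend on whether a candidate is a target or a decoy, which is the point of replacing spectrum_q_value.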
Additional perf enhancements:
Remove HashMap from Scorer::score and pre-allocate a score vector based on the theoretical # of candidates that can be scored
Use sort_unstable rather than sort
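For illustration, a minimal sketch of the two perf changes combined. It assumes the candidates for a spectrum form a contiguous index window `[lo, hi)` into a sorted peptide list, so a pre-allocated Vec indexed by `idx - lo` can stand in for a HashMap keyed by peptide. The `Score` struct, the `score_window` function, and the use of summed intensity as the ranking key are hypothetical stand-ins, not the real Scorer::score.

```rust
/// Hypothetical accumulator for one candidate peptide.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
struct Score {
    peptide_idx: usize,
    matched: u32,
    summed_intensity: f64,
}

/// Accumulates fragment hits into a vector sized up front to the
/// theoretical number of candidates (hi - lo), then ranks the candidates
/// that matched at least one peak.
fn score_window(lo: usize, hi: usize, fragment_hits: &[(usize, f64)]) -> Vec<Score> {
    // One slot per possible candidate: no hashing, no rehash-on-growth.
    let mut scores = vec![Score::default(); hi.saturating_sub(lo)];

    for &(idx, intensity) in fragment_hits {
        // Ignore hits that fall outside the candidate window.
        if idx < lo || idx >= hi {
            continue;
        }
        let slot = &mut scores[idx - lo];
        slot.peptide_idx = idx;
        slot.matched += 1;
        slot.summed_intensity += intensity;
    }

    // Keep only candidates that were actually scored.
    scores.retain(|s| s.matched > 0);

    // sort_unstable avoids the temporary buffer a stable sort allocates;
    // the relative order of tied candidates is not preserved.
    scores.sort_unstable_by(|a, b| b.summed_intensity.total_cmp(&a.summed_intensity));
    scores
}
```

The trade-off is memory proportional to the window size rather than to the number of candidates that actually match, which is usually acceptable when the window is bounded by the precursor tolerance.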
Perf on 4-core i5-4690k, 16 GB desktop