This adds a new objective function for computing a guide set.
The objective that ADAPT had solved is to minimize the number of guides subject to constraints on coverage across the target genomes. This objective remains via the --obj minimize-guides option. It goes along with -gm (mismatches between guide and target), -gp (threshold on coverage), and --cover-by-year-decay. It is implemented now in the guide_search.GuideSearcherMinimizeGuides class.
The new objective function, accessible via --obj maximize-activity, maximizes the expected activity across the target genomes. We assume a uniform distribution, so this is the mean activity. This option requires that a predictive model be set; thus it does not use -gm, and it also does not use -gp because there is no set threshold on coverage. The objective adds penalties for guide set cardinality. There is a soft penalty, controlled by --soft-guide-constraint and --penalty-strength, and a hard penalty set by --hard-guide-constraint. The objective function is a non-negative non-monotone submodular function. This PR applies two algorithms to solve it: (1) the canonical greedy algorithm, which may work well in practice but does not provide theoretical guarantees because the function is non-monotone (unless --penalty-strength is 0); (2) a recent randomized greedy algorithm with worst-case guarantees for non-monotone submodular functions. It is implemented in the guide_search.GuideSearcherMaximizeActivity class.
This PR makes various changes to target_search.TargetSearcher that allow it to maximize an objective during a search for target options, with penalties on the number of primers and target length. As before, it can also optionally minimize an objective during a search.
This adds a new objective function for computing a guide set.
The objective that ADAPT had solved is to minimize the number of guides subject to constraints on coverage across the target genomes. This objective remains via the
--obj minimize-guides
option. It goes along with-gm
(mismatches between guide and target),-gp
(threshold on coverage), and--cover-by-year-decay
. It is implemented now in theguide_search.GuideSearcherMinimizeGuides
class.The new objective function, accessible via
--obj maximize-activity
, maximizes the expected activity across the target genomes. We assume a uniform distribution, so this is the mean activity. This option requires that a predictive model be set; thus it does not use-gm
, and it also does not use-gp
because there is no set threshold on coverage. The objective adds penalties for guide set cardinality. There is a soft penalty, controlled by--soft-guide-constraint
and--penalty-strength
, and a hard penalty set by--hard-guide-constraint
. The objective function is a non-negative non-monotone submodular function. This PR applies two algorithms to solve it: (1) the canonical greedy algorithm, which may work well in practice but does not provide theoretical guarantees because the function is non-monotone (unless--penalty-strength
is 0); (2) a recent randomized greedy algorithm with worst-case guarantees for non-monotone submodular functions. It is implemented in theguide_search.GuideSearcherMaximizeActivity
class.This PR makes various changes to
target_search.TargetSearcher
that allow it to maximize an objective during a search for target options, with penalties on the number of primers and target length. As before, it can also optionally minimize an objective during a search.