The user can choose the following parameters for the algorithm:
thorny vs. non-thorny
peatmer vs. simple k-mer
k-mer length
strictly adjacent vs. extended adjacency (i.e., 1st, 2nd, or 3rd match of the peatmer)
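
To compare defaults systematically, the four parameters above can be enumerated as a grid of settings. The following is only a sketch: the class name `SyntenyParams`, the field names, the encoding of adjacency as an integer (1 = strictly adjacent, 2 or 3 = extended), and the k values are all assumptions made for illustration, not names or defaults taken from the actual code.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SyntenyParams:
    """One combination of user-selectable settings (hypothetical names)."""
    thorny: bool        # thorny vs. non-thorny
    peatmer: bool       # peatmer vs. simple k-mer
    k: int              # k-mer length
    max_adjacency: int  # assumed encoding: 1 = strictly adjacent, 2 or 3 = extended


def parameter_grid(k_values, adjacency_values=(1, 2, 3)):
    """Return every combination of the four parameters as a list."""
    return [
        SyntenyParams(thorny, peatmer, k, adjacency)
        for thorny, peatmer, k, adjacency in product(
            (True, False), (True, False), k_values, adjacency_values
        )
    ]


# Illustrative k values only; the real candidates are still to be decided.
grid = parameter_grid(k_values=(2, 3, 4))
print(len(grid))
```

Each entry in `grid` would then be run once against the standard, so that the default can be chosen from measured results.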
As with most scientific algorithms, the defaults should be defensible choices obtained by comparison against a standard. We expect a tradeoff between completeness (the fraction of genes in a match) and collinearity (the quantity optimized by existing aligners such as DAGchainer). We will document how large this tradeoff is, justify the value chosen for each default, and favor collinearity where possible.
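
Completeness, as defined above, is a simple ratio and can be computed per parameter setting. The sketch below assumes a hypothetical input format in which each match is a list of gene identifiers; the function name and the example gene names are illustrative only. Collinearity is not computed here, because its exact measure has not been fixed in these notes.

```python
def completeness(all_genes, blocks):
    """Fraction of genes that appear in at least one match.

    all_genes: iterable of gene identifiers in the data set.
    blocks:    iterable of matches, each an iterable of gene identifiers
               (assumed format, for illustration only).
    """
    genes = set(all_genes)
    if not genes:
        return 0.0
    matched = set()
    for block in blocks:
        matched.update(block)
    # Ignore identifiers that are not part of the gene set being scored.
    return len(matched & genes) / len(genes)


# Made-up example: three of five genes fall in some match.
example_genes = ["g1", "g2", "g3", "g4", "g5"]
example_blocks = [["g1", "g2"], ["g2", "g4"]]
example_completeness = completeness(example_genes, example_blocks)
```

Running this once per parameter combination, alongside a collinearity score from the chosen standard, would give the pairs of values needed to plot the tradeoff.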
The proposed standard for matching is DAGchainer rather than minimap2, unless there is a reason to choose otherwise.
The proposed data set is glycine7 rather than glycine33. If a second data set is added for comparison, it should preferably include a genome with a large number of small scaffolds, where the differences between parameter choices are likely to be sharper.