This PR adds a new parameter to anvi-get-sequences-for-hmm-hits, called --ignore-genes-longer-than. Here is the help menu entry for this parameter:
(...)
--ignore-genes-longer-than MAX_LENGTH
In some cases the gene calling step can
identify open reading frames that span across
extremely long stretches of genomes. Such
mistakes can lead to downstream issues,
especially when the concatenate flag is used,
including failure to align genes. This flag
allows anvi'o to ignore extremely long gene
calls to avoid unintended issues (e.g., during
phylogenomic analyses). If you use this flag,
please carefully examine the output messages
from the program to see which genes are
removed from the analysis. Please note that
the length parameter considers the nucleotide
length of the open reading frame, even if you
asked for amino acid sequences to be returned.
Setting this parameter to small values, such
as less than 10000 nucleotides, may lead to the
removal of genuine genes, so please use it
carefully.
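
To illustrate the behavior the help text describes, here is a minimal Python sketch of a length filter. It is not the code in this PR: the function name `split_by_length`, the `start`/`stop` dictionary layout, and the example gene identifiers are all hypothetical, chosen only to show that the cutoff is applied to nucleotide length and that ignored genes should be reported back to the user.

```python
def split_by_length(gene_calls, max_length):
    """Partition gene calls into kept and ignored sets by nucleotide length.

    `gene_calls` maps a gene identifier to a dict with hypothetical
    `start` and `stop` nucleotide positions (an assumed layout, not
    necessarily how anvi'o stores gene calls).
    """
    kept, ignored = {}, {}
    for gene_id, call in gene_calls.items():
        # Length is measured in nucleotides, regardless of whether
        # amino acid sequences are requested downstream.
        length = call["stop"] - call["start"]
        if length > max_length:
            ignored[gene_id] = length  # keep the length for reporting
        else:
            kept[gene_id] = call
    return kept, ignored


# Made-up gene calls for illustration only.
gene_calls = {
    "gene_1": {"start": 0, "stop": 1200},
    "gene_2": {"start": 5000, "stop": 47000},
    "gene_3": {"start": 300, "stop": 10300},
}

kept, ignored = split_by_length(gene_calls, max_length=10000)

for gene_id, length in ignored.items():
    print(f"Ignoring {gene_id}: {length} nt exceeds the cutoff")
```

In this sketch a gene whose length equals the cutoff is kept, since the parameter name suggests only genes strictly longer than the threshold are ignored; that boundary behavior is an assumption.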
Closes #2200.