Currently, when crass is run on metagenomic samples, a large number of false-positive groups are created. Some of these groups appear to be SINEs, for example:
GCAACTGGGGCAACTGGGGCAACTGGGGCAACTGGGGCAACTGG
which has an obvious internal repeating structure and is definitely not a CRISPR direct repeat. Other groups have few reads and a low number of nodes in the spacer graph. It should be possible to filter out these groups by removing NodeManagers and DR groups with low numbers of reads and nodes.
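A rough sketch of what such a filter could look like, written in Python for brevity rather than as crass code. `DRGroup`, `keep_group`, and the threshold values `min_reads=10` and `min_nodes=3` are all hypothetical placeholders, not existing crass structures or tuned defaults. The periodicity check is an extra idea suggested by the example above, on top of the read and node counts.

```python
from dataclasses import dataclass


@dataclass
class DRGroup:
    """Hypothetical summary of one DR group; not an actual crass structure."""
    repeat: str      # candidate direct repeat sequence
    read_count: int  # number of reads assigned to the group
    node_count: int  # number of nodes in the group's spacer graph


def smallest_period(seq: str) -> int:
    """Return the smallest p such that seq[i] == seq[i + p] for every valid i."""
    for p in range(1, len(seq)):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return len(seq)


def looks_like_tandem_repeat(seq: str, min_copies: int = 3) -> bool:
    """True if seq consists of at least min_copies copies of a shorter unit."""
    return len(seq) >= min_copies * smallest_period(seq)


def keep_group(group: DRGroup, min_reads: int = 10, min_nodes: int = 3) -> bool:
    """Drop groups with too few reads or nodes, or an internally repeating DR."""
    if group.read_count < min_reads or group.node_count < min_nodes:
        return False
    return not looks_like_tandem_repeat(group.repeat)
```

For the sequence above, `smallest_period` returns 9 (the unit `GCAACTGGG`), so the group would be rejected even if it had many reads. Suitable thresholds would have to be chosen by testing on real metagenomic data.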