cjfields closed 3 years ago
The newest version includes non-chimeric reads that are soft-clipped and also discordant (key features of reads around large insertion sites). We likely won't include split reads, since these are not typically hallmarks of unique insertions but instead indicate large-scale rearrangements.
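To make the read selection above concrete, here is a minimal sketch of such a filter over plain SAM text lines. It is not the workflow's actual code. The function names (`soft_clipped_bases`, `is_candidate`) and the `min_clip` parameter are hypothetical, and the definitions are assumptions: "chimeric" is taken to mean the supplementary flag or an `SA:Z:` tag is present, and "discordant" is taken to mean a paired read with both mates mapped but without the proper-pair flag.

```python
import re

# SAM flag bits, as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SUPPLEMENTARY = 0x800

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar):
    """Return the total number of soft-clipped (S) bases in a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op == "S")


def is_candidate(sam_line, min_clip=1):
    """Return True for a non-chimeric, soft-clipped, discordant read.

    Assumptions (not taken from the workflow itself):
      * chimeric   = supplementary flag set, or an SA:Z: tag present
      * discordant = paired, both mates mapped, proper-pair flag not set
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    cigar = fields[5]
    tags = fields[11:]  # optional fields follow the 11 mandatory columns

    chimeric = bool(flag & SUPPLEMENTARY) or any(
        tag.startswith("SA:Z:") for tag in tags
    )
    both_mapped = bool(flag & PAIRED) and not flag & (UNMAPPED | MATE_UNMAPPED)
    discordant = both_mapped and not flag & PROPER_PAIR

    return (not chimeric) and discordant and soft_clipped_bases(cigar) >= min_clip
```

In practice `min_clip` would probably be raised above 1 so that a couple of clipped bases at a read end do not qualify a read on their own; the right threshold depends on the data.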
We are seeing very few contigs coming through the current workflow, which I believe can be attributed to two key factors:
I am performing a test realignment of two data sets against the GRCh38_noalts reference (one of the prebuilt assemblies available from the bowtie2 site), then running a side-by-side comparison. As a note: both versions seemed to ignore trimming, but here we will include it to make sure there are no residual adapters in the assembly.