Closed qiuyixmm closed 5 years ago
You can, but SMRT-SV is likely going to miss some really important regions where chimpanzee and human have diverged (e.g. large segmental duplications). You should still get a callset with useful variation in it.
It might be useful to run a whole-genome assembly. Polish with Arrow, then polish with Pilon if you have short reads. Align the contigs back to the reference and call SVs with PrintGaps.py (it's in the SMRT-SV repository). You'll have to handle duplicate calls where two contigs overlap. This method is likely to give you a fuller set of SVs in regions where the species have diverged significantly. You can mine the SMRT-SV code for tips on running PrintGaps. I would use minimap2 for the contig alignments (you'll have to tune some of the parameters, though).
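For the duplicate calls where two contigs overlap, here is a minimal sketch of one way to collapse them. It is not part of SMRT-SV or PrintGaps.py: the `SVCall` record, the `dedup_calls` function, and the `max_dist` / `min_size_ratio` thresholds are all hypothetical, and it assumes you have already parsed your calls into chromosome, position, type, and length. Two calls are treated as the same event if they have the same type, start within `max_dist` bp of each other, and have similar lengths.

```python
from typing import List, NamedTuple


class SVCall(NamedTuple):
    """Hypothetical record for one SV call (not the PrintGaps.py output format)."""
    chrom: str
    start: int    # reference start position
    end: int      # reference end position
    svtype: str   # e.g. "INS" or "DEL"
    svlen: int    # SV length in bp
    contig: str   # contig the call came from


def is_duplicate(a: SVCall, b: SVCall, max_dist: int, min_size_ratio: float) -> bool:
    """Same chromosome and type, nearby start, and similar length."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.start - b.start) > max_dist:
        return False
    small, large = sorted((a.svlen, b.svlen))
    return large > 0 and small / large >= min_size_ratio


def dedup_calls(calls: List[SVCall], max_dist: int = 100,
                min_size_ratio: float = 0.8) -> List[SVCall]:
    """Keep the first call of each group of near-identical calls.

    The default thresholds are assumptions; tune them for your data.
    """
    kept: List[SVCall] = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.start, c.svtype, c.contig)):
        duplicate = False
        # kept is sorted by (chrom, start), so scan backwards and stop early
        for prev in reversed(kept):
            if prev.chrom != call.chrom or call.start - prev.start > max_dist:
                break
            if is_duplicate(call, prev, max_dist, min_size_ratio):
                duplicate = True
                break
        if not duplicate:
            kept.append(call)
    return kept


calls = [
    SVCall("chr1", 1000, 1001, "INS", 500, "ctgA"),
    SVCall("chr1", 1020, 1021, "INS", 480, "ctgB"),  # same insertion seen on a second contig
    SVCall("chr1", 1020, 1500, "DEL", 480, "ctgB"),
    SVCall("chr2", 1000, 1001, "INS", 500, "ctgA"),
]
unique_calls = dedup_calls(calls)
```

This keeps whichever call sorts first; you may prefer to keep the call from the longer or better-aligned contig instead.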
Find me on https://eichlerlab.gs.washington.edu/curmem.html and message me directly if you need additional help with this.
For example, can I use chimpanzee long reads to call SVs against the human genome using smrtsv2? If so, would it affect the calling results? Thanks!