GMOD / jbrowse-components

Source code for JBrowse 2, a modern React-based genome browser
https://jbrowse.org/jb2
Apache License 2.0

Enhancing JBrowse 2's Linear Synteny View with CIGAR-based Grid Lines for Aligned Regions #4300

Open lybCNU opened 4 months ago

lybCNU commented 4 months ago

Is your feature request related to a problem? Please describe.

I've encountered a problem when using JBrowse 2's linear synteny view to inspect synteny results generated by minimap2. With long syntenic regions, it becomes hard to keep track of detail and orientation when focusing on specific genes. This makes it difficult to accurately align and compare specific regions across genomes, especially when looking for minor structural variations or similarities between them.

Describe the solution you'd like

I would like JBrowse 2 to introduce a feature for its linear synteny view similar to the "grid lines for aligned regions" found in the traditional GBrowse_syn. Specifically, I would like the linear synteny view to read the CIGAR information from PAF files generated by minimap2 and use it to draw grid lines for aligned regions. This would let users clearly see the boundaries of each aligned region, making it easy to focus on specific genes or areas even within extensive syntenic regions.
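To make the request concrete, here is a minimal TypeScript sketch of how grid-line boundaries could be derived from a CIGAR string. It is not JBrowse 2's implementation; the `alignedBlocks` function and the `AlignedBlock` type are hypothetical names chosen for illustration. It assumes a forward-strand alignment, 0-based half-open coordinates as in PAF, and only the operators `M`, `=`, `X`, `I`, `D`, and `N`.

```typescript
// One gap-free aligned block, in 0-based half-open coordinates (hypothetical type).
interface AlignedBlock {
  targetStart: number
  targetEnd: number
  queryStart: number
  queryEnd: number
}

// Walk a CIGAR string and return the gap-free aligned blocks it describes.
// Assumption: forward-strand ('+') alignment. For a '-' alignment the query
// coordinates would have to be walked from the query end instead.
function alignedBlocks(
  cigar: string,
  targetStart: number,
  queryStart: number,
): AlignedBlock[] {
  const blocks: AlignedBlock[] = []
  const opRegex = /(\d+)([MIDN=X])/g
  let tPos = targetStart
  let qPos = queryStart
  let current: AlignedBlock | undefined
  let m: RegExpExecArray | null
  while ((m = opRegex.exec(cigar)) !== null) {
    const len = Number(m[1])
    const op = m[2] ?? ''
    if (op === 'M' || op === '=' || op === 'X') {
      // Match/mismatch consumes both sequences; extend or open a block.
      if (!current) {
        current = {
          targetStart: tPos,
          targetEnd: tPos,
          queryStart: qPos,
          queryEnd: qPos,
        }
        blocks.push(current)
      }
      tPos += len
      qPos += len
      current.targetEnd = tPos
      current.queryEnd = qPos
    } else {
      // A gap closes the current block.
      if (op === 'I') {
        qPos += len // insertion: consumes query only
      } else {
        tPos += len // deletion or skipped region (D, N): consumes target only
      }
      current = undefined
    }
  }
  return blocks
}

// Example: three aligned blocks separated by a 2 bp insertion and a 3 bp deletion.
const blocks = alignedBlocks('10M2I5M3D4M', 100, 200)
console.log(blocks)
```

Each block's start and end on the two genomes would be the endpoints of one pair of grid lines.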

Describe alternatives you've considered

Aside from directly supporting CIGAR information in the linear synteny view, I've also considered preprocessing the PAF file with external tools to mark the boundaries of aligned regions before importing it into JBrowse. However, this approach is indirect and adds complexity for the user. Implementing this feature directly within JBrowse would be more convenient and intuitive.

Additional context

Minimap2 is a widely used sequence alignment tool, and its output PAF files contain valuable alignment information, including CIGAR strings. The CIGAR information is crucial for understanding the alignment relationships between sequences, helping researchers accurately identify insertions, deletions, and substitutions. To facilitate the parsing of CIGAR strings within JBrowse 2, I suggest considering the Bio::Cigar Perl module, which is designed to assist in CIGAR string analysis. This could streamline development of the proposed feature by leveraging an existing library. Enhancing JBrowse 2's linear synteny view with this capability would significantly increase its utility for genomic alignment analysis.

cmdcolin commented 4 months ago

JBrowse 2 does actually support CIGAR strings. If you generate a PAF file with "minimap2 -c", it will include the CIGAR string, and JBrowse 2 will render the insertions and deletions.
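For readers checking their own files: with "-c", minimap2 writes the CIGAR as an optional `cg:Z:` tag after the 12 mandatory PAF columns. The TypeScript snippet below is a minimal sketch for confirming that a PAF line carries it. `extractCigar` is a hypothetical helper name, not a JBrowse 2 function, and the sample line is made up for illustration.

```typescript
// Return the CIGAR string from a PAF line's cg:Z: tag, or undefined if absent.
function extractCigar(pafLine: string): string | undefined {
  const fields = pafLine.trimEnd().split('\t')
  // Columns 1-12 are mandatory; optional TAG:TYPE:VALUE fields start at column 13.
  for (const field of fields.slice(12)) {
    if (field.startsWith('cg:Z:')) {
      return field.slice('cg:Z:'.length)
    }
  }
  return undefined
}

// Made-up PAF record: query name, length, start, end, strand, target name,
// length, start, end, matching bases, alignment block length, MAPQ, then tags.
const sampleLine = [
  'q1', '1000', '0', '17', '+',
  't1', '2000', '100', '115',
  '15', '17', '60',
  'tp:A:P', 'cg:Z:10M2I5M',
].join('\t')

console.log(extractCigar(sampleLine))
```

If this returns `undefined` for every line, the file was most likely produced without "-c".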

Example: [image]

Share link: https://jbrowse.org/code/jb2/main/?config=test_data%2Fyeast_synteny%2Fconfig.json&session=share-COunWGvXpG&password=fpdn9

There are still challenges, though. For example

lybCNU commented 4 months ago

Thank you very much for your detailed and informative response. I'm pleased to learn that JBrowse 2 already supports the rendering of insertions and deletions from PAF files generated with minimap2 using the "-c" option. The reference lines for these insertions and deletions indeed serve a similar purpose to the 'accessory grid functions', providing significant help in understanding the nuances of alignments. This functionality is a great step forward, and I look forward to any further enhancements in this area.