JY-Zhou / IRIS

A method for predicting ensembles of In vivo RNA secondary structures using PARIS data
http://iris.zhanglab.net/
GNU General Public License v3.0

Should the CRSSANT-style double-mapped reads be used? #1

SchwarzMarek closed this issue 2 years ago

SchwarzMarek commented 2 years ago

Hello, I'm interested in using IRIS to predict secondary structures.

First, I'm unsure about the best procedure for aligning reads to the RNA sequence (usually, reads are mapped to the whole genome). While searching for the best approach, I found CRSSANT, which provides optimized settings for the STAR aligner (https://github.com/zhipenglu/CRSSANT). What I'm unsure about is how to treat the rearranged reads from step 3 (the output of softreverse.py). Can such reads be incorporated into the IRIS analysis, or are they CRSSANT-specific and not suitable as input to IRIS?

Thank you

JY-Zhou commented 2 years ago

Thank you for your interest. :)

I think those reads can be used directly in IRIS. IRIS simply takes as input the mapped PARIS reads as two distant segments on the RNA sequence (IRIS can extract them from the BAM file), which can be represented mathematically as

((LeftArm_start, LeftArm_end), (RightArm_start, RightArm_end))
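As a rough illustration of that representation (not IRIS's own extraction code), the sketch below derives the two arms from an alignment's start position and CIGAR string. It assumes 1-based inclusive coordinates, that the two arms are separated by a single `N` gap in the CIGAR string, and that reads with any other number of blocks are skipped; the function names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

Segment = Tuple[int, int]

# One CIGAR operation: a length followed by an operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos: int, cigar: str) -> List[Segment]:
    """Split an alignment into reference blocks separated by N gaps.

    `pos` is the 1-based leftmost reference position (as in a SAM record).
    Returned blocks are 1-based and inclusive.
    """
    blocks: List[Segment] = []
    start = pos
    cursor = pos
    for length, op in _CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=XD":
            # These operations consume reference bases within a block.
            cursor += n
        elif op == "N":
            # A skipped region closes the current block and opens a new one.
            if cursor > start:
                blocks.append((start, cursor - 1))
            cursor += n
            start = cursor
        # I, S, H and P do not consume reference bases.
    if cursor > start:
        blocks.append((start, cursor - 1))
    return blocks


def paris_arms(pos: int, cigar: str) -> Optional[Tuple[Segment, Segment]]:
    """Return ((left_start, left_end), (right_start, right_end)), or None
    when the alignment does not consist of exactly two blocks (assumption)."""
    blocks = aligned_blocks(pos, cigar)
    if len(blocks) != 2:
        return None
    return blocks[0], blocks[1]


if __name__ == "__main__":
    print(paris_arms(101, "20M500N25M"))
```

With a BAM reader, `pos` and `cigar` would come from each record; rearranged or chimeric reads that are stored as two separate records would need to be paired up first, which this sketch does not attempt.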

For convenience, I've just uploaded our scripts for mapping PARIS reads with STAR, including some tweaks to the arguments. The major concern is that STAR was originally designed to align spliced reads, and its default arguments encode many splicing-related biases, which should be eliminated when dealing with PARIS reads. We use STAR in two steps: 1) indexing your reference RNA sequences of interest (lib/STAR/sample_STAR_indexing.sh), and 2) mapping all high-quality PARIS reads to that index (lib/STAR/sample_STAR_mapping.sh).
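For readers who want to see the shape of the two steps before opening the scripts, here is a minimal sketch that only builds the two STAR command lines. It is not a copy of the uploaded scripts: the file paths, the function names, and the particular choice of flags and values (zeroing the non-canonical junction penalties) are assumptions made for illustration, and the real argument tweaks should be taken from lib/STAR/. The index-size formula follows the STAR manual's advice for small references.

```python
import math
from typing import List


def sa_index_nbases(reference_length: int) -> int:
    """Scale --genomeSAindexNbases for a small reference, following the
    STAR manual's rule min(14, log2(GenomeLength)/2 - 1)."""
    return max(1, min(14, round(math.log2(reference_length) / 2 - 1)))


def star_index_cmd(fasta: str, index_dir: str, reference_length: int) -> List[str]:
    """Step 1: index the reference RNA sequences of interest."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", index_dir,
        "--genomeFastaFiles", fasta,
        "--genomeSAindexNbases", str(sa_index_nbases(reference_length)),
    ]


def star_map_cmd(fastq: str, index_dir: str, out_prefix: str) -> List[str]:
    """Step 2: map the reads, with splice-motif penalties switched off.

    The penalty values below are illustrative assumptions, not the
    settings used in the IRIS scripts.
    """
    return [
        "STAR",
        "--runMode", "alignReads",
        "--genomeDir", index_dir,
        "--readFilesIn", fastq,
        "--outFileNamePrefix", out_prefix,
        "--outSAMtype", "BAM", "Unsorted",
        # STAR penalizes gaps without canonical splice motifs by default;
        # gaps between PARIS arms are not introns, so remove the penalties.
        "--scoreGapNoncan", "0",
        "--scoreGapGCAG", "0",
        "--scoreGapATAC", "0",
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the commands instead of running them.
    print(" ".join(star_index_cmd("rna.fa", "star_index", 5000)))
    print(" ".join(star_map_cmd("paris_reads.fq", "star_index", "paris_")))
```

The commands are returned as argument lists so they can be passed to `subprocess.run` without shell quoting issues once the paths and parameters have been checked against the actual scripts.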

Hope it helps!

SchwarzMarek commented 2 years ago

Thank you for the answer and the examples provided.