Should the CRSSANT-style double-mapped reads be used?

SchwarzMarek commented 2 years ago

Hello, I'm interested in using IRIS to predict secondary structures.

First, I'm unsure what the best procedure is for aligning reads to the RNA sequence (usually, reads are mapped to the whole genome). While searching for the optimal approach, I found CRSSANT, which comes with optimized settings for the STAR aligner (https://github.com/zhipenglu/CRSSANT). What I'm unsure about is how to treat the rearranged reads from step 3 (the result of softreverse.py). Can such reads be incorporated into the IRIS analysis, or are they CRSSANT-specific, so that such input should not be used with IRIS?

Thank you

JY-Zhou commented 2 years ago

Thank you for your question. :)

I guess those reads can be used directly in IRIS. IRIS simply takes the mapped PARIS reads as input, each as two distant segments on the RNA sequence (IRIS can extract them from the BAM file), which can be represented mathematically as

((LeftArm_start, LeftArm_end), (RightArm_start, RightArm_end))
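As a rough illustration of that representation (this is not IRIS's actual code), the sketch below derives the two arms from a single gapped alignment. It assumes the read was reported as one alignment whose CIGAR string contains exactly one `N` gap; the function name `arms_from_cigar` and the choice of 1-based inclusive coordinates are hypothetical.

```python
import re

# CIGAR operations that consume reference positions (per the SAM format).
_REF_CONSUMING = set("MDN=X")
_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def arms_from_cigar(pos, cigar):
    """Return ((left_start, left_end), (right_start, right_end)).

    pos is the 1-based leftmost reference position (the SAM POS field).
    Intervals are 1-based and inclusive. Assumption: exactly one N gap
    separates the two arms; anything else raises ValueError.
    """
    segments = []
    seg_start = None
    ref = pos
    for length, op in _CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            # The gap closes the current arm and skips ahead on the reference.
            if seg_start is not None:
                segments.append((seg_start, ref - 1))
                seg_start = None
            ref += length
        elif op in _REF_CONSUMING:
            # M, D, = and X extend the current arm along the reference.
            if seg_start is None:
                seg_start = ref
            ref += length
        # I, S, H and P do not consume reference positions.
    if seg_start is not None:
        segments.append((seg_start, ref - 1))
    if len(segments) != 2:
        raise ValueError(f"expected 2 arms, found {len(segments)}: {cigar}")
    return segments[0], segments[1]


left_arm, right_arm = arms_from_cigar(10, "3S20M100N25M2S")
print(left_arm, right_arm)
```

Reads whose arms were reported in a different form (for example as separate chimeric records) would need their own handling before they fit this tuple.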

For convenience, I've just uploaded our scripts for mapping PARIS reads with STAR, including some tweaks to the arguments. The major concern is that STAR was originally designed to align spliced reads, and its default arguments build in a lot of splicing-specific biases, which we should eliminate when dealing with PARIS reads. We use STAR in two steps: 1) index your reference RNA sequences of interest (lib/STAR/sample_STAR_indexing.sh), and 2) map all high-quality PARIS reads to them (lib/STAR/sample_STAR_mapping.sh).

Hope it helps!
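To make the two steps concrete, here is a minimal sketch that only builds the two STAR command lines. It does not reproduce the contents of the scripts in lib/STAR; the function names, file paths, and the particular selection and values of flags are assumptions chosen to illustrate switching off splice-motif penalties.

```python
def star_index_cmd(fasta, index_dir, sa_index_nbases):
    """Step 1: build the command that indexes the reference RNA sequences."""
    return [
        "STAR", "--runMode", "genomeGenerate",
        "--genomeDir", index_dir,
        "--genomeFastaFiles", fasta,
        # The STAR manual advises lowering this for small references;
        # the right value depends on the total reference length.
        "--genomeSAindexNbases", str(sa_index_nbases),
    ]


def star_map_cmd(index_dir, fastq, out_prefix):
    """Step 2: build the command that maps reads to the indexed reference."""
    return [
        "STAR", "--runMode", "alignReads",
        "--genomeDir", index_dir,
        "--readFilesIn", fastq,
        "--outFileNamePrefix", out_prefix,
        "--outSAMtype", "BAM", "SortedByCoordinate",
        # Assumed settings: treat any gap as a junction and remove the
        # penalties STAR applies according to the splice motif at the gap.
        "--alignIntronMin", "1",
        "--scoreGapNoncan", "0",
        "--scoreGapGCAG", "0",
        "--scoreGapATAC", "0",
    ]


# Hypothetical paths, for illustration only.
index_cmd = star_index_cmd("rna_of_interest.fa", "star_index", 4)
map_cmd = star_map_cmd("star_index", "paris_reads.fastq", "paris_")
print(" ".join(index_cmd))
print(" ".join(map_cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once STAR is installed. The uploaded scripts remain the reference for the arguments actually used.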

SchwarzMarek commented 2 years ago

Thank you for the answer and the examples provided.
