ksahlin / strobealign

Aligns short reads using dynamic seed size with strobemers
MIT License

Using Strobealign for aligning RNAseq to recodonized gene #387

Open Rohit-Satyam opened 5 months ago

Rohit-Satyam commented 5 months ago

Hi

I have RNA-Seq data for a gene knockout produced by recodonization (see here for meaning because this is new to me). The gene has 4 exons, but because it was recodonized, it is hard to say which exon falls where. I need to make coverage maps to see whether the knockout worked. Since a major portion of the gene was recodonized (the codons were changed), I thought of mapping the RNA-Seq data directly to the recodonized sequence, because altering the genome would be tricky for one gene. I came across your paper and thought of using your aligner for this. Below is a comparison of your aligner with minimap2 on this Illumina PE data with 150 bp read length.

image

I also mapped the reads to the reference genome, where the codons of this gene are not modified, and viewed the coverage for this gene in IGV. It looks as follows:

image

I was told that this isn't the right approach: the recodonized gene produces RNA with a different nucleotide sequence (although it translates to the same protein sequence), so those reads will fail to map to the reference genomic region.
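
To make that point concrete, here is a minimal toy sketch in Python. Everything in it is made up for illustration: the two 18 nt sequences, the small codon table subset, and the use of plain 8-mers as a stand-in for aligner seeds (real aligners, including strobealign, use different seeding schemes). It only shows that synonymous codon changes can keep the protein identical while leaving no exact seed matches between the two nucleotide sequences.

```python
# Hypothetical toy example: sequences and codon choices are invented,
# and plain k-mers are only a rough stand-in for an aligner's seeds.

# Subset of the standard genetic code, just enough for this example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
    "GAA": "E", "GAG": "E",
    "TCT": "S", "AGC": "S",
}


def translate(seq):
    """Translate a coding sequence whose length is a multiple of 3."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


original = "ATGGCTAAACTGGAATCT"  # codons: ATG GCT AAA CTG GAA TCT
recoded = "ATGGCCAAGTTAGAGAGC"   # synonymous codons: ATG GCC AAG TTA GAG AGC

# Both sequences encode the same peptide.
same_protein = translate(original) == translate(recoded)

# Count positions where the nucleotides differ.
mismatches = sum(a != b for a, b in zip(original, recoded))

# Exact 8-mers shared between the two sequences.
shared = kmers(original, 8) & kmers(recoded, 8)

print("same protein:", same_protein)
print("mismatching positions:", mismatches, "of", len(original))
print("shared 8-mers:", len(shared))
```

In this toy case the protein is unchanged, but the mismatches are spread densely enough that no 8-mer is shared, which is the kind of situation where reads from the recodonized transcript would not be expected to map to the unmodified reference.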

marcelm commented 5 months ago

Hi, I’m probably missing some context since this recodonization is also new to me, but can you clarify what your question is?

It seems that strobealign, in contrast to minimap2, doesn't give you coverage of the first two exons. Is that the observation you wanted to share? Strobealign currently has some limitations when mapping very short sequences, so maybe that is the issue here.