YumaIshigami / irscalc

This repository contains analysis scripts for Ishigami et al. (2021) A single m6A modification in U6 snRNA diversifies exon sequence at the 5’ splice site. (Nature Communications)
MIT License

Alternate splicing analysis #3

Open VishnuPriyaKrishnan opened 2 years ago

VishnuPriyaKrishnan commented 2 years ago

Hi,

We are working on the downstream analysis of a knockdown of a gene involved in splicing that is known to facilitate alternative splicing events in pombe. We have run the script and carried out the intron retention analysis as described in your scripts. We are a little confused about the analysis for the alternative 5' SS and alternative 3' SS events.

The output file contains z-scores for IRS, alt5ss, and alt3ss. Should we handle them the same way as for IRS, or should a different pipeline be followed? In addition, we noted that most of the sequences in "seqalt5" and "seqalt3" were "NA". Going through the scripts, we saw specific "if" conditions for generating "seqalt5" and "seqalt3" (irsseq.py, from line 224). It would be great if you could elaborate on this.

Thank you.

YumaIshigami commented 2 years ago

In the originally uploaded script, the lines that obtain z-scores and sequences for cryptic splice sites are commented out, because it was pointed out during the review process that we should concentrate on intron retention. (See the paragraph on page 15 from Reviewer #1 starting with 'As one final note then,' in the Peer Review File of the Nat. Comm. 2021 paper.) I assume you removed the #s to restore the commented-out lines between lines 165 and 252, and also added to lines 76-79 the strings that are added to the keys of the dictionary blist[]. Are those the modifications you made to the script?

seqalt5 and seqalt3 are written to the result text file only when the cryptic splice site is supported by 10 or more reads; otherwise, an empty string is written. Also, only the most frequent alternative splice site is written. Were the "NA"s present in the raw text file when opened in a text editor, rather than in a spreadsheet editor?
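
The rule above can be illustrated with a minimal sketch. This is not the code in irsseq.py; the function name `pick_alt_site_seq`, the argument names, and the dictionary layout are hypothetical and chosen only for illustration. The only details taken from the description are the 10-read threshold, the empty string written below that threshold, and the fact that only the most frequent alternative site is reported.

```python
MIN_READS = 10  # threshold from the description above


def pick_alt_site_seq(read_counts, site_seqs, min_reads=MIN_READS):
    """Return the sequence of the most frequent alternative splice site.

    read_counts: dict mapping a site position to its supporting read count
                 (hypothetical layout, not the script's own data structure).
    site_seqs:   dict mapping a site position to the sequence at that site.
    Returns "" when no site exists or the best site has too few reads.
    """
    if not read_counts:
        return ""
    # Only the single most frequent site is considered; ties go to the
    # first site encountered, which is an assumption of this sketch.
    best_pos, best_count = max(read_counts.items(), key=lambda kv: kv[1])
    if best_count < min_reads:
        return ""
    return site_seqs.get(best_pos, "")


# Three candidate sites: only the one with 25 reads is reported.
counts = {100: 4, 105: 25, 110: 12}
seqs = {100: "GTATGT", 105: "GTAAGT", 110: "GTGAGT"}
best = pick_alt_site_seq(counts, seqs)

# No candidate reaches 10 reads, so an empty string is returned.
weak = pick_alt_site_seq({100: 4, 105: 9}, seqs)
```

Under this reading, an intron with several alternative sites still yields at most one sequence per column, and introns whose best alternative site has fewer than 10 reads yield an empty field.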

VishnuPriyaKrishnan commented 2 years ago

[Attached image: Screenshot_2022-10-25-02-07-53-881_com.microsoft.office.powerpoint]

Hello, Firstly, thank you for the reply. I had gone through the peer review file and understood the explanation of why the focus was on IR only. But we are interested in alternative splicing events because we work on Cryptococcus neoformans, where 60% of genes have alternatively spliced transcripts. Hence, we ran the scripts after removing the #s, got the output, and are in the process of analysing it.

As you mentioned, the output file contained seqalt5 and seqalt3 only for a subset of introns. So, if I'm right, even if there are three alternative sites, only the most preferred one will be in the output? And yes, it was "NA" only in the text file, while in Excel the cell was empty.

I have one more question regarding the scatterplot in which the irsdiff of treatment and wild type are plotted. In your case, it is very evident that the introns are biased towards the treatment side. But in our case, we find a plot where Q1 and Q3 are completely separated and lie like mirror images of each other in the graph (I have attached an outline of how our results looked: between the green and red lines is "between", below the green line is less than Q3 (red), and above the green line is greater than Q1 (blue)). In this case, it is evident that even in the wild type there are events which are not spliced well (which happens in Cryptococcus). But for a few of them the counts are 0 for eijr/iejr and high for csjr. Is it still best to compare the greater-than-Q1 and less-than-Q3 groups?

Finally, why didn't the paper mention anything about the branch point and whether its consensus is affected or unaffected upon METTL16 deletion?

Thanks, -Vishnu

