Closed. ksmcu closed this issue 10 months ago.
Hi @ksmcu ,
Thanks for trying PSI-Sigma. `Number of events = 0` suggests that the SJ.out.tab files might be empty or in the wrong format (possibly because of the chromosome prefix `chr`). Could you use `less` to check whether the SJ.out.tab files have content? Also, are the SJ.out.tab files using the right prefix? Does Homo_sapiens.GRCh38.107.sorted.gtf use `chr` as the chromosome prefix? If you can show a few lines of the SJ.out.tab file, it will be clearer to me.
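The prefix check described above can also be scripted. Below is a minimal Python sketch, assuming both files are tab-delimited with the chromosome name in the first column and that comment lines start with `#`. The function names and the toy data are made up for illustration and are not part of PSI-Sigma.

```python
import io

def seqnames(handle):
    """Collect the distinct column-1 values, skipping comments and blank lines."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.rstrip("\n").split("\t", 1)[0])
    return names

def prefix_mismatch(gtf_handle, sj_handle, prefix="chr"):
    """Return True when exactly one of the two files uses the prefix."""
    gtf_has = any(name.startswith(prefix) for name in seqnames(gtf_handle))
    sj_has = any(name.startswith(prefix) for name in seqnames(sj_handle))
    return gtf_has != sj_has

# Toy stand-ins for a GTF and an SJ.out.tab file (made-up content).
gtf = io.StringIO('#!genome-build GRCh38\n1\tsrc\tgene\t100\t900\t.\t+\t.\tgene_id "X";\n')
sj = io.StringIO("chr1\t200\t400\t1\t1\t1\t5\t0\t30\n")
print(prefix_mismatch(gtf, sj))
```

With real data, the `io.StringIO` objects would be replaced by `open(...)` on the actual files. An empty SJ.out.tab file yields no names at all, so it should be checked separately.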
Best, Woody
Hi Woody,
Thanks for your quick reply. I used the same GTF file to run both short-read RNA-seq and long-read RNA-seq, and it worked fine with short-read RNA-seq. Here is a screenshot of the GTF file:
You are right. The SJ file for long-read RNA-seq has a 'chr' prefix, but the SJ file produced by STAR does not have the 'chr' prefix. Below is a screenshot of the .SJ.out.tab file:
For short-read RNA-seq, I used STAR to output the SJ file and used it as input, along with the BAM file, for psi-sigma. For the long-read data, I used the aligned file generated by minimap2 as the sole input for psi-sigma. The SJ file is generated by psi-sigma. However, my GTF file doesn't have a 'chr' prefix. Do you have any ideas on how to fix this issue?
Thank you so much!
Hi @ksmcu ,
If you can redo the alignment with minimap2 using exactly the same genomic DNA .fasta file that was used to build the STAR index, it should fix the problem. Otherwise, you can duplicate Homo_sapiens.GRCh38.107.sorted.gtf and remove `chr` by using `sed 's/^chr//' input.gtf > output.gtf`. Or, you can re-download a .gtf without `chr` from Ensembl. Any of these should work. :)
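For anyone who prefers a script to the one-liner, here is a rough Python equivalent of the `sed` substitution. It is only a sketch: the function name is hypothetical, and it simply drops a leading `chr` from each line, exactly as `s/^chr//` would, leaving `#` comment lines and unprefixed lines unchanged.

```python
import io

def strip_chr_prefix(lines, prefix="chr"):
    """Yield each line with a leading prefix removed, like sed 's/^chr//'."""
    for line in lines:
        if line.startswith(prefix):
            yield line[len(prefix):]
        else:
            yield line

# Toy input (made-up content) standing in for a prefixed annotation file.
src = io.StringIO("#!header\nchr1\tsrc\tgene\t100\t900\nchrX\tsrc\tgene\t5\t50\n")
print("".join(strip_chr_prefix(src)), end="")

# With real files, something along these lines should work:
# with open("input.gtf") as fin, open("output.gtf", "w") as fout:
#     fout.writelines(strip_chr_prefix(fin))
```

One caveat: UCSC-style files name the mitochondrial chromosome `chrM`, while Ensembl uses `MT`, so stripping the prefix alone leaves `M` and that contig may still not match.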
Best, Woody
Hi Woody,
Thank you! I re-did the alignment, and it works. I have another question: I'm using psi-sigma to calculate the psi values for each individual alternative splicing event from single-cell short-read RNA-seq and single-cell long-read RNA-seq data. I don't need the delta psi value. Right now, I'm just separating the samples and putting them into 'groupa.txt' and 'groupb.txt' files. For the individual event psi values, can I refer to the "N Values" and "T Values" columns in the 'XXX_r10_ir3.txt' file?
Best
Hi @ksmcu ,
Yes, the N values and T values show the PSI values of the groupa and groupb samples, respectively. You may want to use the `_r10_ir3.sorted.txt` file instead of `_r10_ir3.txt` because the sorted file has better annotation of the events.
Best, Woody
Hi @ksmcu ,
I forgot to mention: the values in the `N values` column follow the order of the samples in the `groupa.txt` file.
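Because the values follow the order of groupa.txt, they can be paired with sample names programmatically. The Python sketch below rests on several assumptions that should be checked against a real output file: that the table is tab-delimited with a header row, that the column is literally named `N Values`, that the per-sample values are separated by `|`, and that groupa.txt lists one sample per line. The sample names and the table row are made up.

```python
import csv
import io

def psi_by_sample(row, sample_names, column="N Values", sep="|"):
    """Map each sample name to its PSI value, relying on the shared ordering."""
    values = row[column].split(sep)
    if len(values) != len(sample_names):
        raise ValueError(f"{len(values)} values for {len(sample_names)} samples")
    paired = {}
    for name, value in zip(sample_names, values):
        try:
            paired[name] = float(value)
        except ValueError:
            paired[name] = None  # non-numeric placeholder such as "na"
    return paired

# Made-up stand-ins for groupa.txt and one row of the sorted output table.
groupa = io.StringIO("cellA\ncellB\n")
table = io.StringIO("Event Region\tN Values\tT Values\nregion1\t80.0|na\t60.0|55.5\n")

samples = [line.strip() for line in groupa if line.strip()]
for row in csv.DictReader(table, delimiter="\t"):
    print(row["Event Region"], psi_by_sample(row, samples))
```

The same function can be applied to the T values by passing `column="T Values"` together with the sample list from groupb.txt.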
Best, Woody
Hi Woody,
Thank you. I only want to obtain the PSI value for a single sample, but psi-sigma requires both groupa.txt and groupb.txt, so I copied the sample and renamed the copy to use it as the sample in groupb.txt. This way, my N values and T values will be the same. However, if I use different samples in groupa.txt and groupb.txt, the events I can detect may differ slightly from those detected when the same sample is in both groups. I'm just wondering whether the sample in groupb influences the detection of alternative splicing events in groupa.
Best Regards, R
Hi @ksmcu ,
The PSI values should not be affected by which samples are in groupa.txt or groupb.txt, because the PSI value of a sample is calculated from only the reads of that sample. If you can elaborate a bit further, maybe I can understand better.
Best, Woody
Hi @ksmcu ,
Has your issue been resolved?
Thanks, Woody
Hi Woody,
The issue was resolved. Thank you!
Hi Woody,
Thank you for such amazing tools. I was trying to use PSI-Sigma to analyze my short-read RNA-seq and long-read RNA-seq data. I was able to run it successfully with my short-read RNA-seq data, but for the long-read RNA-seq dataset, I got zero events and my .IR.out.tab file is empty. My long-read RNA-seq dataset is aligned with minimap2, and I was also able to run ./testsample. Below are my log results and my command. I would appreciate it if you could help me with this.
Best Regards, Rhonda
code: perl /mnt/data/rhonda/Apps/PSI-Sigma-2.1/dummyai.pl --gtf /mnt/data/rhonda/tras/Homo_sapiens.GRCh38.107.sorted.gtf --name PSIsigma --type 2 -nread 10
run_log.txt