FSUgenomics / SRSFseq

SRSF shape analysis framework for sequencing data
GNU General Public License v3.0

How best to use SRSFseq #1

Open TomSmithCGAT opened 6 years ago

TomSmithCGAT commented 6 years ago

Hi @Wesserg - After reading your recent publication in Computational Biology and Chemistry, I'm interested in applying SRSFseq to RNA-seq data where I'm expecting clear (but relatively small) differences between the shape profiles over genes. For example, the attached IGV screengrab shows two examples of control vs. treatment where coverage is clearly reduced in the 3' UTR of the ACTB gene. In the lower example, there is also a less prominent reduction in coverage towards the end of the largest exon. I have 4 replicates for all conditions and these patterns are consistent. Overall, I'm expecting the majority of such changes to be in 3' UTRs.

Do you think SRSFseq would be an appropriate method to identify such differences, and if so, what's the best approach to do so? From what I gather, SRSFseq is not available as a stable R package. Do you have any plans to add it to Bioconductor, for instance? In the meantime, I'll work from the code available in this repository.

fig1.pdf

Wesserg commented 6 years ago

Hi @TomSmithCGAT - Thank you for your interest and questions.

  1. SRSFseq is indeed not yet a package, but we are working on it. We are adding some DNAseq features for 3D chromatin shape analysis. Unfortunately, the full package won't be available until at least the end of this year.

  2. That said, I think SRSFseq could be helpful in your case if you are willing to edit the provided R script. I would be more than happy to help with that; it should be a simple adjustment.

SRSFseq is designed to capture differences in the coverage distribution in RNAseq data while NEGLECTING differences in coverage depth. If that is what you are looking for in the 3' UTRs (or the last exon), SRSFseq could be the way to go. In particular, judging by the picture you sent, SRSFseq should capture those differences.
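To make the "shape, not depth" idea concrete, here is a minimal Python sketch. It is not SRSFseq's algorithm: it only rescales each per-base coverage vector to sum to 1 and compares the resulting profiles with a total variation distance. The function names and the toy coverage vectors are made up for illustration.

```python
def normalize_coverage(coverage):
    """Scale a per-base coverage vector so it sums to 1, removing depth."""
    total = sum(coverage)
    if total == 0:
        raise ValueError("region has no coverage")
    return [c / total for c in coverage]


def shape_distance(cov_a, cov_b):
    """Total variation distance between two depth-normalized profiles.

    Returns a value between 0 (identical shape) and 1 (no overlap).
    This is a stand-in for illustration, not the SRSF-based distance.
    """
    if len(cov_a) != len(cov_b):
        raise ValueError("coverage vectors must cover the same positions")
    a = normalize_coverage(cov_a)
    b = normalize_coverage(cov_b)
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


# Toy per-base coverage over one region (hypothetical numbers).
control = [10, 10, 10, 10, 10, 10, 10, 10]
deeper_control = [c * 3 for c in control]      # same shape, 3x the depth
treatment = [10, 10, 10, 10, 10, 10, 5, 5]     # reduced coverage at the 3' end

print(shape_distance(control, deeper_control))  # depth-only change
print(shape_distance(control, treatment))       # shape change
```

A sample that differs only in depth gives a distance of zero after normalization, while the reduced 3' coverage gives a positive distance. That is the kind of signal described above, although SRSFseq itself works on a different representation of the coverage curves.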

  1. What you would need: RNAseq reads in sorted, indexed BAM files and a GTF file specifying the regions you are interested in. I don't know whether such a GTF file exists; we might need to create it manually by replacing the exon coordinates in a generic GTF file with UTR coordinates (or by just using the last exon coordinates).
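The "last exon" variant of that GTF could be sketched roughly as follows. This is an illustration only, not part of SRSFseq: it assumes a standard 9-column, tab-separated GTF whose exon records carry a `transcript_id` attribute, and the function name `last_exons` and the example records are hypothetical.

```python
import re
from collections import defaultdict


def last_exons(gtf_text):
    """Return the GTF exon line of the 3'-most exon for each transcript."""
    by_transcript = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue  # keep exon records only
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        by_transcript[match.group(1)].append(fields)

    selected = []
    for exons in by_transcript.values():
        strand = exons[0][6]
        if strand == "-":
            # On the minus strand the 3' end has the lowest coordinates.
            chosen = min(exons, key=lambda f: int(f[3]))
        else:
            chosen = max(exons, key=lambda f: int(f[4]))
        selected.append("\t".join(chosen))
    return selected


# Two made-up transcripts, one per strand.
example_gtf = "\n".join([
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tdemo\texon\t300\t450\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr2\tdemo\texon\t500\t600\t.\t-\t.\tgene_id "G2"; transcript_id "T2";',
    'chr2\tdemo\texon\t700\t800\t.\t-\t.\tgene_id "G2"; transcript_id "T2";',
])

for line in last_exons(example_gtf):
    print(line)
```

The strand check matters because the 3'-most exon of a minus-strand transcript is the one with the smallest coordinates. Restricting the regions to annotated 3' UTRs instead would need UTR records from the annotation, which this sketch does not handle.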