kvittingseerup / IsoformSwitchAnalyzeR

An R package to Identify, Annotate and Visualize Isoform Switches with Functional Consequences (from RNA-seq data)

Problem in importRdata #202

Closed RamonSLPS closed 1 year ago

RamonSLPS commented 1 year ago

I am trying to import 12 datasets (4 conditions) to perform alternative splicing analysis. But I am getting this error in the importRdata step:

Step 6 of 10: Batch correcting expression estimates...
Coefficients not estimable: sv2
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': 'x' is singular: singular fits are not implemented in 'rlm'

my script until this part:

quantsDir <- "/home/ramon/Documents/PIBIC/myoMYO/salmon"
gtfDir <- "/home/ramon/Documents/PIBIC/myoMYO/auxDatasets/ensembl/myoLUC.gtf"
cdnaDir <- "/home/ramon/Documents/PIBIC/myoMYO/auxDatasets/ensembl/myoLUC_cdna.fa"
rootDir <- "/home/ramon/Documents/PIBIC/myoMYO"

quants.df <- importIsoformExpression(
    parentDir=quantsDir,
    addIsofomIdAsColumn=TRUE
)

design.df <- data.frame(
    sampleID=c("SRR23991165", "SRR23991166", "SRR23991167",
               "SRR23991159", "SRR23991163", "SRR23991160",
               "SRR23991161", "SRR23991162", "SRR23991164",
               "SRR23991156", "SRR23991157", "SRR23991158"),
    condition=c("WAKE_ctrl", "WAKE_ctrl", "WAKE_ctrl",
                "WAKE_inf", "WAKE_inf", "WAKE_inf",
                "TRPD_ctrl", "TRPD_ctrl", "TRPD_ctrl",
                "TRPD_inf", "TRPD_inf", "TRPD_inf")
)

comp.df <- data.frame(
    condition_1=c("WAKE_ctrl", "TRPD_ctrl", "WAKE_inf"),
    condition_2=c("WAKE_inf", "TRPD_inf", "TRPD_inf")
)

aSwitchlist <- importRdata(
    isoformCountMatrix=quants.df$counts,
    isoformRepExpression=quants.df$abundance,
    designMatrix=design.df,
    isoformExonAnnoation=gtfDir,
    isoformNtFasta=cdnaDir,
    comparisonsToMake=comp.df,
    fixStringTieAnnotationProblem=TRUE
)

What does this error mean, and how can I fix it? Is it a problem in my code or in my datasets? Note: I've tried requantifying but still got the same error.

Thanks in advance, Ramon Lopes.
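
Editorial note: the thread never establishes the root cause, so the following is only a sketch of what a "singular" fit generally means, not a diagnosis of this dataset. The "Coefficients not estimable" and "'x' is singular" messages are the kind typically raised when a covariate is a linear combination of the other columns of the design matrix. The example uses base R only. The covariates `sv1` and `sv2` are made-up stand-ins (not real surrogate variables from the package), and `is_full_rank()` is a hypothetical helper.

```r
# Hypothetical 12-sample design mirroring the four conditions in the post.
condition <- factor(rep(c("WAKE_ctrl", "WAKE_inf", "TRPD_ctrl", "TRPD_inf"),
                        each = 3))
design <- model.matrix(~ 0 + condition)   # one indicator column per condition

# Made-up covariates standing in for surrogate variables (assumption:
# these are NOT what the package computed for the real data).
set.seed(1)
sv1 <- rnorm(12)                              # varies freely across samples
sv2 <- as.numeric(condition == "WAKE_inf")    # fully explained by condition

# A design is estimable only if its columns are linearly independent.
is_full_rank <- function(m) qr(m)$rank == ncol(m)

ok_design  <- cbind(design, sv1)
bad_design <- cbind(design, sv1, sv2)

print(is_full_rank(ok_design))    # TRUE: 5 columns, rank 5
print(is_full_rank(bad_design))   # FALSE: 6 columns, rank 5
```

If the batch-correction step really is where things break, one option that is not discussed in the thread is to skip it: to my knowledge, recent versions of `importRdata()` accept a `detectUnwantedEffects` argument that can be set to `FALSE`. Check `?importRdata` for your installed version before relying on it.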

RamonSLPS commented 1 year ago

Apparently, this was an issue with my Salmon index, as more than one group of samples showed this error after being quantified with it. I don't know whether the problem is in the RefSeq sequences from Ensembl or in how I created the Salmon index.

kvittingseerup commented 1 year ago

From the IsoformSwitchAnalyzeR error message, I'm not so sure. Why do you think it was an index problem?

RamonSLPS commented 1 year ago

Because I tried to quantify another group of samples using the same index, and the program returned the same error. Then I built another Salmon index from the same Ensembl transcripts file, and again got the same error. Finally, I built an index from a different transcripts file obtained from NCBI, requantified the first group of datasets with it, and it worked well.

kvittingseerup commented 1 year ago

That sounds really strange! Which Ensembl files did you use (and which worked and which did not)?

RamonSLPS commented 1 year ago

First I used the RefSeq cDNA file for the microbat (Myotis lucifugus) from Ensembl, which did not work. In the end, I used a transcripts file for Myotis myotis (the species I am studying) from NCBI (MyoMyo1.0), and it worked well.

kvittingseerup commented 1 year ago

I still don't think that is why IsoformSwitchAnalyzeR failed. Would you be willing to share the first iteration of your data with me so I could debug this? I would naturally keep it confidential and delete it after debugging. If yes, could you email me the datasets you gave as input to importRdata()?

RamonSLPS commented 1 year ago

I have already deleted the "broken" datasets and index; I only have the new ones made with the other index. But I can send you the raw datasets and the cDNA and GTF files I am using, if you want, so you could try to index and quantify with the Ensembl cDNA file that I think is broken. Would that help?

kvittingseerup commented 1 year ago

Nope - but thanks for offering it 🙂