kvittingseerup / IsoformSwitchAnalyzeR

An R package to Identify, Annotate and Visualize Isoform Switches with Functional Consequences (from RNA-seq data)

What are the SV values created in design matrix? #205

Closed ElvaWong closed 2 months ago

ElvaWong commented 11 months ago

Hi,

I am using R version 4.3.1, RStudio version 2023.6.1.524, and IsoformSwitchAnalyzeR version 2.0.1. I imported a Salmon dataset via tximeta and saved it as RData, then created aSwitchList using importRdata(). I noticed that 6 columns of SV values had been added to the design matrix of aSwitchList, and these 6 columns caused an error when running isoformSwitchAnalysisPart1(). After I removed them, isoformSwitchAnalysisPart1() ran successfully. However, I am curious: what are those SV values, and is it alright to remove them manually?

# Contents of sampletable_transcript_simple
> head(sampletable_transcript_simple)
   sampleID    condition
1  PFASdil1      GenX.KO
2  PFASdil4   Control.KO
3  PFASdil7      GenX.WT
4  PFASdil9      GenX.WT
5 PFASdil10  PFOA_0.3.WT
6 PFASdil13 PFOA_0.05.WT
> 

# Create data frame of comparisons
condition_table <- data.frame(
    condition_1 = c("GenX.WT", "GenX.KO", "PFOA_0.3.WT", "PFOA_0.3.KO", "GenX.WT", "GenX.KO"),
    condition_2 = c("Control.WT", "Control.KO", "Control.WT", "Control.KO", "PFOA_0.3.WT", "PFOA_0.3.KO")
)

# Create aSwitchList
aSwitchList <- importRdata(
    isoformCountMatrix            = txi.transcripts$counts,
    isoformRepExpression          = txi.transcripts$abundance,
    designMatrix                  = sampletable_transcript_simple,
    isoformExonAnnoation          = "gencode.vM32.primary_assembly.annotation.gtf.gz",
    isoformNtFasta                = "gencode.vM32.transcripts.fa.gz",
    comparisonsToMake             = condition_table,
    fixStringTieAnnotationProblem = TRUE,
    showProgress                  = FALSE
)

# Results
> head(aSwitchList$designMatrix)
   sampleID   condition         sv1         sv2         sv3         sv4
1  PFASdil1     GenX.KO  0.15325191  0.24239127 -0.08131793 -0.10151465
2  PFASdil4  Control.KO  0.08471039  0.03236587 -0.06652344 -0.39165379
3  PFASdil7     GenX.WT -0.01521535 -0.02794922 -0.42179622  0.19699090
4  PFASdil9     GenX.WT  0.20620929 -0.05459255 -0.03070380  0.08931218
5 PFASdil10 PFOA_0.3.WT  0.05922495 -0.04698004 -0.03901371 -0.11490429
7 PFASdil16 PFOA_0.3.KO  0.23170368 -0.04637444 -0.04477648  0.11710328
          sv5         sv6
1  0.04056574  0.16095426
2  0.20737072  0.29886258
3  0.64551612  0.04753523
4 -0.02190115 -0.04855234
5 -0.14909152 -0.49669229
7 -0.02094741 -0.03449496
> 

# Run isoformSwitchAnalysisPart1
SwitchList_p1.1 <- isoformSwitchAnalysisPart1(
    switchAnalyzeRlist   = aSwitchList,
    outputSequences      = TRUE,
    prepareForWebServers = FALSE
)

#Error
>Step 1 of 3 : Detecting isoform switches...
Error: BiocParallel errors
  1 remote errors, element index: 1
  0 unevaluated and other errors
  first remote error:
Error in estimateDispersionsGeneEst(x, maxit = maxit, quiet = quiet, modelMatrix = modelMatrix, : the number of samples and the number of model coefficients are equal,
  i.e., there are no replicates to estimate the dispersion.
  use an alternate design formula
>
kvittingseerup commented 11 months ago

Thanks for reporting this problem.

The sv columns that were added are the surrogate variables found by running SVA (surrogate variable analysis), meaning factors that capture unwanted variation. Consider looking into them, as they could represent batch effects or similar.

And is it correct that you don't have any replicates for your KO?

Cheers,
Kristoffer

P.S. Please refrain from cross-posting in the future 🙂

ElvaWong commented 11 months ago

Thank you for your reply. I will keep in mind not to cross-post.

I have 3 replicates for every condition, including KO. I think the algorithm recognizes them, since I can run isoformSwitchAnalysisPart1() after removing all SVs.
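
One way to confirm the replicate count is to tabulate the condition column of the design matrix. The sketch below uses a small made-up data frame (`design_example`, with hypothetical sample IDs) as a stand-in for `aSwitchList$designMatrix`; it is an illustration only, not output from this dataset.

```r
# Hypothetical stand-in for aSwitchList$designMatrix (sample IDs are made up)
design_example <- data.frame(
    sampleID  = paste0("sample", 1:6),
    condition = rep(c("Control.KO", "GenX.KO"), each = 3),
    stringsAsFactors = FALSE
)

# Count samples per condition; each condition needs at least 2 for replication
replicate_counts <- table(design_example$condition)
print(replicate_counts)

# Names of conditions without replicates (empty if all are replicated)
unreplicated <- names(replicate_counts)[replicate_counts < 2]
```

Running the same `table()` call on the full design matrix, rather than on the `head()` output, shows every condition and not just the first few rows.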

If the SVs come from SVA and it is normal for them to be added to the design matrix, why do they cause an error in isoformSwitchAnalysisPart1()?
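
A possible explanation, offered as an assumption and not a confirmed diagnosis: if a pairwise comparison uses only the samples of its two conditions, then 3 replicates per condition give 6 samples, and every SV included as a covariate costs one model coefficient. The base-R sketch below uses random numbers as stand-in surrogate variables and a simplified, hypothetical design formula to show how the coefficient count can reach the sample count, which is the situation the error message describes. The model that IsoformSwitchAnalyzeR fits internally may well differ.

```r
set.seed(1)  # reproducible stand-in values

# Hypothetical two-condition comparison with 3 replicates each (6 samples)
design_example <- data.frame(
    condition = factor(rep(c("Control.KO", "GenX.KO"), each = 3))
)

# Random numbers standing in for surrogate variables; real SVs come from SVA
sv_names <- paste0("sv", 1:4)
for (sv in sv_names) {
    design_example[[sv]] <- rnorm(nrow(design_example))
}

# Simplified designs: condition only versus condition plus four SV covariates
mm_condition <- model.matrix(~ condition, data = design_example)
mm_with_svs  <- model.matrix(
    reformulate(c("condition", sv_names)),
    data = design_example
)

n_samples <- nrow(design_example)

# Residual degrees of freedom = number of samples minus number of coefficients
df_condition <- n_samples - ncol(mm_condition)  # 6 samples - 2 coefficients
df_with_svs  <- n_samples - ncol(mm_with_svs)   # 6 samples - 6 coefficients
```

In this toy setup the condition-only design leaves four residual degrees of freedom, while the design with four SVs leaves none, so there is nothing left from which to estimate dispersion.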

chunxubioinfor commented 2 months ago

Hi @ElvaWong, I don't know whether you have resolved the problem. I think the issue stems from the designMatrix: judging from the designMatrix you posted, there are no replicates for KO. You can learn more in our vignette. I'm going to close this issue now, but you're welcome to open a new one at any time. 😊