Closed Pinolinoo closed 2 years ago
That sounds very strange. Could you try running the individual functions that isoformSwitchAnalysisPart1 ()
wraps and send me the output? These are the individual functions
### Filter
mySwitchList <- preFilter( mySwitchList )
### Test for isoform switches
mySwitchList <- isoformSwitchTestDEXSeq( mySwitchList )
### Add ORF
mySwitchList <- addORFfromGTF( mySwitchList )
mySwitchList <- analyzeNovelIsoformORF( mySwitchList )
HI Kristoffer, Thanks for your quick reply!
Here's the output:
aSwitchList <- preFilter(aSwitchList) The filtering removed 42174 ( 38.25% of ) transcripts. There is now 68090 isoforms left
> aSwitchList <- isoformSwitchTestDEXSeq(aSwitchList) Step 1 of 2: Testing each pairwise comparisons with DEXSeq (this might be a bit slow)... Estimated run time is: 56.3 min Step 2 of 2: Integrating result into switchAnalyzeRlist... Isoform switch analysis was performed for 9440 gene comparisons (100%). Total runtime: 54.57 min Done
then in the 3rd the error comes again:
> aSwitchList <- addORFfromGTF(aSwitchList, pathToGTF = "/fast/AG_Metzger/philipp/results/stringtie/HD/stringtie_merged_HD.gtf") Step 1 of 2: importing GTF (this may take a while)... Error in getSeqlevelsReplacementMode(value, seqlevels(x)) : the supplied 'seqlevels' must be a character vector with no NAs and no duplicates
That is really strange. But I'm afraid I cannot debug it without the data.
Would you be willing to send me the data so I can try it? I would naturally keep it confidential and delete it once I was done with the debugging.
If so you can save the aSwitchList
R object with the save()
function and email that, along with the (compressed) GTF file, to my email (which you can find here.
Thanks so much and sorry for the late response. I had to check wheter if its ok to send you the data. It is and I am currently uploading. Thank you in advance already and please keep the data confidential :)
Could you post the output of running sessionInfo()
`R version 4.1.2 (2021-11-01) Platform: x86_64-unknown-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core)
Matrix products: default BLAS/LAPACK: /gnu/store/bs9pl1f805ins80xaf4s3n35a0x2lyq3-openblas-0.3.9/lib/libopenblasp-r0.3.9.so
locale: [1] C
attached base packages: [1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] tximport_1.22.0 BSgenome.Hsapiens.UCSC.hg38_1.4.1 BSgenome_1.62.0 rtracklayer_1.54.0
[5] Biostrings_2.62.0 XVector_0.34.0 dplyr_1.0.7 magrittr_2.0.1
[9] IsoformSwitchAnalyzeR_1.16.0 ggplot2_3.3.5 DEXSeq_1.40.0 RColorBrewer_1.1-2
[13] AnnotationDbi_1.56.2 DESeq2_1.34.0 SummarizedExperiment_1.24.0 GenomicRanges_1.46.1
[17] GenomeInfoDb_1.30.0 IRanges_2.28.0 S4Vectors_0.32.3 MatrixGenerics_1.6.0
[21] matrixStats_0.61.0 Biobase_2.54.0 BiocGenerics_0.40.0 BiocParallel_1.28.1
[25] limma_3.50.0
loaded via a namespace (and not attached):
[1] colorspace_2.0-2 rjson_0.2.20 hwriter_1.3.2 ellipsis_0.3.2 futile.logger_1.4.3
[6] bit64_4.0.5 interactiveDisplayBase_1.32.0 fansi_0.5.0 xml2_1.3.2 splines_4.1.2
[11] cachem_1.0.6 geneplotter_1.72.0 jsonlite_1.7.2 Rsamtools_2.10.0 annotate_1.72.0
[16] dbplyr_2.1.1 png_0.1-7 shiny_1.7.1 readr_2.1.0 BiocManager_1.30.16
[21] compiler_4.1.2 httr_1.4.2 lazyeval_0.2.2 assertthat_0.2.1 Matrix_1.3-4
[26] fastmap_1.1.0 cli_3.1.0 later_1.3.0 formatR_1.11 htmltools_0.5.2
[31] prettyunits_1.1.1 tools_4.1.2 gtable_0.3.0 glue_1.5.0 GenomeInfoDbData_1.2.0
[36] reshape2_1.4.4 rappdirs_0.3.3 Rcpp_1.0.7 DRIMSeq_1.22.0 vctrs_0.3.8
[41] stringr_1.4.0 mime_0.12 lifecycle_1.0.1 ensembldb_2.18.2 restfulr_0.0.13
[46] statmod_1.4.36 XML_3.99-0.8 edgeR_3.36.0 AnnotationHub_3.2.0 scales_1.1.1
[51] vroom_1.5.6 ProtGenerics_1.26.0 hms_1.1.1 promises_1.2.0.1 parallel_4.1.2
[56] AnnotationFilter_1.18.0 lambda.r_1.2.4 yaml_2.2.1 curl_4.3.2 memoise_2.0.0
[61] gridExtra_2.3 biomaRt_2.50.1 stringi_1.7.5 RSQLite_2.2.8 BiocVersion_3.14.0
[66] genefilter_1.76.0 BiocIO_1.4.0 GenomicFeatures_1.46.1 filelock_1.0.2 rlang_0.4.12
[71] pkgconfig_2.0.3 bitops_1.0-7 lattice_0.20-45 purrr_0.3.4 GenomicAlignments_1.30.0
[76] bit_4.0.4 tidyselect_1.1.1 plyr_1.8.6 R6_2.5.1 generics_0.1.1
[81] DelayedArray_0.20.0 DBI_1.1.1 pillar_1.6.4 withr_2.4.2 survival_3.2-13
[86] KEGGREST_1.34.0 RCurl_1.95-0.1.2 tibble_3.1.6 tximeta_1.12.3 crayon_1.4.2
[91] futile.options_1.0.1 utf8_1.2.2 BiocFileCache_2.2.0 tzdb_0.2.0 progress_1.2.2
[96] locfit_1.5-9.4 grid_4.1.2 blob_1.2.2 digest_0.6.28 xtable_1.8-4
[101] VennDiagram_1.7.0 httpuv_1.6.3 munsell_0.5.0 `
Thanks! I think I found the problem.
You need the latest version from the development branch. I've just pushed 1.17.03 to BioC but it will probably be a day or two before it goes through. In the meantime you can download the updates from GitHub via this R command:
if (!requireNamespace("devtools", quietly = TRUE)){
install.packages("devtools")
}
devtools::install_github("kvittingseerup/IsoformSwitchAnalyzeR", build_vignettes = TRUE)
After installation restart R and verify update with this function packageVersion('IsoformSwitchAnalyzeR')
and try again.
Let me know if that works :-).
Cheers Kristoffer
I'm working on the cluster from our institute, so I will have to wait until the people responsible can update the package on there. However when trying to download it on my local RStudio I got:
Fehler: Failed to install 'IsoformSwitchAnalyzeR' from GitHub: System command 'R' failed, exit status: 1, stdout + stderr (last 10 lines): E> with piece 1: E> isoformUpregulated isoformDownregulated comparison E> 1 TCONS_00000234 TCONS_00000235 1 E> --- failed re-building ‘IsoformSwitchAnalyzeR.Rmd’ E> E> SUMMARY: processing the following file failed: E> ‘IsoformSwitchAnalyzeR.Rmd’ E> E> Fehler: Vignette re-building failed.
Oh - I was to fast. Try updating to 1.17.04 (by using the same command).
ok the installation worked now but I got this error when running part1:
Step 1 of 3 : Detecting isoform switches... Step 2 of 3 : Adding and predicting open reading frames Fehler in addORFfromGTF(switchAnalyzeRlist = switchAnalyzeRlist, pathToGTF = pathToGTF, : No ORFs could be added to the switchAnalyzeRlist. Please ensure GTF file have CDS info (and that isoform ids match).
Before answering can I ask what you get out of that error message? (Only asking because I'm continuously trying to improve them).
I am not quite sure but I suppose there might be something wrong with my GTF file? so maybe a setting I should have set while generating it with stringtie merge?
Thanks for the feedback. The problem is that you should use the official GTF that you downloaded from an official source such as Ensembl, Gencode, etc. The one you also used to guide stringtie.
thank you! now I am getting an error during step 3. I used the same aSwitchlist object which I also sent to you for debugging!
Step 1 of 3 : Detecting isoform switches... Step 2 of 3 : Adding and predicting open reading frames Step 3 of 3 : Extracting (and outputting) sequences Fehler in extractSequence(switchAnalyzeRlist = switchAnalyzeRlist, genomeObject = genomeObject, : The object supplied to 'switchAnalyzeRlist' must be a 'switchAnalyzeRlist' Zusätzlich: Warnmeldungen: 1: In DESeqDataSet(rse, design, ignoreRank = TRUE) : some variables in design formula are characters, converting to factors 2: In addORFfromGTF(switchAnalyzeRlist = switchAnalyzeRlist, pathToGTF = pathToGTF, : It seems ORF was only added to an unlikely small fraction of isoforms (less than 10%).If you are not sure this is on purpose something went wrong and the files are not matching.
Could you post this as a new issue to keep the errors separate?
Also when you do please inspect the "mySwitchList" you get from running the individual steps of the isoformSwitchAnalysisPart1()
:
### Filter
mySwitchList <- preFilter( mySwitchList )
### Test for isoform switches
mySwitchList <- isoformSwitchTestDEXSeq( mySwitchList )
### Add ORF
mySwitchList <- addORFfromGTF( mySwitchList )
mySwitchList <- analyzeNovelIsoformORF( mySwitchList )
### Add and write bio sequences
mySwitchList <- extractSequence()
Hi,
I tried running the quick analysis and ran into this Error during isoformSwitchAnalysisPart1.
Step 1 of 3 : Detecting isoform switches... Step 2 of 3 : Adding and predicting open reading frames Error in getSeqlevelsReplacementMode(value, seqlevels(x)) : the supplied 'seqlevels' must be a character vector with no NAs and no duplicates
I used Stringtie to assemble transcripts and used the merge function to generate a new GTF File.
Here's my code:
library(IsoformSwitchAnalyzeR) library(BSgenome.Hsapiens.UCSC.hg38)
Stringtie_quant <- importIsoformExpression(parentDir = "/fast/AG_Metzger/philipp/results/ballgown/HD/", readLength = 76) pheno_data <- read.csv("/fast/AG_Metzger/philipp/results/ballgown/HD/phenodata_HD.csv") row.names(pheno_data) <- pheno_data$Filename names(pheno_data)[2] <- "sampleID" names(pheno_data)[4] <- "condition" pheno_data <- pheno_data[2:7] pheno_data <- pheno_data[-4] pheno_data <- pheno_data[-4] pheno_data <- pheno_data[-4] pheno_data <- pheno_data[-2]#have only condition and Patient.ID and sampleIDs in design Matrix
aSwitchList <- importRdata( isoformCountMatrix = Stringtie_quant$counts, isoformRepExpression = Stringtie_quant$abundance, designMatrix = pheno_data, isoformExonAnnoation = "/fast/AG_Metzger/philipp/results/stringtie/HD/stringtie_merged_HD.gtf", fixStringTieAnnotationProblem = TRUE, showProgress = TRUE )
genome <- BSgenome.Hsapiens.UCSC.hg38
exampleSwitchList <- isoformSwitchAnalysisPart1( pathToGTF = "/fast/AG_Metzger/philipp/results/stringtie/HD/stringtie_merged_HD.gtf", genomeObject = genome, switchAnalyzeRlist = aSwitchList, outputSequences = FALSE, prepareForWebServers = FALSE
)
I excuse any simple mistakes, since I'm new to Bio Informatics!
Thanks for your help.