RamsinghLab / arkas

This is the kallisto package

SummarizedExperiment Inheritance issue #45

Closed arcolombo closed 7 years ago

arcolombo commented 7 years ago

The S4 constructor returns a list of slots instead of assembling a proper S4 experiment object. It works on my Linux machine but not on the Mac. This may be a class conflict with SummarizedExperiment or GenomicRanges.

An object of class "KallistoExperiment"
Slot "transcriptomes":
[1] "ErccDbLite.ERCC.97, EnsDbLite.Hsapiens.81, RepDbLite.Hsapiens.2007"

Slot "kallistoVersion":
[1] "0.42.3"

Slot "rowRanges":
GRanges object with 213782 ranges and 9 metadata columns:
  seqnames ranges strand | tx_length gc_content tx_id
  ERCC-00002 Unknown [1, 1] * | 1
  ERCC-00003 Unknown [1, 1] * | 1
  ERCC-00004 Unknown [1, 1] * | 1
  ERCC-00007 Unknown [1, 1] * | 1
  ERCC-00009 Unknown [1, 1] * | 1
  ... ... ... ... . ... ... ...
  SVA_E Unknown [1, 1] * | 1
  SVA_F Unknown [1, 1] * | 1
  AluYb11 Unknown [1, 1] * | 1
  AluYb10 Unknown [1, 1] * | 1
  AluYb8a1 Unknown [1, 1] * | 1
  gene_id gene_name entrezid tx_biotype gene_biotype
  ERCC-00002
  ERCC-00003
  ERCC-00004
  ERCC-00007
  ERCC-00009
  ... ... ... ... ... ...
  SVA_E
  SVA_F
  AluYb11
  AluYb10
  AluYb8a1
  biotype_class
  ERCC-00002
  ERCC-00003
  ERCC-00004
  ERCC-00007
  ERCC-00009
  ... ...
  SVA_E
  SVA_F
  AluYb11
  AluYb10
  AluYb8a1
  -------
  seqinfo: 1 sequence from an unspecified genome

Slot "colData":
DataFrame with 6 rows and 1 column
  ID
  n1 n1
  n2 n2
  n4 n4
  s1 s1
  s2 s2
  s4 s4

Slot "assays":
Reference class object of class "ShallowSimpleListAssays"
Field "data":
List of length 3
names(3): est_counts eff_length tpm

Slot "NAMES":
NULL

Slot "elementMetadata":
DataFrame with 213782 rows and 0 columns

Slot "metadata":

arcolombo commented 7 years ago

This issue occurs on a Mac. There is a problem with the KallistoExperiment constructor.

My Linux system did not show the error. Its sessionInfo():
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.0
LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] SummarizedExperiment_1.0.2 Biobase_2.30.0
[3] GenomicRanges_1.22.4 GenomeInfoDb_1.6.3
[5] IRanges_2.4.8 S4Vectors_0.8.11
[7] BiocGenerics_0.16.1 BiocInstaller_1.20.3

loaded via a namespace (and not attached):
[1] zlibbioc_1.16.0 compiler_3.4.0 XVector_0.10.0 tools_3.4.0

arcolombo commented 7 years ago

arkas is not compatible with SummarizedExperiment 1.6.1 (Bioconductor 3.5, as installed via biocLite).

arcolombo commented 7 years ago

So this is a bit odd.

First, the issue lies in how the arkas class "KallistoExperiment" inherits from the new release of SummarizedExperiment (1.6.1, Bioconductor release).

The problem is that constructing a "KallistoExperiment" from the SummarizedExperiment class, adding the kallisto version and the transcriptomes, returns a list of all the slots; the S4 object is not properly assembled.

If I change the class name from "KallistoExperiment" to "KExperiment", with the exact same constructor (see below), the object is assembled correctly:

setClass("KExperiment",representation(transcriptomes="character",kallistoVersion="character"),contains="RangedSummarizedExperiment")

new("KExperiment",Y,kallistoVersion="50",transcriptomes="MM")
class: KExperiment
dim: 213782 6
metadata(0):
assays(3): est_counts eff_length tpm
rownames(213782): ERCC-00002 ERCC-00003 ... AluYb10 AluYb8a1
rowData names(9): tx_length gc_content ... gene_biotype biotype_class
colnames(6): n1 n2 ... s2 s4
colData names(1): ID

new("KExperiment",Y,kallistoVersion="50",transcriptomes="MM")->p
Browse[2]> head(str(p))
Formal class 'KExperiment' [package ".GlobalEnv"] with 8 slots
  ..@ transcriptomes : chr "MM"
  ..@ kallistoVersion: chr "50"
  ..@ rowRanges :Formal class 'GRanges' [package "GenomicRanges"] with 6 slots
  .. .. ..@ seqnames :Formal class 'Rl

Here we see the two character slots from the "KExperiment" definition. Note that the object Y used above is the SummarizedExperiment of the Normal/Senescent arkasData.

Is "KallistoExperiment" a reserved word in Bioconductor devel?
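The definition pattern above can be tried without any Bioconductor packages. The following is only a minimal sketch: ToyParent and ToyChild are hypothetical stand-ins (not the real RangedSummarizedExperiment or KallistoExperiment), so it shows what a properly assembled subclass instance looks like rather than reproducing the failure.

```r
library(methods)

# Hypothetical stand-in for RangedSummarizedExperiment (not the real class).
setClass("ToyParent", representation(assays = "list"))

# Same pattern as the KExperiment definition: two extra character slots.
setClass("ToyChild",
         representation(transcriptomes = "character",
                        kallistoVersion = "character"),
         contains = "ToyParent")

parent <- new("ToyParent", assays = list(tpm = matrix(1:4, 2)))

# An unnamed superclass instance supplies the inherited slots.
child <- new("ToyChild", parent, kallistoVersion = "50", transcriptomes = "MM")

# A properly assembled object is an S4 instance of both classes,
# not a plain list of slots.
print(isS4(child))
print(is(child, "ToyParent"))
print(slotNames(child))
```

If the same three checks are run on the object returned by the failing constructor, a plain list would fail the isS4() test.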

ttriche commented 7 years ago

Looking into this. I thought so at first, but a quick grep through SummarizedExperiment suggests otherwise. Hmm.

--t

On Wed, May 24, 2017 at 8:55 AM, Anthony R. Colombo < notifications@github.com> wrote:


arcolombo commented 7 years ago

OK. I'm dropping the alias to see if that is an issue.

arcolombo commented 7 years ago

It is not a naming-specific issue.

arcolombo commented 7 years ago

The problem lies within Kallisto-methods.R.

For instance, if I remove the methods and install arkas without Kallisto-methods.R, the kexp is merged correctly.

suppressPackageStartupMessages(library(TxDbLite))
samples<-c("n1","n2","n4","s1","s2","s4")
pathBase<-system.file("extdata",package="arkasData")
merged <- mergeKallisto(samples, outputPath=pathBase)
Bootstraps ignored for /Library/Frameworks/R.framework/Versions/3.4/Resources/library/arkasData/extdata/n1, set summarize=TRUE to use
Bootstraps ignored for /Library/Frameworks/R.framework/Versions/3.4/Resources/library/arkasData/extdata/n2, set summarize=TRUE to use
Bootstraps ignored for /Library/Frameworks/R.framework/Versions/3.4/Resources/library/arkasData/extdata/n4, set summarize=TRUE to use
Bootstraps ignored for /Library/Frameworks/R.framework/Versions/3.4/Resources/library/arkasData/extdata/s1, set summarize=TRUE to use
Bootstraps ignored for /Library/Frameworks/R.framework/Versions/3.4/Resources/library/arkasData/extdata/s2, set summarize=TRUE to use
Bootstraps ignored for /Library/Frameworks/R.framework/Versions/3.4/Resources/library/arkasData/extdata/s4, set summarize=TRUE to use
Setting transcriptome automatically from Kallisto call string.
merged
class: KallistoExperiment
dim: 213782 6
metadata(0):
assays(3): est_counts eff_length tpm
rownames(213782): ERCC-00002 ERCC-00003 ... AluYb10 AluYb8a1
rowData names(9): tx_length gc_content ... gene_biotype biotype_class
colnames(6): n1 n2 ... s2 s4
colData names(1): ID

This implies that the problem is not with the class, the constructor, or the merge command, but with one of the methods. My first guess is to drop the alias commands. Updates soon.

arcolombo commented 7 years ago

So the conflict has to do with this method:

#' @export
setAs("KallistoExperiment", "SummarizedExperiment", function(from) {
  metanames <- names(from@metadata)
  metaorder <- c("transcriptomes", "kallistoVersion", metanames)
  from@metadata$transcriptomes <- from@transcriptomes
  from@metadata$kallistoVersion <- from@kallistoVersion
  from@metadata <- from@metadata[metaorder]
  if (!identical(colnames(from@assays$data[[1]]), rownames(from@colData))) {
    for (i in names(from@assays$data)) {
      colnames(from@assays$data[[i]]) <- rownames(from@colData)
    }
  }
  SummarizedExperiment(assays=from@assays$data,
                       rowRanges=from@rowRanges,
                       colData=from@colData,
                       metadata=from@metadata)
})

How do I know this? I deleted the methods and added each one back individually until the error popped up, and lo and behold, the merging broke upon the addition of the first setAs.

Should we remove this or patch it?
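A complementary check to the one-method-at-a-time bisection is to ask R which explicit coerce methods are registered. This is only a sketch with two hypothetical toy classes (ToyA and ToyB, unrelated to arkas); in the real package the same existsMethod() call could be pointed at the "KallistoExperiment" and "SummarizedExperiment" signature.

```r
library(methods)

# Hypothetical toy classes, unrelated to arkas or SummarizedExperiment.
setClass("ToyA", representation(x = "numeric"))
setClass("ToyB", representation(y = "character"))

# Register an explicit coercion, as a setAs() call in a methods file would.
setAs("ToyA", "ToyB", function(from) new("ToyB", y = as.character(from@x)))

# existsMethod() reports explicitly defined methods only, not inherited ones.
hasAtoB <- existsMethod("coerce", c("ToyA", "ToyB"))
hasBtoA <- existsMethod("coerce", c("ToyB", "ToyA"))
print(c(AtoB = hasAtoB, BtoA = hasBtoA))

# The registered coercion is what as() dispatches to.
b <- as(new("ToyA", x = 1), "ToyB")
print(b@y)
```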

arcolombo commented 7 years ago

Okay, so the fix is simply to remove this method from Kallisto-methods.R. One workaround would be to move the method out of Kallisto-methods.R into its own script in the R directory and invoke it directly. There is some implicit issue here, and why this error happens is a mystery to me. My suggestion is to create a function in KEtoSE.R and copy the method body there.
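The suggested workaround, a plain function instead of a registered setAs() method, might look roughly like the sketch below. ToySE and ToyKE are hypothetical stand-ins so that the example runs without Bioconductor; the real function would build its result with SummarizedExperiment() as in the method quoted earlier. Unlike that method, this sketch avoids listing "transcriptomes" or "kallistoVersion" twice when they are already present in the metadata.

```r
library(methods)

# Hypothetical stand-ins; the real classes live in SummarizedExperiment and arkas.
setClass("ToySE", representation(assays = "list", metadata = "list"))
setClass("ToyKE",
         representation(transcriptomes = "character",
                        kallistoVersion = "character"),
         contains = "ToySE")

# Plain function in place of setAs(): no coerce method is registered.
KEtoSE <- function(from) {
  stopifnot(is(from, "ToyKE"))
  meta <- from@metadata
  # Copy the two extra slots into the metadata and list them first.
  meta$transcriptomes <- from@transcriptomes
  meta$kallistoVersion <- from@kallistoVersion
  first <- c("transcriptomes", "kallistoVersion")
  meta <- meta[c(first, setdiff(names(meta), first))]
  new("ToySE", assays = from@assays, metadata = meta)
}

ke <- new("ToyKE",
          assays = list(tpm = matrix(1:4, 2)),
          metadata = list(note = "demo"),
          transcriptomes = "MM",
          kallistoVersion = "50")
se <- KEtoSE(ke)
print(names(se@metadata))
```

Because nothing is registered with the coerce generic, calling KEtoSE(ke) cannot interfere with how new() or as() treat the subclass.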

arcolombo commented 7 years ago

This issue is resolved on my branch; PR forthcoming. I created the functions SEtoKE etc. and upgraded on my end. It worked and the vignettes built. Closing.