ctlab / LinSeed

Linseed: LINear Subspace identification for gene Expression Deconvolution
MIT License

Create LinseedObject #1

Open meichendong opened 5 years ago

meichendong commented 5 years ago

Hi, could you give more details about how to create the LinseedObject? When I tried to use the public dataset GSE50244, I got the following error:

> lo <- LinseedObject$new("GSE50244", samples=1:10, topGenes=10000)
Found 1 file(s)
GSE50244_series_matrix.txt.gz
Using locally cached version: C:\Users\meichen\AppData\Local\Temp\RtmpiWoz2m/GSE50244_series_matrix.txt.gz
Parsed with column specification:
cols(
  .default = col_character()
)
See spec(...) for full column specifications.
Annotation GPL not available, so will use submitter GPL instead
Using locally cached version of GPL11154 found here:
C:\Users\meichen\AppData\Local\Temp\RtmpiWoz2m/GPL11154.soft
Error in if (qx[5] - qx[1] > 100 || qx[5] > 100) { :
  missing value where TRUE/FALSE needed
In addition: Warning message:
In download.file(myurl, destfile, mode = mode, quiet = TRUE, method = getOption("download.file.method.GEOquery")) :
  cannot open URL 'https://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL11nnn/GPL11154/annot/GPL11154.annot.gz': HTTP status was '404 Not Found'

Is there another way to create the object, for example directly from an expression matrix or an ExpressionSet object?

Thanks!

konsolerr commented 5 years ago

Hi meichendong,

At the moment, the $new() constructor can also take a matrix or a data.frame as its first argument, instead of a character GEO identifier.

Basically lo <- LinseedObject$new(matrixObject, samples=1:10, topGenes=10000)

should work just fine.
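As a minimal sketch of the matrix route described above: the toy matrix below is random data standing in for a real, already-normalized genes-by-samples table, and `topGenes` is lowered to fit its 200 rows. The package name `linseed` is an assumption, and the constructor call is guarded so the snippet still runs when the package is not installed.

```r
# Toy genes-by-samples matrix (random values, for illustration only).
set.seed(1)
expr <- matrix(
  rexp(200 * 12, rate = 0.01),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("sample", 1:12))
)

# Same call shape as in the reply above, with a matrix as the first argument.
# Assumption: the package is installed under the name "linseed".
if (requireNamespace("linseed", quietly = TRUE)) {
  lo <- linseed::LinseedObject$new(expr, samples = 1:10, topGenes = 100)
}
```

With real data, `expr` would typically come from something like `as.matrix(read.table(...))`, with gene identifiers as row names and sample names as column names.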

But if you pass the dataset yourself, make sure the data are normalized (quantile normalization for microarrays, something like RPKM for RNA-seq data).
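For reference, the two normalizations mentioned above can be sketched in base R. This is only an illustration of the general techniques, not necessarily what linseed does internally; the helper names `quantile_normalize` and `rpkm` are hypothetical, and established packages (for example limma or edgeR) would normally be preferred on real data.

```r
# Quantile normalization: force every sample (column) onto the same
# distribution by replacing each value with the mean of the values
# that share its rank across samples.
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")
  sorted <- apply(x, 2, sort)
  reference <- rowMeans(sorted)
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# RPKM: reads per kilobase of gene length per million mapped reads.
# counts is genes x samples; gene_length_bp has one entry per gene (row).
rpkm <- function(counts, gene_length_bp) {
  per_million <- colSums(counts) / 1e6
  per_kb <- gene_length_bp / 1e3
  sweep(counts, 2, per_million, "/") / per_kb
}

# Toy inputs, made up for illustration.
toy <- matrix(c(5, 2, 3, 4, 4, 1, 7, 2, 3, 4, 6, 8), nrow = 4)
qn <- quantile_normalize(toy)

toy_counts <- matrix(c(10, 20, 30, 40), nrow = 2)
toy_rpkm <- rpkm(toy_counts, gene_length_bp = c(1000, 2000))
```

After quantile normalization every column of `qn` contains the same set of values, only in a different order.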

I won't close this issue until I write better documentation for the initialization method.

Cheers, Konstantin

methornton commented 5 years ago

Oops, I think this is the question I asked. I will try to make a Linseed object from a matrix of RPKM values for the set. The original idea was to study differential expression with a relative comparison model: we have treatment samples and control samples, both heterogeneous. They were prepared the same way, so they should contain a similar mixture of cells. However, something about the treatment could cause cells to differentiate. I was thinking of deconvolving each group separately and then comparing the deconvolved gene expression of similar unmixed populations. A tutorial for this would be very helpful.

shashj commented 4 years ago

Can we also use DESeq-normalized data for this?

Rajmandage22 commented 2 years ago

Hi,

I am working on the deconvolution of expression data and found this package very helpful. However, when I tried to specify a GEO accession number, I encountered an error:

Found 1 file(s)
GSE14641_series_matrix.txt.gz
Using locally cached version: /var/folders/3t/9lv4qds57kvg31f9zsxdr_zm0000gn/T//RtmpN2ajN6/GSE14641_series_matrix.txt.gz
Annotation GPL not available, so will use submitter GPL instead
Using locally cached version of GPL8132 found here:
/var/folders/3t/9lv4qds57kvg31f9zsxdr_zm0000gn/T//RtmpN2ajN6/GPL8132.soft.gz
Error in `[.data.frame`(annotation, , geneSymbol, drop = FALSE) :
  undefined columns selected

The error seems to be associated with the input file. As per the earlier discussion in this issue (#1), I should pass an expression matrix object rather than a GEO accession number.

Is there a manual on how to create such an input object from a GEO dataset, or a tool you would recommend?

Raj
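One possible way to build such a matrix by hand, sketched under assumptions: `GEOquery::getGEO()` returns a list of ExpressionSet objects for a series, `Biobase::exprs()` and `Biobase::fData()` extract the expression values and the platform annotation, and the name of the gene symbol column differs between platforms, so it has to be looked up and passed in explicitly. The helpers `collapse_to_genes` and `geo_to_matrix` are hypothetical names, and the wrapper needs network access plus both Bioconductor packages, so it is only defined here, not called.

```r
# Keep, for each gene symbol, the probe with the highest mean expression,
# and use the gene symbols as row names. Probes without a symbol are dropped.
collapse_to_genes <- function(expr, symbols) {
  keep <- !is.na(symbols) & symbols != ""
  expr <- expr[keep, , drop = FALSE]
  symbols <- symbols[keep]
  ord <- order(rowMeans(expr), decreasing = TRUE)
  expr <- expr[ord, , drop = FALSE]
  symbols <- symbols[ord]
  first <- !duplicated(symbols)
  out <- expr[first, , drop = FALSE]
  rownames(out) <- symbols[first]
  out
}

# Hypothetical wrapper: download a series and return a genes x samples matrix.
# symbol_column must name an existing column of the platform annotation;
# inspect colnames(Biobase::fData(eset)) to find it for a given platform.
geo_to_matrix <- function(accession, symbol_column) {
  eset <- GEOquery::getGEO(accession, GSEMatrix = TRUE)[[1]]
  annotation <- Biobase::fData(eset)
  stopifnot(symbol_column %in% colnames(annotation))
  collapse_to_genes(Biobase::exprs(eset),
                    as.character(annotation[[symbol_column]]))
}

# Toy check of the collapsing step (made-up probes and symbols).
toy_expr <- matrix(
  c(1, 5, 9, 2, 6, 9),
  nrow = 3,
  dimnames = list(c("p1", "p2", "p3"), c("s1", "s2"))
)
toy_genes <- collapse_to_genes(toy_expr, c("A", "A", ""))
```

The resulting matrix could then be passed as the first argument of `LinseedObject$new()`, as suggested earlier in the thread, after checking that the values are normalized.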