seandavi / GEOquery

The bridge between the NCBI Gene Expression Omnibus and Bioconductor
http://seandavi.github.io/GEOquery/

The nature of the downloaded data (normalized vs nonnormalized?) #81

Closed atakanekiz closed 5 years ago

atakanekiz commented 5 years ago

Hello,

Thanks for this very helpful package. I have a question about the nature of the data downloaded by the getGEO function. I've analyzed a few datasets, and when I plot the expression values of all genes per sample, I usually see similar distributions, which suggests the data were normalized, as seen below (example from the GSE65218 gene expression microarray):

image

But this isn't always true, as seen here (example from the protein microarray GSE25755): image

The median expression values are still close to each other, which might indicate that some sort of normalization was applied in the latter case as well. I just want to make sure that the data I download with the getGEO function are always the normalized, ready-to-analyze data. I'm using the default arguments to prepare the ExpressionSet object:

suppressPackageStartupMessages(library(GEOquery))

gse <- getGEO("GSE25755", destdir = getwd())

gse1 <- gse[[1]]
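
The per-sample comparison described above can also be done numerically. The following is a minimal sketch, not part of GEOquery: `sample_quartiles` is a hypothetical helper name, and it assumes the expression values are available as a numeric matrix with one column per sample (for an ExpressionSet, `Biobase::exprs()` returns such a matrix). Similar quartiles across samples are consistent with normalization but do not prove it.

```r
# Hypothetical helper (not part of GEOquery): summarise each sample's
# distribution by its 25%, 50% and 75% quantiles.
# `mat` is assumed to be a numeric matrix with one column per sample.
sample_quartiles <- function(mat) {
  # apply() over columns returns a 3 x n_samples matrix; transpose it so
  # that each row is one sample
  t(apply(mat, 2, quantile, probs = c(0.25, 0.5, 0.75), na.rm = TRUE))
}

# Usage with the object created above (requires Biobase):
# sample_quartiles(Biobase::exprs(gse1))
# boxplot(Biobase::exprs(gse1), las = 2)  # visual version of the same check
```

If the medians differ widely between samples, or the values span several orders of magnitude, the deposited data may not be normalized or log-transformed.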

Thanks so much for the help.

Best, Atakan

seandavi commented 5 years ago

Hi, @atakanekiz. The data downloaded by GEOquery are taken directly from the GEO repository without manipulation. In general, the data at GEO come directly from the submitters and are often normalized, but there is no guarantee that this is the case. Also, the normalization methods (if any) vary from study to study. The data processing approaches are often detailed at GEO, and checking there is the easiest way to determine what has been done. In some cases, you may have to go back to the publication or even email the authors.
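
As an illustration of checking the processing details recorded at GEO, here is a minimal sketch. It assumes the submitter's description ends up in sample metadata columns whose names start with `data_processing`, which is how series matrix files are typically parsed but is not guaranteed for every series. `processing_notes` is a hypothetical helper name, not a GEOquery function.

```r
# Hypothetical helper (not part of GEOquery): pull out the distinct
# processing descriptions from a sample metadata table.
# `pd` is assumed to be a data frame such as Biobase::pData(gse1).
processing_notes <- function(pd) {
  # Assumption: processing text lives in columns named data_processing,
  # data_processing.1, and so on
  proc_cols <- grep("^data_processing", colnames(pd), value = TRUE)
  if (length(proc_cols) == 0) {
    message("No data_processing columns found; check the GEO record itself.")
    return(NULL)
  }
  # Most samples in a series share the same text, so keep distinct rows only
  unique(pd[, proc_cols, drop = FALSE])
}

# Usage with an ExpressionSet returned by getGEO (requires Biobase):
# processing_notes(Biobase::pData(gse1))
```

The text returned is free-form and written by the submitter, so it still has to be read and interpreted by hand.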

Where raw data are available at GEO, you can also access those and process them yourself if you want full control over the preprocessing.
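
A minimal sketch of fetching the raw files, assuming the series has supplementary files deposited at GEO (not every series does). It uses GEOquery's `getGEOSuppFiles`; the wrapper name `fetch_raw_files` is hypothetical and only there to keep the example self-contained.

```r
# Hypothetical wrapper around GEOquery::getGEOSuppFiles.
# Assumption: supplementary (raw) files exist for the accession.
fetch_raw_files <- function(accession, dest = getwd()) {
  # Downloads into a subdirectory of `dest` named after the accession
  GEOquery::getGEOSuppFiles(accession, makeDirectory = TRUE, baseDir = dest)
}

# Usage (requires GEOquery and network access):
# files <- fetch_raw_files("GSE25755")
# rownames(files)  # paths of the downloaded files, if any were found
```

What to do with the downloaded files depends on the platform, so the appropriate preprocessing package has to be chosen per dataset.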

Hope that helps.

atakanekiz commented 5 years ago

That's very helpful, thank you very much.

Best, Atakan