hachepunto opened this issue 9 years ago (status: open)
There is a Python API for working with the Gene Expression Omnibus (GEO): http://pythonhosted.org/Orange-Bioinformatics/reference/geo.html
# Version info: R 2.14.1, Biobase 2.15.3, GEOquery 2.23.2, limma 3.10.1
# R scripts generated Wed Jun 3 12:01:47 EDT 2015
# Unable to generate script analyzing differential expression.
# Invalid input: at least two groups of samples should be selected.
################################################################
# Boxplot for selected GEO samples
library(Biobase)
library(GEOquery)
# load series and platform data from GEO
gset <- getGEO("GSE53394", GSEMatrix =TRUE)
if (length(gset) > 1) idx <- grep("GPL96", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]
# set parameters and draw the plot
dev.new(width=4+dim(gset)[[2]]/5, height=6)
par(mar=c(2+round(max(nchar(sampleNames(gset)))/2),4,2,1))
title <- paste ("GSE53394", '/', annotation(gset), " selected samples", sep ='')
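boxplot(exprs(gset), boxwex=0.7, notch=TRUE, main=title, outline=FALSE, las=2)
# No sample groups were selected (see the note above), so `labels` is never
# defined and the generated legend() call would fail; it is left disabled.
# legend("topleft", labels, fill=palette(), bty="n")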
Write a program to automatically download the CEL files for a sample (GSM), or the full sets of CEL files for a series (GSE).
It would be useful for it to check for duplicates (using a checksum).
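One way the duplicate check could work is sketched below. This is only a sketch under assumptions: the function names `content_checksum` and `find_duplicates`, the choice of SHA-256, and the `*.CEL.gz` pattern are illustrative and are not part of any existing tool. The hash is computed on the decompressed content, because a gzip header can carry a timestamp and file name that differ between two archives of the same CEL file.

```python
import gzip
import hashlib
from pathlib import Path


def content_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's (decompressed) content."""
    path = Path(path)
    # Hash the decompressed bytes for .gz files so that differing gzip
    # headers do not hide identical CEL data.
    opener = gzip.open if path.suffix.lower() == ".gz" else open
    digest = hashlib.sha256()
    with opener(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory, pattern="*.CEL.gz"):
    """Group matching files by checksum; return only groups with repeats."""
    by_digest = {}
    for path in sorted(Path(directory).glob(pattern)):
        by_digest.setdefault(content_checksum(path), []).append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates("downloads")` would return a list of groups, each group being the paths whose content is identical. Note that the glob pattern is case-sensitive on most Linux file systems, so files named `.cel.gz` would need a second pattern.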
A command that might work for downloading the GSM files:
wget --ignore-case ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM272nnn/GSM272727/suppl/GSM272727*.CEL.gz
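The directory layout in that URL can be derived from the accession itself: the last three digits are replaced by `nnn` to form the parent folder. A minimal sketch of that mapping follows. The helper name `geo_suppl_url` is hypothetical, and the `series` branch for GSE accessions is an assumption based on the same layout, since only the GSM path appears in the command above.

```python
import re

GEO_FTP_ROOT = "ftp://ftp.ncbi.nlm.nih.gov/geo"


def geo_suppl_url(accession):
    """Build the supplementary-files directory URL for a GSM or GSE accession."""
    match = re.fullmatch(r"(GSM|GSE)(\d+)", accession.strip().upper())
    if match is None:
        raise ValueError(f"not a GSM/GSE accession: {accession!r}")
    prefix, digits = match.groups()
    # Parent folder: accession with its last three digits replaced by "nnn"
    # (e.g. GSM272727 -> GSM272nnn).
    stub = prefix + digits[:-3] + "nnn"
    # Assumption: series live under "series" the same way samples live
    # under "samples".
    kind = "samples" if prefix == "GSM" else "series"
    return f"{GEO_FTP_ROOT}/{kind}/{stub}/{prefix}{digits}/suppl/"


if __name__ == "__main__":
    print(geo_suppl_url("GSM272727"))
```

The resulting directory URL could then be handed to `wget` together with a file pattern such as `GSM272727*.CEL.gz`, as in the command above.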