seandavi / GEOquery

The bridge between the NCBI Gene Expression Omnibus and Bioconductor
http://seandavi.github.io/GEOquery/

GEOquery returning NAs as probe names #124

Closed by garciaaxnih 2 years ago

garciaaxnih commented 2 years ago

I am re-running old code that downloads microarray datasets with GEOquery. The code last ran without problems about a month ago, but getGEO() now returns NAs as probe names for some datasets. Datasets on platform GPL4372 appear to be affected, while those on platform GPL2700 are not.

This worked when GEOquery was still using readr; the NAs appeared after the latest update. I just re-installed GEOquery version 2.46.14, and with that version the probe names and gene names load correctly.

# This returns NAs
library(GEOquery)
gset <- getGEO("GSE33000", GSEMatrix = TRUE, AnnotGPL = TRUE)
# Pick the GPL4372 entry when the series spans several platforms
idx <- if (length(gset) > 1) grep("GPL4372", names(gset)) else 1
gset <- gset[[idx]]
fData(gset)  # feature annotation comes back as NAs

fData(gset)
ID Gene title Gene symbol Gene ID UniGene title UniGene symbol UniGene ID Nucleotide Title GI GenBank Accession Platform_CLONEID Platform_ORF Platform_SPOTID
NA NA NA
NA.1 NA NA
NA.2 NA NA

# This works fine
gset <- getGEO("GSE15222", GSEMatrix = TRUE, AnnotGPL = TRUE)
# Pick the GPL2700 entry when the series spans several platforms
idx <- if (length(gset) > 1) grep("GPL2700", names(gset)) else 1
gset <- gset[[idx]]
fData(gset)  # this returns the annotation

fData(gset)
ID Gene title Gene symbol Gene ID UniGene title UniGene symbol UniGene ID
GI_10047089-S GI_10047089-S small muscle protein, X-linked SMPX 23676
GI_10047091-S GI_10047091-S transgelin 3 TAGLN3 29114
GI_10047093-S GI_10047093-S heat shock protein family A (Hsp70) member 14 HSPA14 51182

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] readr_2.1.0 GEOquery_2.62.1 Biobase_2.54.0 BiocGenerics_0.40.0 EnhancedVolcano_1.12.0 ggrepel_0.9.1 ggplot2_3.3.5

loaded via a namespace (and not attached):
[1] beeswarm_0.4.0 tidyselect_1.1.1 purrr_0.3.4 ggrastr_1.0.1 colorspace_2.0-2 vctrs_0.3.8 generics_0.1.1 utf8_1.2.2
[9] rlang_0.4.12 R.oo_1.24.0 pillar_1.6.4 glue_1.5.0 withr_2.4.2 DBI_1.1.1 R.utils_2.11.0 bit64_4.0.5
[17] ggalt_0.4.0 RColorBrewer_1.1-2 lifecycle_1.0.1 cellranger_1.1.0 munsell_0.5.0 gtable_0.3.0 R.methodsS3_1.8.1 tzdb_0.2.0
[25] extrafont_0.17 vipor_0.4.5 curl_4.3.2 fansi_0.5.0 Rttf2pt1_1.3.9 Rcpp_1.0.7 KernSmooth_2.23-20 scales_1.1.1
[33] BiocManager_1.30.16 limma_3.50.0 bit_4.0.4 proj4_1.0-10.1 hms_1.1.1 dplyr_1.0.7 ash_1.0-15 grid_4.1.2
[41] tools_4.1.2 magrittr_2.0.1 maps_3.4.0 tibble_3.1.6 crayon_1.4.2 extrafontdb_1.0 tidyr_1.1.4 pkgconfig_2.0.3
[49] MASS_7.3-54 ellipsis_0.3.2 data.table_1.14.2 xml2_1.3.2 ggbeeswarm_0.6.0 rstudioapi_0.13 assertthat_0.2.1 R6_2.5.1
[57] compiler_4.1.2

seandavi commented 2 years ago

Thanks for the bug report, @garciaaxnih. A fix should be available in Bioc 3.14 and Bioc devel within the next 48 hours. In the meantime, you can install from GitHub if you want a quicker fix.
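
For anyone re-testing after updating, the symptom reported above (probe IDs and row names coming back as NA, NA.1, NA.2) can be checked programmatically. The following is only a rough sketch in base R: `annotation_is_missing()` is a hypothetical helper, not part of GEOquery or Biobase, and the two small data frames are made-up stand-ins for what `fData(gset)` returned in the broken and working cases.

```r
# Hypothetical helper (not part of GEOquery): returns TRUE when a
# feature-annotation table looks like the broken case in this report,
# i.e. every probe ID is NA or an auto-generated "NA", "NA.1", ... name.
annotation_is_missing <- function(fd, id_col = "ID") {
  # Use the ID column when present, otherwise fall back to the row names.
  ids <- if (id_col %in% colnames(fd)) fd[[id_col]] else rownames(fd)
  na_like <- is.na(ids) | grepl("^NA(\\.[0-9]+)?$", as.character(ids))
  length(ids) == 0 || all(na_like)
}

# Toy stand-ins for fData(gset); the values are illustrative only.
broken <- data.frame(ID = c(NA, NA, NA),
                     row.names = c("NA", "NA.1", "NA.2"))
working <- data.frame(ID = c("GI_10047089-S", "GI_10047091-S"),
                      `Gene symbol` = c("SMPX", "TAGLN3"),
                      check.names = FALSE)

annotation_is_missing(broken)   # TRUE
annotation_is_missing(working)  # FALSE
```

With a real dataset the call would be `annotation_is_missing(fData(gset))`. One common way to install the GitHub version, assuming the remotes package is available, is `remotes::install_github("seandavi/GEOquery")`.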