lgatto / pRolocGUI

Interactive visualisation and exploration of spatial proteomics data
http://lgatto.github.io/pRolocGUI

cannot retrieve auxiliary data from biomart #73

Closed blue-moon22 closed 8 years ago

blue-moon22 commented 8 years ago
> library("MSnbase")
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap,
    parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, as.vector, cbind, colnames, do.call,
    duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted,
    lapply, lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unlist, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite
    Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: mzR
Loading required package: Rcpp
Loading required package: BiocParallel
Loading required package: ProtGenerics

This is MSnbase version 1.18.1 
  Read '?MSnbase' and references therein for information
  about the package and how to get started.

Attaching package: ‘MSnbase’

The following object is masked from ‘package:stats’:

    smooth

> library("pRoloc")
Loading required package: MLInterfaces
Loading required package: annotate
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors
Loading required package: XML
Loading required package: cluster

This is MSnbase version 1.10.1 
  Read '?pRoloc' and references therein for information
  about the package and how to get started.

Warning messages:
1: replacing previous import ‘ggplot2::Position’ by ‘BiocGenerics::Position’ when loading ‘pRoloc’ 
2: replacing previous import ‘ggplot2::alpha’ by ‘kernlab::alpha’ when loading ‘pRoloc’ 
> library("pRolocdata")

This is pRolocdata version 1.8.0.
Use 'pRolocdata()' to list available data sets.
> data("dunkley2006params")
> data("dunkley2006")
> setAnnotationParams(dunkley2006params)
> dunkleygoset <- makeGoSet(dunkley2006, dunkley2006params)
**Error in getBM(attributes = c(params@filter, attrs, "go_linkage_type"),  : 
  Query ERROR: caught BioMart::Exception::Usage: WITHIN Virtual Schema : plants_mart_30, Dataset athaliana_eg_gene NOT FOUND**
> sessionInfo()
R version 3.2.4 (2016-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pRolocdata_1.8.0     pRoloc_1.10.1        MLInterfaces_1.50.0  cluster_2.0.3       
 [5] annotate_1.48.0      XML_3.98-1.4         AnnotationDbi_1.32.3 IRanges_2.4.8       
 [9] S4Vectors_0.8.11     MSnbase_1.18.1       ProtGenerics_1.2.1   BiocParallel_1.4.3  
[13] mzR_2.4.1            Rcpp_0.12.3          Biobase_2.30.0       BiocGenerics_0.16.1 

loaded via a namespace (and not attached):
 [1] nlme_3.1-126          pbkrtest_0.4-6        bitops_1.0-6          doParallel_1.0.10    
 [5] RColorBrewer_1.1-2    threejs_0.2.1         prabclus_2.2-6        ggvis_0.4.2          
 [9] tools_3.2.4           R6_2.1.2              affyio_1.40.0         rpart_4.1-10         
[13] mgcv_1.8-12           DBI_0.3.1             colorspace_1.2-6      trimcluster_0.1-2    
[17] nnet_7.3-12           gbm_2.1.1             preprocessCore_1.32.0 quantreg_5.21        
[21] SparseM_1.7           diptest_0.75-7        scales_0.4.0          sfsmisc_1.1-0        
[25] DEoptimR_1.0-4        mvtnorm_1.0-5         robustbase_0.92-5     randomForest_4.6-12  
[29] genefilter_1.52.1     affy_1.48.0           proxy_0.4-15          stringr_1.0.0        
[33] digest_0.6.9          minqa_1.2.4           base64enc_0.1-3       htmltools_0.3        
[37] lme4_1.1-11           rda_1.0.2-2           limma_3.26.8          htmlwidgets_0.6      
[41] RSQLite_1.0.0         impute_1.44.0         BiocInstaller_1.20.1  FNN_1.1              
[45] shiny_0.13.1          hwriter_1.3.2         mzID_1.8.0            mclust_5.1           
[49] gtools_3.5.0          car_2.1-1             dplyr_0.4.3           RCurl_1.95-4.8       
[53] magrittr_1.5          modeltools_0.2-21     Matrix_1.2-4          futile.logger_1.4.1  
[57] MALDIquant_1.14       munsell_0.4.3         vsn_3.38.0            stringi_1.0-1        
[61] MASS_7.3-45           zlibbioc_1.16.0       flexmix_2.3-13        plyr_1.8.3           
[65] grid_3.2.4            pls_2.5-0             gdata_2.17.0          lattice_0.20-33      
[69] splines_3.2.4         knitr_1.12.3          fpc_2.1-10            lpSolve_5.6.13       
[73] reshape2_1.4.1        codetools_0.2-14      biomaRt_2.26.1        futile.options_1.0.0 
[77] pcaMethods_1.60.0     lambda.r_1.1.7        mlbench_2.1-1         nloptr_1.0.4         
[81] httpuv_1.3.3          foreach_1.4.3         MatrixModels_0.4-1    gtable_0.2.0         
[85] kernlab_0.9-23        assertthat_0.1        ggplot2_2.1.0         mime_0.4             
[89] xtable_1.8-2          e1071_1.6-7           class_7.3-14          survival_2.38-3      
[93] iterators_1.0.8       rgl_0.95.1441         caret_6.0-64          sampling_2.7

lgatto commented 8 years ago

Thanks for the report. I will need some time to investigate - hopefully tonight.

lgatto commented 8 years ago

You need to update your annotation parameters:

> p <- setAnnotationParams(inputs = c("Arabidopsis thaliana", "TAIR locus model"))
Using species Arabidopsis thaliana genes (TAIR10 (2010-09-TAIR10))
Using feature type TAIR locus model ID(s)
Connecting to Biomart...
> makeGoSet(dunkley2006, p)
MSnSet (storageMode: lockedEnvironment)
assayData: 689 features, 0 samples 
  element names: exprs 
protocolData: none
phenoData: none
featureData
  featureNames: AT1G09210 AT1G21750 ... AT4G39080 (689 total)
  fvarLabels: assigned evidence ... markers (8 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  
- - - Processing information - - -
Constructed GO set using cellular_component namespace: Sun Mar 27 19:22:57 2016 
 MSnbase version: 1.19.16 

And by the way, you don't need to set the parameters if you pass them to the function. Either do

makeGoSet(dunkley2006, p)

or

setAnnotationParams(p)
dunkleygoset <- makeGoSet(dunkley2006)
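
For anyone hitting the same "Dataset ... NOT FOUND" error: the stored dunkley2006params object points at the plants_mart_30 schema named in the error message, which the server apparently no longer offers. Below is a rough sketch for checking which marts and datasets are currently available, using biomaRt directly. The host URL is an assumption (the Ensembl Plants BioMart) and is not stated anywhere in this thread; mart names also change between releases, so treat the output as something to inspect rather than rely on.

```r
library("biomaRt")

## Assumption: Ensembl Plants BioMart host; adjust if it has moved
host <- "https://plants.ensembl.org"

## Marts currently offered by that server
marts <- listMarts(host = host)
print(marts)

## Datasets in the first mart listed (assumed here to be the plants mart);
## look for the Arabidopsis thaliana entry
mart <- useMart(as.character(marts$biomart[1]), host = host)
datasets <- listDatasets(mart)
datasets[grepl("athaliana", datasets$dataset), ]
```

If the mart or dataset name differs from what an old AnnotationParams object stores, recreating the parameters with setAnnotationParams(inputs = ...) as shown above should pick up the current names.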

I'm closing this issue now - feel free to reopen it if you still have errors.

blue-moon22 commented 8 years ago

Thanks! Though I now get an error with knntlOptimisation:

> p <- setAnnotationParams(inputs = c("Arabidopsis thaliana", "TAIR locus model"))
Using species Arabidopsis thaliana genes (TAIR10 (2010-09-TAIR10))
Using feature type TAIR locus model ID(s)
Connecting to Biomart...
> dunkleygoset <- makeGoSet(dunkley2006, p)
> m <- unique(fData(dunkley2006)$markers)
> m <- m[m != "unknown"]
> th <- thetas(length(m), length.out=4)
Weigths:
  (0, 0.333333333333333, 0.666666666666667, 1)
> set.seed(1)
> i <- sample(nrow(th), 12)
> topt <- knntlOptimisation(dunkley2006, dunkleygoset, th = th[i,], k = c(3,3), fcol = "markers", times = 5)
Note: vector will be ordered according to classes: ER lumen ER membrane Golgi Mitochondrion Plastid PM Ribosome TGN vacuole (as names are not explicitly defined)
**Error in unserialize(socklist[[n]]) : error reading from connection
Error: failed to stop ‘SOCKcluster’ cluster: error writing to connection**
> sessionInfo()
R version 3.2.4 (2016-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pRolocdata_1.8.0     pRoloc_1.10.1        MLInterfaces_1.50.0  cluster_2.0.3        annotate_1.48.0     
 [6] XML_3.98-1.4         AnnotationDbi_1.32.3 IRanges_2.4.8        S4Vectors_0.8.11     MSnbase_1.18.1      
[11] ProtGenerics_1.2.1   BiocParallel_1.4.3   mzR_2.4.1            Rcpp_0.12.3          Biobase_2.30.0      
[16] BiocGenerics_0.16.1 

loaded via a namespace (and not attached):
 [1] nlme_3.1-126          pbkrtest_0.4-6        bitops_1.0-6          doParallel_1.0.10    
 [5] RColorBrewer_1.1-2    threejs_0.2.1         prabclus_2.2-6        ggvis_0.4.2          
 [9] tools_3.2.4           R6_2.1.2              affyio_1.40.0         rpart_4.1-10         
[13] mgcv_1.8-12           DBI_0.3.1             colorspace_1.2-6      trimcluster_0.1-2    
[17] nnet_7.3-12           gbm_2.1.1             preprocessCore_1.32.0 quantreg_5.21        
[21] SparseM_1.7           diptest_0.75-7        scales_0.4.0          sfsmisc_1.1-0        
[25] DEoptimR_1.0-4        mvtnorm_1.0-5         robustbase_0.92-5     randomForest_4.6-12  
[29] genefilter_1.52.1     affy_1.48.0           proxy_0.4-15          stringr_1.0.0        
[33] digest_0.6.9          minqa_1.2.4           base64enc_0.1-3       htmltools_0.3        
[37] lme4_1.1-11           rda_1.0.2-2           limma_3.26.8          htmlwidgets_0.6      
[41] RSQLite_1.0.0         impute_1.44.0         BiocInstaller_1.20.1  FNN_1.1              
[45] shiny_0.13.1          hwriter_1.3.2         mzID_1.8.0            mclust_5.1           
[49] gtools_3.5.0          car_2.1-1             dplyr_0.4.3           RCurl_1.95-4.8       
[53] magrittr_1.5          modeltools_0.2-21     Matrix_1.2-4          futile.logger_1.4.1  
[57] MALDIquant_1.14       munsell_0.4.3         vsn_3.38.0            stringi_1.0-1        
[61] MASS_7.3-45           zlibbioc_1.16.0       flexmix_2.3-13        plyr_1.8.3           
[65] grid_3.2.4            pls_2.5-0             gdata_2.17.0          lattice_0.20-33      
[69] splines_3.2.4         knitr_1.12.3          fpc_2.1-10            lpSolve_5.6.13       
[73] reshape2_1.4.1        codetools_0.2-14      biomaRt_2.26.1        futile.options_1.0.0 
[77] pcaMethods_1.60.0     lambda.r_1.1.7        mlbench_2.1-1         nloptr_1.0.4         
[81] httpuv_1.3.3          foreach_1.4.3         MatrixModels_0.4-1    gtable_0.2.0         
[85] kernlab_0.9-23        assertthat_0.1        ggplot2_2.1.0         mime_0.4             
[89] xtable_1.8-2          e1071_1.6-7           class_7.3-14          survival_2.38-3      
[93] snow_0.4-1            iterators_1.0.8       rgl_0.95.1441         caret_6.0-64         
[97] sampling_2.7

lgatto commented 8 years ago

That's a problem with parallel processing, which is Windows-specific and I won't be able to debug right now. You can do the following to proceed serially:

p <- SerialParam()
knntlOptimisation(dunkley2006, dunkleygoset, th = th[i,], k = c(3,3), fcol = "markers", times = 5, BPPARAM = p)
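
As a minimal, self-contained sketch of what a serial back-end does in BiocParallel (independent of pRoloc): a SerialParam object runs everything in the current R process, so no socket cluster is started and the "error reading from connection" failure cannot occur. The toy function and the variable names below are illustrative only.

```r
library("BiocParallel")

## Serial back-end: evaluation happens in this R session, no worker processes
serial <- SerialParam()

## BiocParallel-aware functions take the back-end through BPPARAM
res <- bplapply(1:4, function(x) x^2, BPPARAM = serial)
unlist(res)
## [1]  1  4  9 16

## Alternatively, register it as the session-wide default back-end
register(serial)
class(bpparam())
```

Whether a given function uses the registered default or needs the back-end passed explicitly is worth checking on its help page (here, ?knntlOptimisation). Note also that MulticoreParam relies on forking, which is not available on Windows.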

You could also use molerat and parallelise over 16 cores by setting

p <- MulticoreParam(16L)

lgatto commented 8 years ago

The parameters should actually be

> p <- setAnnotationParams(inputs = c("Arabidopsis thaliana", "TAIR locus ID"))
Using species Arabidopsis thaliana genes (TAIR10 (2010-09-TAIR10))
Using feature type TAIR locus ID(s)
Connecting to Biomart...
> p
Object of class "AnnotationParams"
 Using the 'plants_mart' BioMart database
 Using the 'athaliana_eg_gene' dataset
 Using 'tair_locus' as filter
 Created on Fri Apr  1 10:00:21 2016
> makeGoSet(dunkley2006[1:10, ], p)
MSnSet (storageMode: lockedEnvironment)
assayData: 10 features, 21 samples 
  element names: exprs 
protocolData: none
phenoData: none
featureData
  featureNames: AT1G09210 AT1G21750 ... AT1G07810 (10 total)
  fvarLabels: assigned evidence ... markers (8 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  
- - - Processing information - - -
Constructed GO set using cellular_component namespace: Fri Apr  1 10:00:25 2016 
 MSnbase version: 1.19.17