molgenis / molgenis-service-armadillo

Armadillo; a DataSHIELD implementation, part of the MOLGENIS suite
https://molgenis.github.io/molgenis-service-armadillo/
GNU Lesser General Public License v3.0

ds.levels returns warning related to arrow package, and does not return levels #679

Open timcadman opened 7 months ago

timcadman commented 7 months ago

I'm not at all sure this issue is in the right place, but I wanted to document it somewhere, given all the problems we've been having with arrow.

How to reproduce

First log in to the CAS. Then:

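library(DSMolgenisArmadillo)  # provides armadillo.get_token() and the Armadillo DSI driver
library(dsBaseClient)         # provides ds.levels()

url <- "https://armadillo-demo.molgenis.net/"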
token <- armadillo.get_token(url)
builder <- DSI::newDSLoginBuilder()

builder$append(
  server = "barcelona",
  url = url,
  token = token,
  table = "armadillo-illustration/barcelona/pancreatic",
  driver = "ArmadilloDriver",
  profile = "xenon")

builder$append(
  server = "groningen",
  url = url,
  token = token,
  table = "armadillo-illustration/groningen/pancreatic",
  driver = "ArmadilloDriver",
  profile = "xenon")

logindata <- builder$build()

conns <- DSI::datashield.login(logins = logindata, assign = TRUE, symbol = "pancreatic")

ds.levels("pancreatic$diagnosis")

Expected behaviour

Levels returned for variable "pancreatic$diagnosis"

Actual behaviour


$barcelona$Levels
character(0)

$barcelona$ValidityMessage
[1] "VALID ANALYSIS"

$groningen
$groningen$Levels
character(0)

$groningen$ValidityMessage
[1] "VALID ANALYSIS"

Warning messages:
1: In unserialize(content) :
  cannot unserialize ALTVEC object of class 'arrow::array_string_vector' from package 'arrow'; returning length zero vector
2: In unserialize(content) :
  cannot unserialize ALTVEC object of class 'arrow::array_string_vector' from package 'arrow'; returning length zero vector
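Because the call still reports "VALID ANALYSIS", the empty result is easy to miss in a script. A minimal sketch of one way to catch it, assuming the warning text stays as shown above: the helper name `catch_altvec_warning` is hypothetical and not part of any DataSHIELD or Armadillo package. It evaluates an expression, collects the warnings it raises, and flags the ALTVEC unserialize message.

```r
# Hypothetical helper (not part of DSI, dsBaseClient or DSMolgenisArmadillo):
# evaluate an expression, collect its warnings, and flag the ALTVEC
# unserialize warning reported in this issue.
catch_altvec_warning <- function(expr) {
  warnings_seen <- character(0)
  value <- withCallingHandlers(
    expr,  # the promise is forced here, so its warnings reach the handler
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")  # keep evaluating after recording
    }
  )
  list(
    value = value,
    warnings = warnings_seen,
    altvec_failure = any(grepl("cannot unserialize ALTVEC", warnings_seen,
                               fixed = TRUE))
  )
}

# Self-contained demonstration with a simulated warning, so no server is needed.
demo <- catch_altvec_warning({
  warning("cannot unserialize ALTVEC object of class 'arrow::array_string_vector'")
  character(0)
})
if (demo$altvec_failure) {
  message("Result could not be unserialized; check the client-side arrow installation")
}
```

With an open connection, the same helper could wrap the failing call, for example `catch_altvec_warning(ds.levels("pancreatic$diagnosis"))`, and the script could stop when `altvec_failure` is `TRUE`.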

Log

07:55:07.569 [pool-2-thread-45|] INFO  o.m.armadillo.audit.AuditLogger - AuditEvent [timestamp=2024-03-05T07:55:07.569258319Z, principal=t.j.cadman@umcg.nl, type=EXECUTE, data={expression=levelsDS(pancreatic$diagnosis), sessionId=20ADFB67A85AAE6D7936801A0AC3759F, roles=[ROLE_GVILPPA0Y1_RESEARCHER, ROLE_XENON-TESTS_RESEARCHER, ROLE_R9JYVKOKD5_RESEARCHER, ROLE_UNRLGMFHCG_RESEARCHER, ROLE_SU]}]

R session Info

R version 4.2.2 (2022-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS/LAPACK: /opt/conda/lib/libopenblasp-r0.3.21.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.3           forcats_1.0.0             stringr_1.5.1             dplyr_1.1.4               purrr_1.0.2              
 [6] readr_2.1.5               tidyr_1.3.1               tibble_3.2.1              ggplot2_3.4.4             tidyverse_2.0.0          
[11] dsHelper_1.1.0            dsBaseClient_6.3.0        DSMolgenisArmadillo_2.0.3 MolgenisAuth_0.0.25       DSI_1.5.0                
[16] R6_2.5.1                  progress_1.2.3           

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.12         lattice_0.20-45     prettyunits_1.2.0   digest_0.6.34       utf8_1.2.4          backports_1.4.1     evaluate_0.23      
 [8] httr_1.4.7          pillar_1.9.0        rlang_1.1.3         curl_5.2.0          rstudioapi_0.15.0   minqa_1.2.6         data.table_1.15.0  
[15] nloptr_2.0.3        Matrix_1.6-5        checkmate_2.3.1     rmarkdown_2.25      mathjaxr_1.6-0      labeling_0.4.3      urltools_1.7.3     
[22] splines_4.2.2       lme4_1.1-35.1       triebeard_0.4.1     munsell_0.5.0       compiler_4.2.2      numDeriv_2016.8-1.1 xfun_0.42          
[29] pkgconfig_2.0.3     base64enc_0.1-3     htmltools_0.5.7     tidyselect_1.2.0    fansi_1.0.6         crayon_1.5.2        tzdb_0.4.0         
[36] withr_3.0.0         MASS_7.3-58.3       grid_4.2.2          jsonlite_1.8.8      nlme_3.1-162        gtable_0.3.4        lifecycle_1.0.4    
[43] magrittr_2.0.3      metafor_4.4-0       scales_1.3.0        metadat_1.2-0       cli_3.6.2           stringi_1.8.3       farver_2.1.1       
[50] generics_0.1.3      vctrs_0.6.5         boot_1.3-28.1       tools_4.2.2         glue_1.7.0          hms_1.1.3           fastmap_1.1.1      
[57] yaml_2.3.8          timechange_0.3.0    colorspace_2.1-0    knitr_1.45

timcadman commented 7 months ago

Fixed by running 'install.packages("arrow")' on the CAS. Installed version: arrow_14.0.2.1. However, I would have expected arrow either to be installed on the CAS already or to be a dependency of 'DSMolgenisArmadillo'?
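For anyone hitting the same warning, a quick client-side check before logging in may help. This is a minimal sketch using only base R; the helper name `check_client_package` is hypothetical. It only reports whether a package can be loaded and which version is found. It does not assume any particular minimum arrow version, since the only version known to work from this thread is 14.0.2.1.

```r
# Hypothetical helper: report whether a package is available in the current
# R library paths and, if so, which version is installed.
check_client_package <- function(pkg) {
  installed <- requireNamespace(pkg, quietly = TRUE)
  version <- if (installed) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
  list(package = pkg, installed = installed, version = version)
}

status <- check_client_package("arrow")
if (!status$installed) {
  message("arrow is not installed; install.packages(\"arrow\") resolved this issue for the reporter")
} else {
  message("arrow ", status$version, " is installed")
}
```

Comparing the reported version with the one that worked here (14.0.2.1) gives a first hint as to whether the client is picking up an older installation.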

marikaris commented 7 months ago

Arrow is installed on the CAS, but it is a very old version, installed in an unusual way. I don't remember why; we need to ask @DickPostma.