molgenis / molgenis-service-armadillo

Armadillo; a DataSHIELD implementation, part of the MOLGENIS suite
https://molgenis.github.io/molgenis-service-armadillo/
GNU Lesser General Public License v3.0

Assigning data via the central analysis server results in error #637

Closed: marikaris closed this issue 8 months ago

marikaris commented 9 months ago

How to reproduce

CAS + OIDC + Armadillo 4

Log in to the central analysis server (CAS) and run this code against an Armadillo 4 server:

library(DSI)
library(dsBaseClient)
library(DSMolgenisArmadillo)

# Armadillo 4 server to connect to
url <- "https://dev-armadillo.molgenis.org/"
# Obtain an OIDC token for this server
token <- armadillo.get_token(url)
builder <- DSI::newDSLoginBuilder()
builder$append(server = "gecko", url = url, token = token, table = "gecko/2_1_core_1_0/non_rep", driver = "ArmadilloDriver", profile = "xenon")
logindata <- builder$build()
# Log in and assign two columns of the table to the symbol "core_nonrep"
conns <- datashield.login(logins = logindata, symbol = "core_nonrep", variables = c("coh_country", "height_m"), assign = TRUE)

Output:

Logging into the collaborating servers
  Logged in all servers [================================================================] 100% / 2s

Assigning table data...
  Assigning gecko (gecko/2_1_core_1_0/non_rep) [===================>---------------------]  50% / 0s
Error in json_content$status : $ operator is invalid for atomic vectors
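
For context, R raises this message whenever `$` is applied to an atomic vector (for example a character string) instead of a list. The sketch below reproduces it under the assumption, which the output above does not confirm, that the client received a body that did not parse into a JSON object. `json_content_ok`, `json_content_bad`, the status value and `get_status` are hypothetical stand-ins, not the client's actual code:

```r
# Hypothetical stand-ins for a parsed response body (not taken from the client source).
json_content_ok  <- list(status = "COMPLETED")  # a parsed JSON object becomes a list
json_content_bad <- "Unauthorized"              # a plain string is an atomic vector

# `$` works on a list
ok_status <- json_content_ok$status

# `$` on an atomic vector raises the error seen in the output above
err <- tryCatch(json_content_bad$status, error = function(e) conditionMessage(e))
print(err)

# A defensive accessor (hypothetical helper) that returns NULL instead of failing
get_status <- function(x) if (is.list(x)) x$status else NULL
```

A guard like `get_status` would only hide the symptom; whether the body really is a non-JSON string in this case still needs to be checked against the server response.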

Interestingly, despite this error, the data appears to be assigned and usable:

> ds.mean("core_nonrep$height_m", datasources = conns)
  Aggregated (meanDS(core_nonrep$height_m)) [============================================] 100% / 0s
$Mean.by.Study
      EstimatedMean Nmissing Nvalid Ntotal
gecko      60.37567        0   3000   3000

$Nstudies
[1] 1

$ValidityMessage
      ValidityMessage 
gecko "VALID ANALYSIS"

> ds.histogram(x = "core_nonrep$height_m", datasources = conns)
  Aggregated (exists("height_m", core_nonrep)) [=========================================] 100% / 0s
  Aggregated (classDS("core_nonrep$height_m")) [=========================================] 100% / 0s
  Aggregated (histogramDS1(core_nonrep$height_m,1,3,0.25)) [=============================] 100% / 0s
  Aggregated (...) [=====================================================================] 100% / 0s
Warning: gecko: 0 invalid cells
$breaks
 [1]   0.9539448  13.4542283  25.9545119  38.4547954  50.9550789  63.4553624  75.9556460  88.4559295
 [9] 100.9562130 113.4564965 125.9567801

$counts
 [1] 318 315 330 297 350 270 315 293 325 187

$density
 [1] 0.008479808 0.008399809 0.008799800 0.007919820 0.009333122 0.007199837 0.008399809 0.007813156
 [9] 0.008666470 0.004986554

$mids
 [1]   7.204087  19.704370  32.204654  44.704937  57.205221  69.705504  82.205788  94.706071
 [9] 107.206355 119.706638

$xname
[1] "xvect"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

In the logs we see the following happening:

14:44:11.181 [http-nio-8080-exec-9|] INFO  o.m.armadillo.audit.AuditLogger - AuditEvent [timestamp=2024-02-12T14:44:11.181748319Z, principal=anonymousUser, type=AUTHORIZATION_FAILURE, data={details=WebAuthenticationDetails [RemoteIpAddress=13.69.83.226, SessionId=2FD0F3776740E4DA632DD423516C0783]}]

After raising the log level in application.yml (logging.level.org.apache.coyote.http11.Http11InputBuffer=DEBUG), I saw the following request headers:

Host: dev-armadillo.molgenis.org
Connection: close
user-agent: libcurl/7.88.1 r-curl/5.1.0 httr/1.4.7
accept-encoding: deflate, gzip
accept: application/json, text/xml, application/xml, */*
cookie: JSESSIONID=0829302C1214B8807988E1EC28F1CA2A
x-forwarded-proto: https
x-forwarded-for: 13.69.83.226

]

Local RStudio + OIDC + Armadillo 4

When we run the same code in our local RStudio (not via the CAS), we don't get an error and we see this in the logs:

14:44:51.848 [http-nio-8080-exec-5|] INFO  o.m.armadillo.audit.AuditLogger - AuditEvent [timestamp=2024-02-12T14:44:51.848782113Z, principal=m.k.slofstra@umcg.nl, type=AUTHENTICATION_SUCCESS, data={details=WebAuthenticationDetails [RemoteIpAddress=144.178.245.101, SessionId=545A3D4A6A744DC37CC7DF0513370119], authorities=[ROLE_GECKO_RESEARCHER]}]
14:44:51.883 [http-nio-8080-exec-6|] INFO  o.m.armadillo.audit.AuditLogger - AuditEvent [timestamp=2024-02-12T14:44:51.882971179Z, principal=m.k.slofstra@umcg.nl, type=AUTHENTICATION_SUCCESS, data={details=WebAuthenticationDetails [RemoteIpAddress=144.178.245.101, SessionId=545A3D4A6A744DC37CC7DF0513370119], authorities=[ROLE_GECKO_RESEARCHER]}]

and this:

Host: dev-armadillo.molgenis.org
Connection: close
user-agent: libcurl/7.88.1 r-curl/5.1.0 httr/1.4.7
accept-encoding: deflate, gzip
accept: application/json, text/xml, application/xml, */*
authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6Ik5Ld1I5cXZQQTVNVWlrT0ptbFJ0Y080bkE4ayJ9.eyJhdWQiOiIwOTBlNDVmZi0xY2EyLTQyMGMtOTEzOS05MDU0NWExNGU2MzMiLCJleHAiOjE3MDc4MzM1NTEsImlhdCI6MTcwNzgyOTk1MSwiaXNzIjoiaHR0cHM6Ly9saWZlY3ljbGUtYXV0aC5tb2xnZW5pcy5vcmciLCJzdWIiOiI5OGJkYjAxYS02NDQ1LTRkYmUtODUwNS01ZjM4NzIxMTVhYjMiLCJqdGkiOiJlOTc3OTE0Yi1iYTAxLTQ2MWUtYjdmMi1jNTQ5NTU3ZDc1MGQiLCJhdXRoZW50aWNhdGlvblR5cGUiOiJQQVNTV09SRCIsImVtYWlsIjoibS5rLnNsb2ZzdHJhQHVtY2cubmwiLCJlbWFpbF92ZXJpZmllZCI6dHJ1ZSwiYXRfaGFzaCI6IktZclc0ZG4zd19EWV9DWVRZVnUyUnciLCJhcHBsaWNhdGlvbklkIjoiMDkwZTQ1ZmYtMWNhMi00MjBjLTkxMzktOTA1NDVhMTRlNjMzIiwicm9sZXMiOltdfQ.vmgtzgD54nGJ9KBVl6BD3PQZIHMfIF1NT8ZNs5fQWtPbUro85vQcuWU9cKbo7bxO_pah_RH1MlJR-OJQtQhuNbAYNbd62PkThS_gxfLb1gpd0hDYT5299LOUHqM5Fr-bMKxnORsg1ooaUtF9vR-b1ya3xnCK2rSsCGk8Jp3UpoNMNcD5IM6GcZFsRyrp7DvYrXQl109AOpYXmIxT81QP8BcPOXSVHbkc9aOuVToxH3HbCXpAbKsGMldh3-1Jr8asF910OrtgdiTe64qVrkM6IkFF6oeBDvEbVeJU5c-OoabNt8fjGDKUBr6JySkpjZpijak5pMuWj7CCkQUDh8bjZw
cookie: JSESSIONID=A0DC514CD933EC9D43D411AC45A6AFA7
x-forwarded-proto: https
x-forwarded-for: 13.69.83.226

]

This suggests that the token is not passed when using the CAS, which would explain the AUTHORIZATION_FAILURE event and the anonymousUser principal.
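
The comparison of the two header dumps can be sketched in base R. This is only an illustration: `parse_headers` is a hypothetical helper, and the header values are abbreviated from the dumps above with placeholders for the session id and token:

```r
# Hypothetical helper: turn "name: value" lines into a named character vector.
parse_headers <- function(lines) {
  values <- sub("^[^:]*:\\s*", "", lines)           # text after the first colon
  names(values) <- tolower(sub(":.*$", "", lines))  # text before the first colon
  values
}

# Abbreviated copies of the two dumps; "<...>" values are placeholders.
cas_headers <- parse_headers(c(
  "Host: dev-armadillo.molgenis.org",
  "Connection: close",
  "accept: application/json, text/xml, application/xml, */*",
  "cookie: JSESSIONID=<session id>"
))
local_headers <- parse_headers(c(
  "Host: dev-armadillo.molgenis.org",
  "Connection: close",
  "accept: application/json, text/xml, application/xml, */*",
  "authorization: Bearer <token>",
  "cookie: JSESSIONID=<session id>"
))

# Headers present in the local request but absent from the CAS request
missing_in_cas <- setdiff(names(local_headers), names(cas_headers))
print(missing_in_cas)
```

With these inputs the only header missing from the CAS request is `authorization`, which matches what the dumps show.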

marikaris commented 9 months ago

Tried different combinations (summary). X marks absence of error:

| | Armadillo 3.4 | Armadillo 4.1 |
| -- | -- | -- |
| CAS + OIDC | X | |
| Local + OIDC | X | X |
| CAS + Basic auth | X | X |
| Local + Basic auth | X | X |

Additionally: Rock on Armadillo 3.4 succeeds in all cases (CAS/Local, OIDC/Basic).

Tried with resources instead of tables as well. In that case we still get the "$ operator is invalid for atomic vectors" error (only when running on the CAS using OIDC), but authentication succeeds: the user appears to be set and the token is passed. Since that error is not very specific and we're not sure it is related, we'll focus on the issue with tables first.

marikaris commented 9 months ago

Apparently, removing all variables in RStudio and restarting R may fix the issue.
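
A minimal sketch of that cleanup step, assuming that stale session objects (such as an old `conns` or `token`) are what is left behind. `clear_workspace` is a hypothetical helper, and it is demonstrated on a scratch environment so that running it does not wipe a real session:

```r
# Hypothetical helper: remove every object from an environment
# (the global environment by default) and return how many remain.
clear_workspace <- function(env = globalenv()) {
  rm(list = ls(envir = env, all.names = TRUE), envir = env)
  invisible(length(ls(envir = env, all.names = TRUE)))
}

# Demonstration on a scratch environment with stand-in objects
scratch <- new.env()
assign("conns", "stale connection object", envir = scratch)
assign("token", "stale token", envir = scratch)
remaining <- clear_workspace(scratch)
print(remaining)

# In the real session: call clear_workspace(), then restart R
# (in RStudio: Session > Restart R) before logging in again.
```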