ctsit / redcapcustodian

Simplified, automated data management on REDCap systems

`sync_table_2` regex does not properly escape a string replacement #80

Open ChemiKyle opened 1 year ago

https://github.com/ctsit/redcapcustodian/blob/a559cba7293a67b5c3fd7cb8e8ca4c6b11e6f36c/R/write_data.R#L231

The `gsub` call on this line treats `.` as a regex wildcard (any single character), not as a literal `.` character.
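A minimal sketch of the difference, using a made-up column name rather than the actual code in `write_data.R`. The name `pdf_custom_header_text.x` is only an illustration of a column that contains an `x` before the suffix; the escaped and `fixed = TRUE` variants are possible fixes, not necessarily the one the package should adopt.

```r
# Illustrative column name: it contains an "x" ahead of the ".x" suffix
col <- "pdf_custom_header_text.x"

# Unescaped pattern: "." matches any character, so the "ex" in "text"
# is removed along with the trailing ".x"
unescaped <- gsub(".x", "", col)

# Escaped and anchored pattern: only a literal ".x" at the end is removed
escaped <- gsub("\\.x$", "", col)

# fixed = TRUE treats the pattern as a plain string instead of a regex
literal <- sub(".x", "", col, fixed = TRUE)

print(c(unescaped = unescaped, escaped = escaped, literal = literal))
```

The unescaped call yields `pdf_custom_header_tt`, while the other two yield `pdf_custom_header_text`.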

To reproduce:

library(tidyverse)
library(redcapcustodian)

conn <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")

table_names <- c(
  "redcap_projects"
)

for (table_name in table_names) {
  rcc.billing::create_and_load_test_table(
    table_name = table_name,
    conn = conn,
    load_test_data = TRUE,
    is_sqllite = TRUE
  )
}

redcap_projects <- tbl(conn, "redcap_projects") %>%
  collect()

updated_projects <- redcap_projects %>%
  mutate(status = 1)

# throws error
project_update_sync_activity <- sync_table_2(
  conn = conn,
  table_name = "redcap_projects",
  source = updated_projects,
  source_pk = "project_id",
  target = redcap_projects,
  target_pk = "project_id"
)

# successful
project_update_sync_activity <- sync_table_2(
  conn = conn,
  table_name = "redcap_projects",
  source = updated_projects %>%
    select(-contains("x")),
  source_pk = "project_id",
  target = redcap_projects,
  target_pk = "project_id"
)
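The second call likely succeeds because dropping every column whose name contains an `x` leaves nothing for the wildcard to mangle. A rough way to check which names are affected, assuming the `.x` suffix is appended to column names before the `gsub` call strips it (the column names below are hypothetical stand-ins, not read from the test table):

```r
# Hypothetical column names standing in for a wide table
cols <- c(
  "project_id",
  "status",
  "pdf_custom_header_text",
  "data_entry_trigger_url"
)

# Add the ".x" suffix, then strip it with the unescaped pattern
stripped <- gsub(".x", "", paste0(cols, ".x"))

# Names that do not round-trip would no longer match a real column
mangled <- cols[stripped != cols]
print(mangled)
```

Here only `pdf_custom_header_text` fails to round-trip, which would be consistent with a "no such column" error once such a column is included.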

Erroneous diagnosis:

https://github.com/ctsit/redcapcustodian/blob/a559cba7293a67b5c3fd7cb8e8ca4c6b11e6f36c/R/write_data.R#L246-L251

While building an ETL, I came across an odd limitation in `sync_table_2` when investigating a failure to update `redcap_projects`. On my machine, `update_records` has a hard upper limit of 133 columns; at 134 I get Error: no such column:.