ljwoodley closed 6 months ago
Some of your results don't match those from Philip's example.
library(dplyr)
library(REDCapR)

p1414_token <- "your_token_here"
# Read the service request records from REDCap
service_requests <- redcap_read(
  redcap_uri = Sys.getenv("URI"),
  token = p1414_token
)$data
rcc_billing_conn <- connect_to_rcc_billing_db()
# Alias the connection under the name used in Philip's example below
rc_billing_conn <- rcc_billing_conn
result_l <- get_ctsi_study_id_to_project_id_map(service_requests, rcc_billing_conn)
# Read invoice line item records
extant_invoice_line_items <- tbl(rc_billing_conn, "invoice_line_item") |>
  collect()
# Get unique, modern CTSI Study IDs for each REDCap Project
# Get them from extant_invoice_line_items. We need both the annual project
# billing line items and the service_request line items.
# We have to join the latter to the service request history to map
# service_request line items to the PIDs they relate to.
result_p <-
  bind_rows(
    extant_invoice_line_items |>
      filter(service_type_code == 1 & !is.na(ctsi_study_id)) |>
      arrange(desc(id)) |>
      select(id, service_type_code, service_identifier, ctsi_study_id) |>
      rename(project_id = service_identifier),
    extant_invoice_line_items |>
      filter(service_type_code == 2 & !is.na(ctsi_study_id)) |>
      arrange(desc(id)) |>
      select(id, service_type_code, service_identifier, ctsi_study_id) |>
      inner_join(
        service_requests |>
          select(record_id, project_id) |>
          mutate(project_id = as.character(project_id)) |>
          mutate(record_id = as.character(record_id)),
        by = c("service_identifier" = "record_id")
      ) |>
      # Keep id so the arrange(desc(id)) after bind_rows() can order these rows too
      select(id, service_type_code, project_id, ctsi_study_id)
  ) |>
  arrange(desc(id)) |>
  distinct(project_id, ctsi_study_id)
result_l
result_p
foo <- inner_join(result_l, result_p, by = "project_id")
foo |>
  filter(ctsi_study_id.x != ctsi_study_id.y)
# 32 rows
That's because those project IDs map to multiple study IDs.
result_p |>
  add_count(project_id) |>
  filter(n > 1) |>
  arrange(project_id)
How should the duplicated project_ids be handled @pbchase?
> That's because those project IDs map to multiple study IDs. ... How should the duplicated project_ids be handled @pbchase?
Ah, now I understand. I have seen this before. The executive decision I made last time was to use the most modern ctsi_study_id for the project ID. I base this on my assumption that the second entry is more likely to be right.
CTSI study IDs also map to multiple project IDs. When this occurs the max project ID is kept.
> CTSI study IDs also map to multiple project IDs. When this occurs the max project ID is kept.
No, you should keep all of the project IDs that map to a single CTSI_Study_ID.
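The two rules discussed above (use the most modern ctsi_study_id for each project ID, but keep every project ID that shares a CTSI study ID) can be sketched with dplyr. This is only an illustration on invented toy data, not the code from this PR: the `line_items` table and its values are made up, and it assumes that a larger `id` means a more recent record.

```r
library(dplyr)

# Toy data, invented for illustration. `id` stands in for the invoice line
# item id; the assumption here is that a larger id is a more recent record.
line_items <- tibble::tribble(
  ~id, ~project_id, ~ctsi_study_id,
    1,       "101",             11,
    2,       "101",             12,
    3,       "102",             12,
    4,       "103",             13
)

project_to_study <- line_items |>
  # Most recent records first
  arrange(desc(id)) |>
  # Keep only the first (most recent) row for each project_id
  distinct(project_id, .keep_all = TRUE) |>
  select(project_id, ctsi_study_id) |>
  arrange(project_id)

project_to_study
```

In this toy data, project 101 has two study IDs and only the more recent one survives, while study ID 12 still maps to both projects 101 and 102 because nothing deduplicates on ctsi_study_id.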
I love this enormous function, but I have one issue--there is no test. I spec'd it so it could be testable, but there is no test.
I'd like you to write the test. I have been writing the tests for these functions but I need to pass the torch. To that end, I documented how I do it. It's a bit involved, but I find it liberating. I hope I did a decent job of documenting it. I hope you like the method.
Please read "Unit tests with testthat", try to write a script that makes the test data from the real data, and try to write one test. Feel free to ask questions. I haven't had a lot of time to polish this. You are the first tester.
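For orientation, a single testthat test for this kind of mapping might look roughly like the sketch below. It does not follow the method in the document mentioned above, which builds test data from the real data. Here the test data is hard-coded and invented, and `map_projects_to_latest_study_id()` is a hypothetical stand-in, not the function in this PR (which takes the service requests and a database connection).

```r
library(testthat)
library(dplyr)

# Hypothetical stand-in for the function under test. It keeps the most
# recent ctsi_study_id (largest id) for each project_id.
map_projects_to_latest_study_id <- function(line_items) {
  line_items |>
    arrange(desc(id)) |>
    distinct(project_id, .keep_all = TRUE) |>
    select(project_id, ctsi_study_id)
}

test_that("each project_id maps to its most recent ctsi_study_id", {
  # Invented test data: project 101 appears with two different study IDs
  test_data <- tibble::tribble(
    ~id, ~project_id, ~ctsi_study_id,
      1,       "101",             11,
      2,       "101",             12,
      3,       "102",             12
  )

  result <- map_projects_to_latest_study_id(test_data)

  # No project_id appears more than once
  expect_equal(anyDuplicated(result$project_id), 0L)
  # The later record wins for project 101
  expect_equal(result$ctsi_study_id[result$project_id == "101"], 12)
  # Both projects that share study ID 12 are kept
  expect_equal(sort(result$project_id), c("101", "102"))
})
```

A test against the real function would swap the stand-in for the packaged function and load the saved test data instead of the inline table.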
Closes issue #208