CHOP-CGTInformatics / REDCapTidieR

Makes it easy to read REDCap Projects into R
https://chop-cgtinformatics.github.io/REDCapTidieR/
Other
33 stars 8 forks source link

Improved `read_redcap()` warnings for databases with access restrictions #123

Closed ezraporter closed 1 year ago

ezraporter commented 1 year ago

Description

This PR implements an improved scheme for issuing warnings/errors when a user has restricted access to data in the REDCap project. There are 3 cases to consider:

  1. User gets back no data from any instruments because of PHI/de-identified or user doesn't have access to any requested instrument
    • User gets an error saying no data was returned
  2. User gets back partial data for an instrument because of PHI/de-identified data
    • User gets a warning listing the missing fields
  3. User gets back no data for an instrument because all data is PHI/de-identified or user doesn't have access to the instrument
    • User gets a warning that no data was returned and the instrument won't appear in the output

Proposed Changes

List changes below in bullet format:

Screenshots error

warning

Issue Addressed

closes #38

PR Checklist

Before submitting this PR, please check and verify below that the submission meets the below criteria:

Code Review

This section to be used by the reviewer and developers during Code Review after PR submission

Code Review Checklist

ezraporter commented 1 year ago

CI seems to be failing due to issues connecting to the Azure Ubuntu Archive:

Could not connect to azure.archive.ubuntu.com:80 (52.252.75.106), connection timed out

They seem to pass when rerunning but take longer than usual. May be an issue (hopefully temporary) with Azure or GitHub. Folks seem to be discussing here: https://github.com/actions/runner-images/issues/675

skadauke commented 1 year ago

I think using |> in the vignettes is good style. Using %>% in the package is good for backwards compatibility.


From: ezraporter @.> Sent: Monday, February 6, 2023 4:19 PM To: CHOP-CGTInformatics/REDCapTidieR @.> Cc: Stephan Kadauke @.>; Review requested @.> Subject: [External]Re: [CHOP-CGTInformatics/REDCapTidieR] Improved read_redcap() warnings for databases with access restrictions (PR #123)

@ezraporter commented on this pull request.


In R/checks.Rhttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FCHOP-CGTInformatics%2FREDCapTidieR%2Fpull%2F123%23discussion_r1097930375&data=05%7C01%7Ckadaukes%40chop.edu%7C602a2f13c40a485c84e908db0887cfe9%7Ca611241607b041a59bb1d146b575c975%7C0%7C0%7C638113151623920248%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=4QqGGjSqdT3nrfSeftMU28LQhbI5xv5%2F3Eg0BggzRUw%3D&reserved=0:

 select("field_name_updated", "form_name") %>%

group_by(.data$form_name) %>%

  • summarise(fields = list(.data$field_name_updated))
  • filter(!all(.data$field_name_updated %in% db_fields)) |>

Ooo good catch. Fixing this.

I did a find all to make sure and it looks like the only place we have |> is in the vignettes. Does that matter? Should we switch to %>% there to be safe?

— Reply to this email directly, view it on GitHubhttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FCHOP-CGTInformatics%2FREDCapTidieR%2Fpull%2F123%23discussion_r1097930375&data=05%7C01%7Ckadaukes%40chop.edu%7C602a2f13c40a485c84e908db0887cfe9%7Ca611241607b041a59bb1d146b575c975%7C0%7C0%7C638113151623920248%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=4QqGGjSqdT3nrfSeftMU28LQhbI5xv5%2F3Eg0BggzRUw%3D&reserved=0, or unsubscribehttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FACTGHWQWELV4ZCQQ5BBRUDLWWFTFLANCNFSM6AAAAAAUS4YEEA&data=05%7C01%7Ckadaukes%40chop.edu%7C602a2f13c40a485c84e908db0887cfe9%7Ca611241607b041a59bb1d146b575c975%7C0%7C0%7C638113151623920248%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=X%2BoiHk5G26VS2%2B84yBbZNvfxB34F5kML2J%2BG83qjZv0%3D&reserved=0. You are receiving this because your review was requested.Message ID: @.***>

This email originated from an EXTERNAL sender to CHOP. Proceed with caution when replying, opening attachments, or clicking links. Do not disclose your CHOP credentials, employee information, or protected health information to a potential hacker.

skadauke commented 1 year ago

I think we want to stick to "instruments" in the messages since that word has a more specific meaning in the context of REDCap than "forms". We use "forms" in the code since the REDCap API uses "forms".

From the Glossary:

[cid:d28b5724-c0df-4b9c-92be-7396a6e4a55b]


From: Rich Hanna @.> Sent: Monday, February 6, 2023 4:07 PM To: CHOP-CGTInformatics/REDCapTidieR @.> Cc: Stephan Kadauke @.>; Review requested @.> Subject: [External]Re: [CHOP-CGTInformatics/REDCapTidieR] Improved read_redcap() warnings for databases with access restrictions (PR #123)

@rsh52 requested changes on this pull request.

Logic is all sound, a few comments about language and syntax.


In R/checks.Rhttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FCHOP-CGTInformatics%2FREDCapTidieR%2Fpull%2F123%23discussion_r1097911165&data=05%7C01%7Ckadaukes%40chop.edu%7C250eb80dcfca4770662708db08862476%7Ca611241607b041a59bb1d146b575c975%7C0%7C0%7C638113144480203709%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=RbKCygHxY3jgF9IHpcgQ%2Blb6zaIJwW2AxHdX5duH9VA%3D&reserved=0:

 select("field_name_updated", "form_name") %>%

group_by(.data$form_name) %>%

  • summarise(fields = list(.data$field_name_updated))
  • filter(!all(.data$field_name_updated %in% db_fields)) |>

Unsure if I've missed this in other PRs, but I don't think we've updated this package to use base R pipes yet did we? We may want to double check since I see in the difference here that there's a mix of %>% and |>


In R/checks.Rhttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FCHOP-CGTInformatics%2FREDCapTidieR%2Fpull%2F123%23discussion_r1097916604&data=05%7C01%7Ckadaukes%40chop.edu%7C250eb80dcfca4770662708db08862476%7Ca611241607b041a59bb1d146b575c975%7C0%7C0%7C638113144480203709%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=C8alT0K9udVVFgnxWf9yZHk3In5uw9aWOGDSK%2Bwr2QM%3D&reserved=0:

  • missing_db_metadata <- db_metadata %>% # nolint: object_usage_linter
  • filter(!.data$field_name_updated %in% names(db_data)) %>%
  • Set condition components used in both error and warning

  • msg_info <- c("i" = "This can happen when the user privileges are not set to allow exporting certain
  • instruments via the API.")
  • cond_class <- c("redcap_user_rights", "REDCapTidieR_cond")
  • db_fields <- names(db_data)
  • Fields in metadata excluding record ID

  • metadata_fields <- setdiff(db_metadata$field_name_updated, get_record_id_field(db_data))
  • If no data fields were returned the user has no data access

  • if (length(intersect(db_fields, metadata_fields)) == 0) {
  • cli_abort(
  • c(
  • "x" = "The REDCap API returned no data for any of the requested instruments.",

I think we should replace all messages that say "instruments" with "forms" since the argument is forms = and it's also the column name in the output super tibble (redcap_form_name).


In R/checks.Rhttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FCHOP-CGTInformatics%2FREDCapTidieR%2Fpull%2F123%23discussion_r1097916785&data=05%7C01%7Ckadaukes%40chop.edu%7C250eb80dcfca4770662708db08862476%7Ca611241607b041a59bb1d146b575c975%7C0%7C0%7C638113144480203709%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=AivJonY8URE8AtsDjHV3KRfXTBN9Jzko2AK8EX4qEeM%3D&reserved=0:

  • missing_db_metadata,
  • .f = function(form_name, missing_fields, all_fields_missing) {
  • if (all_fields_missing) {
  • cli_text("Instrument {.code {form_name}} returned no data and will be removed from the output.") |>
  • cli_fmt(collapse = TRUE, strip_newline = TRUE)
  • } else {
  • cli_text("Instrument {.code {form_name}} is missing {qty(missing_fields)} field{?s}
  • {.code {missing_fields}}.") |>
  • cli_fmt(collapse = TRUE, strip_newline = TRUE)
  • }
  • })
  • names(msg) <- rep("!", length(msg))
  • msg <- c(
  • "!" = "{qty(missing_db_metadata$form_name)}{?An/Some} instrument{?s} exist{?s/} in the

See instrument/form comment above

— Reply to this email directly, view it on GitHubhttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FCHOP-CGTInformatics%2FREDCapTidieR%2Fpull%2F123%23pullrequestreview-1285999987&data=05%7C01%7Ckadaukes%40chop.edu%7C250eb80dcfca4770662708db08862476%7Ca611241607b041a59bb1d146b575c975%7C0%7C0%7C638113144480203709%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=rsoeGeaiAzWTLY07FJh2jdzTtbQLGv%2BLX6cGe3Z0gvA%3D&reserved=0, or unsubscribehttps://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FACTGHWWPYAJ3Y5I7BW4OO3TWWFRYPANCNFSM6AAAAAAUS4YEEA&data=05%7C01%7Ckadaukes%40chop.edu%7C250eb80dcfca4770662708db08862476%7Ca611241607b041a59bb1d146b575c975%7C0%7C0%7C638113144480203709%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dQ%2Bakqbw8cCXP2ZhiKM7mJaeE360qhfA%2Bk2W0O5%2FbQY%3D&reserved=0. You are receiving this because your review was requested.Message ID: @.***>

This email originated from an EXTERNAL sender to CHOP. Proceed with caution when replying, opening attachments, or clicking links. Do not disclose your CHOP credentials, employee information, or protected health information to a potential hacker.