Analyticsphere / metricsReportsRequests

Used to provide issue tracking for changes and additions to the Connect Metrics reporting.
MIT License

RCA Metrics Report requirements #163

Open brotzmanmj opened 2 months ago

brotzmanmj commented 2 months ago

Requirements for a new weekly metrics report on RCA (Rapid Case Ascertainment). Suggest a new standalone report rather than adding to an existing report.

https://nih.app.box.com/file/1396324607069

Suggest a meeting to discuss before work begins on this, when the analytics team is ready to start.

KELSEYDOWLING7 commented 2 months ago

@brotzmanmj If/when the Operations report edits have been finalized I will have time to work on this. Please let me know when you'd like to meet to discuss before I start

brotzmanmj commented 2 months ago

Great! @cunnaneaq can you please set up a time for you, me and Kelsey to meet to review these requirements? Thanks

KELSEYDOWLING7 commented 1 month ago

I understand we still need to meet and discuss logistics and possible changes, but here is what the requested tables and plots would look like so far, based on Stage data. Right now there are only two sites with test data, and I don't believe there are any variables for the cancer stage just yet.

RCA-Metrics.pdf

cunnaneaq commented 1 month ago

Thank you @KELSEYDOWLING7, it's great to see these come to life! We will review the variables as part of the meeting on 10/9, but the cancer stage variable is no longer being sent for RCA, and we are also removing non-melanoma skin from the primary cancer site in the October release. Joe has documented the structure here.

brotzmanmj commented 1 month ago

Hi Kelsey, this is a great first draft. A few things we can fix now... Please fix spelling errors throughout. Have another person on the analytics team review to confirm all are identified and fixed.

Table 2: Update the footnote "1 As this is a select all question, participants may select multiple responses." Since this is not a survey, selecting multiple responses doesn't apply here. Please change it to: "Participants may have more than one cancer site".

Figure 1:

Table 4: In title, instead of 'as reported by Health Care Site' please change to 'as identified by RCA'

Tables 4 and 5 are the same (I prefer Table 4 over Table 5) so we should drop Table 5 unless this was intended to show something else that we're not quite capturing here.

Table 6: In title, instead of 'as reported by Health Care Site' please change to 'as identified by RCA'

Tables 7 and 8: Add Total row on bottom of each. And we need to add something to indicate what the percentage is out of. Is it out of the number of verified participants of the health care system in question?

Figure 2: Put this directly after Figure 1. Same date changes requested for figure 1.

Table 18: Change title to 'Vital Status of Participants with Cancer per Manual Chart Review at time of RCA data submission'

On the first page, the report title should be 'Connect Rapid Case Ascertainment (RCA) Metrics' and that first page also needs a description of what the RCA is. @cunnaneaq can you draft a few sentences for this?

KELSEYDOWLING7 commented 1 month ago

Got it, I've made all the adjustments listed above, apart from the RCA description. Non-melanoma will be removed from the counts in prod, but because it has been selected in stage I kept it in so the counts make sense. RCA-Metrics.pdf

brotzmanmj commented 1 month ago

Thanks Kelsey!

cunnaneaq commented 1 month ago

Adding some links and documentation here: RCA SOP Draft, RCA data structure

Inclusion criteria:

| Variable | Variable Concept ID | Response | Response Concept ID |
| --- | --- | --- | --- |
| Verification status | 821247024 | Verified | 197316935 |
| Consent submitted | 919254129 | Yes | 353358909 |
| HIPAA authorization | 558435199 | Yes | 353358909 |
| Withdraw consent | 747006172 | No | 104430631 |
| HIPAA Revoked | 773707518 | No | 104430631 |
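As a rough illustration of how the inclusion criteria in the table above could be applied, here is a minimal Python sketch. It assumes each participant record is a flat dict keyed by the variable concept ID, which is a guess about the data layout and not something confirmed in this thread; `is_included` is a hypothetical helper name.

```python
# Response concept IDs copied from the inclusion criteria table.
VERIFIED = "197316935"
YES = "353358909"
NO = "104430631"

# Variable concept ID -> required response concept ID.
INCLUSION = {
    "821247024": VERIFIED,  # Verification status = Verified
    "919254129": YES,       # Consent submitted = Yes
    "558435199": YES,       # HIPAA authorization = Yes
    "747006172": NO,        # Withdraw consent = No
    "773707518": NO,        # HIPAA revoked = No
}


def is_included(participant: dict) -> bool:
    """Return True only if every inclusion variable has the required response.

    Assumes a flat record keyed by concept ID (hypothetical layout).
    Values are compared as strings so integer IDs are tolerated;
    a missing variable counts as a failure.
    """
    return all(
        str(participant.get(var)) == resp for var, resp in INCLUSION.items()
    )
```

A participant who has withdrawn consent, or whose record is missing any of the five variables, would be excluded under this sketch.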

Rules: If more than one primary cancer is diagnosed on the same date, submit each primary cancer as a separate occurrence (including bilateral cancers). The unique identifier for multiple primary cancers will be the accession number.

If the participant is confirmed “alive” by manual chart review, the API will only allow a response of “yes” to “Has the cancer diagnosis and the participant's awareness of their diagnosis been confirmed by manual review of the participant's chart?” “No” and “unknown” responses will be blocked.

If the participant is confirmed “dead” by manual chart review, the API will accept all responses (“yes”, “no”, and “unknown”) to “Has the cancer diagnosis and the participant's awareness of their diagnosis been confirmed by manual review of the participant's chart?” However, the CCC will not administer a cancer diagnosis survey to participants that meet these criteria.

If the participant’s vital status is “unknown” after manual chart review, the API will accept all responses (“yes”, “no”, and “unknown”) to “Has the cancer diagnosis and the participant’s awareness of their diagnosis been confirmed by manual review of the participant’s chart?” However, the CCC will not administer a cancer diagnosis survey to participants that meet these criteria.

The updateParticipantData API will not allow RCA operational data to be updated or overwritten.

KELSEYDOWLING7 commented 1 month ago

Thanks! I've added those rules. Should I anticipate the accession ID will be sent from the sites and thus in the cancerOccurance table?

cunnaneaq commented 1 month ago

yes, that's right

brotzmanmj commented 1 month ago

Thanks Kelsey! Just let us know when you want us to review and see if there is anything else we should add to the QC

cunnaneaq commented 1 month ago

Description: This rapid case ascertainment (RCA) metrics report contains tables and plots from RCA data that the participating healthcare sites send to the Connect Coordinating Center (CCC). The RCA variables in this report are a limited set of necessary variables decided upon by the RCA Working Group and the CCC. The variables consist of data pulled from the electronic health record and data that were manually reviewed by site representatives. The purpose of receiving RCA data is to identify Connect participants who have been diagnosed with cancer sooner than this information is available for linkage in state tumor registries. These variables are only to be used operationally.

@brotzmanmj @KELSEYDOWLING7 please let me know if you have any edits to the description above. I used the Weekly Operations Report as an example. I think that it makes sense for this report to be run monthly, since that's how often sites will send data.

KELSEYDOWLING7 commented 1 month ago

Thanks Aileen! I think this makes sense

brotzmanmj commented 1 month ago

This is perfect and comprehensive, thank you

KELSEYDOWLING7 commented 1 month ago

These are the current RCA rules. As we discussed, we will have to see if RCAOper_CancerFlag_v1r0 is being fully removed from prod. If so, I'll update the rules accordingly.

  1. If RCAOper_CancerFlag_v1r0=yes, then all cancer sites must be either yes or no; cannot be null
  2. RCAOper_CancerFlag_v1r0 must be yes or null, with null as the default
  3. If RCAOper_CancerFlag_v1r0=yes, then RCAOcc_CancerDt_v1r0 must be populated
  4. If 'other' cancer site is selected, then the text box must be populated
  5. If vital status is alive, then awareness of diagnosis must be yes
  6. If vital status is deceased, then awareness of diagnosis can be yes, no, or null
  7. If vital status is unknown, then awareness of diagnosis can be yes, no, or null
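
For reference, the seven rules above could be sketched in Python roughly as follows. Only `RCAOper_CancerFlag_v1r0` and `RCAOcc_CancerDt_v1r0` are names from this thread. The other keys (`cancer_sites`, `other_site_text`, `vital_status`, `awareness`) and the plain "yes"/"no" string values are placeholders assumed for illustration, not the actual variable names or concept IDs.

```python
def check_rca_record(rec: dict) -> list:
    """Return the numbers of the rules (1-7) that a single record violates.

    Assumes responses are stored as "yes"/"no"/None and that cancer sites
    live in a dict under the hypothetical key "cancer_sites".
    """
    failed = []
    flag = rec.get("RCAOper_CancerFlag_v1r0")
    sites = rec.get("cancer_sites", {})

    # Rule 1: with the flag set, every cancer site must be yes or no.
    if flag == "yes" and any(v not in ("yes", "no") for v in sites.values()):
        failed.append(1)
    # Rule 2: the flag itself may only be yes or null.
    if flag not in ("yes", None):
        failed.append(2)
    # Rule 3: with the flag set, the cancer date must be populated.
    if flag == "yes" and not rec.get("RCAOcc_CancerDt_v1r0"):
        failed.append(3)
    # Rule 4: 'other' cancer site requires the free-text box.
    if sites.get("other") == "yes" and not rec.get("other_site_text"):
        failed.append(4)

    vital = rec.get("vital_status")
    aware = rec.get("awareness")
    # Rule 5: alive requires awareness = yes.
    if vital == "alive" and aware != "yes":
        failed.append(5)
    # Rules 6 and 7: deceased or unknown allow yes, no, or null.
    if vital == "deceased" and aware not in ("yes", "no", None):
        failed.append(6)
    if vital == "unknown" and aware not in ("yes", "no", None):
        failed.append(7)
    return failed
```

Returning rule numbers rather than a single pass/fail keeps the output easy to map back to the list when reviewing a QC report.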

I'm not quite sure how to implement this rule as code: "If more than one primary cancer is diagnosed on the same date, submit each primary cancer as a separate occurrence (including bilateral cancers). The unique identifier for multiple primary cancers will be the accession number." I'm thinking of splitting this up into multiple rules:

brotzmanmj commented 1 month ago

Hi Kelsey,

For "submit each primary cancer as a separate occurrence", I think we need to check that in any given occurrence (i.e. record, or row in the cancer occurrences table), there is one and only one cancer site = yes; the rest should be 'no' within that occurrence. I don't think this requires comparing multiple occurrences for the same Connect ID to enforce this rule.
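
A minimal sketch of that per-occurrence check, assuming the cancer site responses for one occurrence are available as a dict of "yes"/"no" strings (the function name and data layout are hypothetical):

```python
def has_single_primary_site(sites: dict) -> bool:
    """True if exactly one cancer site is 'yes' and all others are 'no'.

    `sites` maps a cancer site name to its response for ONE occurrence.
    Null or unexpected values fail the check.
    """
    values = list(sites.values())
    return values.count("yes") == 1 and all(v in ("yes", "no") for v in values)
```

Because the check only looks inside one occurrence, it needs no grouping by Connect ID.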

However, at the start, @cunnaneaq, do we want to keep, in a secure location on Box, a list of all Connect IDs with more than one occurrence along with their data: date of diagnosis, cancer site, accession ID (if any), and health care system ('Site')? This would just be for us to keep an eye on data related to multiple occurrences and make sure it is being handled as we expect.
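
One way such a list could be assembled, sketched in Python under the assumption that each occurrence is a dict with a `connect_id` key (a placeholder name, not the real column):

```python
from collections import defaultdict


def multiple_occurrences(rows: list) -> dict:
    """Group occurrence rows by Connect ID and keep IDs with more than one row.

    Each row is a dict; "connect_id" is an assumed key name.
    Returns {connect_id: [rows...]} for review.
    """
    by_id = defaultdict(list)
    for row in rows:
        by_id[row["connect_id"]].append(row)
    return {cid: recs for cid, recs in by_id.items() if len(recs) > 1}
```

Each returned group keeps the full rows, so date of diagnosis, cancer site, accession ID, and site can be written out together.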

"The unique identifier for multiple primary cancers will be the accession number." I see your point, to make sure the same accession ID is not submitted for multiple records. @cunnaneaq what do you think? Do we want the program to compare the accession ID values across all occurrences and ensure they are all unique? I would say we need to do this only within healthcare system because they could be legitimately duplicative across health care systems.

cunnaneaq commented 4 weeks ago

I agree with your first point @brotzmanmj

I like the idea of keeping an eye on the Connect IDs that have multiple occurrences in Box.

Yes, if possible, requiring unique accession numbers within each healthcare site would be good, as long as this value being null doesn't mess up the program
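
A sketch of a within-site uniqueness check that tolerates null accession numbers might look like the following. The `site` and `accession_id` keys are assumed names, and skipping null or empty values is one possible way to handle the concern above, not a confirmed design.

```python
from collections import Counter


def duplicate_accessions(rows: list) -> list:
    """Return sorted (site, accession_id) pairs that appear more than once.

    Null or empty accession IDs are skipped, so missing values never count
    as duplicates. The same accession ID at two different sites is allowed.
    """
    counts = Counter(
        (row["site"], row["accession_id"])
        for row in rows
        if row.get("accession_id") not in (None, "")
    )
    return sorted(pair for pair, n in counts.items() if n > 1)
```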

KELSEYDOWLING7 commented 4 weeks ago

Thanks for the feedback! I've tested the code for these new rules in stage and they work! Once we have prod data I'll have this custom QC report run alongside the automated RCA QC report.