sara-chronister / syndrome-definition-evaluation

R code for evaluation of NSSP BioSense ESSENCE syndrome definition results using ESSENCE APIs.

Dev2024 Branch: Detect_Elements() Bug -- #26

Closed DOH-TJB0303 closed 3 months ago

DOH-TJB0303 commented 4 months ago


Background:

Development of detect_elements() in 2_process_data.R used example definitions that looked for element and visit matches across multiple fields (i.e., defX_list$setup$detect_elements_fields included more than one field, e.g. ChiefComplaintUpdates, DischargeDiagnosis).

In lines 101-108 of 2_process_data.R, one detect_elements() output data frame is created per field, and these are then bound together (via bind_rows()) by C_BioSense_ID into a main detect_elements data frame with everything to review.

The problem was that bind_rows() would produce duplicate column variables (C_BSI...1, C_BSI...17, etc.), which had to be cleaned up. Line 104 does that cleanup, and it is the line causing the error.

Fix:

Use reduce(left_join) to eliminate the duplicate columns and have this code work even if only 1 field is being used for detect_elements(). Additionally, the creation of the TruePositive column needs to be removed from detect_elements() and done outside of the function instead (otherwise each call to the function would add its own TruePositive column, producing many of them).
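A rough sketch of the idea, not the actual code in 2_process_data.R. `detect_one_field()` is a hypothetical stand-in for detect_elements(), and the toy data, the `_match` column naming, and the `NA` placeholder for TruePositive are all assumptions made for illustration:

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for detect_elements(): returns the ID column plus
# one logical match column for a single field. It does NOT create TruePositive.
detect_one_field <- function(data, field, pattern) {
  out <- data %>% select(C_BioSense_ID)
  out[[paste0(field, "_match")]] <- grepl(pattern, data[[field]], ignore.case = TRUE)
  out
}

# Toy data (made up for this example)
visits <- tibble(
  C_BioSense_ID = c("a1", "a2", "a3"),
  ChiefComplaintUpdates = c("FEVER COUGH", "ANKLE PAIN", "COUGH"),
  DischargeDiagnosis = c("J11.1", "S93.4", "R05")
)

# Join the per-field outputs on the shared ID so it appears only once.
# reduce() returns the single element unchanged when the list has length 1.
combine_fields <- function(data, fields, pattern) {
  per_field <- map(fields, function(f) detect_one_field(data, f, pattern))
  reduce(per_field, function(x, y) left_join(x, y, by = "C_BioSense_ID"))
}

# More than one field
combined <- combine_fields(
  visits, c("ChiefComplaintUpdates", "DischargeDiagnosis"), "cough|J11"
)

# Exactly one field: same code path, no special casing needed
single <- combine_fields(visits, "ChiefComplaintUpdates", "cough|J11")

# TruePositive is added once, outside the per-field function
# (NA placeholder is an assumption about how reviewers fill it in)
combined <- combined %>% mutate(TruePositive = NA_character_)
```

Because the join is keyed on C_BioSense_ID, there is no need for a follow-up step to drop or rename repeated ID columns, which is the step that was failing in the single-field case.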