jsolon-ncp / sampa-radio


CRF 13: Continuous measurements > 100 for pancreatic ultrasound are probably missing (-1##) thus miscoded #5

Closed jsolon-ncp closed 1 month ago

jsolon-ncp commented 1 year ago

Inspecting the continuous pancreatic ultrasound (USS) measurements shows outliers at both ends. The negative values can be dealt with in the analysis .do files. The values > 100, however, probably have to be dealt with in the next cleaning of the master data: each is probably a -1## response entered without the minus sign.

stripplot pan_head_trans pan_head_ap pan_body_trans pan_tail_trans

shows the negative values, which we can fix in the analysis .do file. There are also values > 100 for the transverse readings of body and tail, which should probably have been -1##.

image

list sampa_id cohort pan_body_trans pan_tail_trans pan_head_* if pan_body_trans > 100 & pan_body_trans !=.

gives you this (which can be corrected in the analysis .do file in the interim, but maybe at source as well):

image

jsolon-ncp commented 1 year ago

To find all rows where a USS pancreatic measurement is > 100 and not missing (i.e. a missing code that was entered as positive):

* list every row where any pancreatic measurement is > 100 and not missing
foreach var of varlist pan_head_trans pan_head_ap pan_body_trans pan_tail_trans {
    list sampa_id cohort pan_head_trans pan_head_ap pan_body_trans pan_tail_trans if `var' > 100 & !missing(`var')
}

image

jsolon-ncp commented 1 year ago

To fix at source:

* flip the sign of values > 100 so they become -1## missing codes
foreach var of varlist pan_head_trans pan_head_ap pan_body_trans pan_tail_trans {
    replace `var' = -`var' if `var' > 100 & !missing(`var')
}
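
Because this fix overwrites the master data, it may be worth recording which rows change and checking the result. The following is only a sketch. It assumes the four variables above exist and that every value > 100 really is a -1## code entered without the minus sign, which has not been confirmed. The flag variable `pan_recode_flag` is a hypothetical name, not something in the dataset.

```stata
* Sketch only: assumes values > 100 are -1## codes entered without the minus sign
local panvars pan_head_trans pan_head_ap pan_body_trans pan_tail_trans

* hypothetical flag marking rows that get changed, so the edit can be audited
generate byte pan_recode_flag = 0

foreach var of local panvars {
    * report how many values will change before touching the data
    count if `var' > 100 & !missing(`var')
    replace pan_recode_flag = 1 if `var' > 100 & !missing(`var')
    replace `var' = -`var' if `var' > 100 & !missing(`var')
    * after the fix, no non-missing value should exceed 100
    assert `var' <= 100 | missing(`var')
}

* review the changed rows
list sampa_id cohort `panvars' if pan_recode_flag == 1
```

Using `!missing()` instead of `!=.` also excludes extended missing values (`.a` to `.z`), which Stata treats as larger than any number.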