calculate_sd does not use information on missing values when guessing the variable type.
The checks need to be changed so that the presence of any missing value results in
the assumption that the variable is categorical. Furthermore,
the macro should probably trigger an error if any kind of missing data is
found, since missing data in numeric variables is probably not intended.
Maybe add a toggle parameter that allows missing values
for specified categorical variables.
data dat00;
  call streaminit(1);
  do i = 1 to 10000;
    exposure = rand("bernoulli", 0.3);
    gender = rand("bernoulli", 0.5);
    if rand("bernoulli", 0.05) = 1 then gender = .;
    output;
  end;
  drop i;
run;
%calculate_sd( in_ds = dat00, out_ds = out00, group_var = exposure, var = gender );
proc print data = out00; run;
/*
var       sd
gender    0.020615
*/
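One way the proposed check for missing values could look is sketched below. This is not part of calculate_sd: the macro name check_missing and its parameters ds and var are hypothetical, chosen only for illustration. It counts the missing values of one variable with PROC SQL and writes an ERROR message to the log if any are found; it does not stop execution.

```sas
/* Hypothetical helper, not part of calculate_sd: */
/* report missing values in a single variable.    */
%macro check_missing(ds=, var=);
  %local n_miss;
  proc sql noprint;
    /* NMISS counts the rows where the variable is missing */
    select nmiss(&var) into :n_miss trimmed
    from &ds;
  quit;
  %if &n_miss > 0 %then %do;
    %put ERROR: Variable &var in &ds has &n_miss missing value(s).;
  %end;
%mend check_missing;

%check_missing(ds = dat00, var = gender)
```

A toggle parameter as suggested above could wrap the %if block so that the error is skipped for categorical variables where missing values are explicitly allowed.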