maelstrom-research / Rmonize


Summary report updates: Harmonized dataset summary report > Numerical variable summary #87

Open twey2 opened 1 month ago

twey2 commented 1 month ago

Harmonized dataset summary report > Numerical variable summary

1) Column titles and order should be:
   - Index
   - Grouping variable: adm_study_id
   - Variable name
   - Variable label
   - Mlstr_harmo::status
   - Quality assessment comment
   - Variable valueType
   - Categorical variable
   - Categories in data dictionary
   - Number of rows
   - Number of valid values
   - Number of non-valid values
   - Number of empty values
   - % Valid values
   - % Non-valid values
   - % Empty values
   - Number of distinct values
   - Minimum
   - 1st quartile
   - Median
   - 3rd quartile
   - Maximum
   - Mean
   - Standard deviation

2) Keep only one column for valueType (delete Dataset valueType and Suggested valueType) and change name to “Variable valueType”.

3) Delete columns “% Valid categorical values (if applicable)” and “% Non-valid categorical values (if applicable)”.

4) Change all decimal values to show 2 decimal places.

5) For columns “Number of rows” to “% Empty values”: Validate carefully how they are calculated and that they match the updated column titles.

See the attached mock-up file for reference: summary_report_harmo_validated.xlsx
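To make points 4) and 5) concrete, here is a minimal sketch of how the count and percentage columns could relate to each other. It is written in Python purely for illustration and is not Rmonize's actual R implementation. The function name `summarize_numeric`, the `non_valid_codes` parameter, and the definitions used (empty = missing value; non-valid = a non-empty value matching a code flagged as missing in the data dictionary; valid = everything else) are assumptions, not something confirmed in this issue.

```python
import math


def _is_empty(value):
    """Treat None and NaN as empty values (assumed definition)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def summarize_numeric(values, non_valid_codes=()):
    """Hypothetical helper: count and percentage columns for one numerical variable."""
    codes = set(non_valid_codes)
    n_rows = len(values)
    non_empty = [v for v in values if not _is_empty(v)]
    non_valid = [v for v in non_empty if v in codes]
    valid = [v for v in non_empty if v not in codes]
    n_empty = n_rows - len(non_empty)

    # The three counts must partition the rows exactly.
    assert len(valid) + len(non_valid) + n_empty == n_rows

    def pct(count):
        # Percentage of all rows, shown with 2 decimal places (point 4).
        return round(100 * count / n_rows, 2) if n_rows else 0.0

    return {
        "Number of rows": n_rows,
        "Number of valid values": len(valid),
        "Number of non-valid values": len(non_valid),
        "Number of empty values": n_empty,
        "% Valid values": pct(len(valid)),
        "% Non-valid values": pct(len(non_valid)),
        "% Empty values": pct(n_empty),
        # Assumption: distinct values and mean are computed on valid values only.
        "Number of distinct values": len(set(valid)),
        "Mean": round(sum(valid) / len(valid), 2) if valid else None,
    }


example = summarize_numeric([1.5, 2.5, -99, None, 4.0, -99, None, 3.0],
                            non_valid_codes=[-99])
```

Under these assumptions the three counts always add up to “Number of rows”, which is one check to run against the mock-up. Because each percentage is rounded to 2 decimals independently, the three percentage columns may sum to 99.99 or 100.01 rather than exactly 100.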

GuiFabre commented 2 weeks ago

@twey2

Adding the harmo status is not straightforward; I will find a solution. The rest is OK to be tested.

Update: a solution was found, to be tested :)