brittanyblouin / ANCRTAdjust

An R package to adjust routine HIV testing data from antenatal care to reduce bias in estimating HIV prevalence trends
MIT License

Checking data structure #15

Open seabbs opened 4 years ago

seabbs commented 4 years ago

As far as I can see, this package relies on a very specific data structure. Given this, I think it is very important that you check the input data in some way before trying to use it.

This could take the form of checks within your already implemented functions or a separate function to check the data integrity.

For the openjournals/joss-reviews/issues/1740 review.

seabbs commented 4 years ago

After rereading the README, I see that some of this functionality is already in place in the name_var function.

I think you should add these checks (by calling this function) to all other functions, or at least to data_clean.

A name change might help as well, perhaps something like check_data_structure.

I think you also need to check that the variables supplied have the required structure (on top of just the names). By this I mean: are they the correct type (numeric etc.), and do they have the right levels (for factors)?
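As a rough sketch of the kind of check I mean: the function below tests names first, then types, then factor levels. The column names (`n_clients`, `n_tested`, `status`) and the expected levels are hypothetical placeholders, not the package's actual variables, so treat this as an illustration rather than a proposed implementation.

```r
# Hypothetical sketch: column names, types, and levels are illustrative
# assumptions, not the actual ANCRTAdjust variables.
check_data_structure <- function(data) {
  if (!is.data.frame(data)) {
    stop("`data` must be a data.frame.", call. = FALSE)
  }

  # 1. Names: every required variable must be present.
  required <- c("n_clients", "n_tested", "status")
  missing_vars <- setdiff(required, names(data))
  if (length(missing_vars) > 0) {
    stop("Missing required variable(s): ",
         paste(missing_vars, collapse = ", "), call. = FALSE)
  }

  # 2. Types: count variables must be numeric.
  for (v in c("n_clients", "n_tested")) {
    if (!is.numeric(data[[v]])) {
      stop("Variable `", v, "` must be numeric, not ",
           class(data[[v]])[1], ".", call. = FALSE)
    }
  }

  # 3. Levels: the factor must have exactly the expected levels.
  expected_levels <- c("negative", "positive")
  if (!is.factor(data$status) ||
      !setequal(levels(data$status), expected_levels)) {
    stop("Variable `status` must be a factor with levels: ",
         paste(expected_levels, collapse = ", "), call. = FALSE)
  }

  # Return the input invisibly so the check can sit at the top of a pipeline.
  invisible(data)
}
```

Each failure names the offending variable, which keeps the error message actionable for someone who has not read the documentation.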

For the openjournals/joss-reviews/issues/1740 review.

m-maheu-giroux commented 4 years ago

Good idea. We have implemented additional checks to ensure that the data are correctly formatted and, to reflect this change, have renamed the function name_var() to check_data().

seabbs commented 4 years ago

Great - thanks.

What are your thoughts about adding this to data_clean? In my experience, users often don't read the docs, and it can be nice to give them very clear error messages when things go wrong.
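Something along these lines is what I have in mind. Both functions below are hypothetical stand-ins (I have not looked at how check_data() or data_clean() are actually written), and the required column names and the cleaning step are assumptions made only to keep the example runnable.

```r
# Hypothetical stand-ins; not the package's real check_data() or data_clean().
check_data_sketch <- function(data, required = c("n_clients", "n_tested")) {
  missing_vars <- setdiff(required, names(data))
  if (length(missing_vars) > 0) {
    stop("Input data is missing required variable(s): ",
         paste(missing_vars, collapse = ", "),
         ". See the README for the expected format.", call. = FALSE)
  }
  invisible(TRUE)
}

data_clean_sketch <- function(data) {
  # Validate first, so users get a clear error before any cleaning runs.
  check_data_sketch(data)

  # Placeholder cleaning step (assumption): drop rows with missing values.
  data[stats::complete.cases(data), , drop = FALSE]
}
```

With this arrangement a user who skips the check entirely still sees the informative message, because the cleaning function fails early instead of partway through.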