WA-Department-of-Agriculture / soil-health-report-generator

https://tshapiro.shinyapps.io/soil-health/
MIT License

Develop Data Validation Rules for Excel Upload #4

Open tashapiro opened 1 month ago

tashapiro commented 1 month ago

Before generating reports, the application needs to validate the data uploaded from the filled-out template. The rules should check the following:

1) That data dictionary and data tabs exist in the file

2) Column check for data dictionary, must include:

3) Data Dictionary must include AT MINIMUM one record

4) Check that the required columns exist in the Data tab and have the correct data types.

5) Is there at least one column in the Data tab beyond the required fields? (Needs at least one variable)

6) Ensure that additional columns present in Data tab are present in Data Dictionary. Reverse check to make sure all named keys in Data Dictionary appear as columns in Data tab.

7) Check for missing values in both tabs for all records
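
A rough sketch of how rules 1, 3, 4 (presence only), 5 and 6 could fit together. It is written in Python purely for illustration and is not the app's code. The workbook is modeled as a dict of tab name to list of row dicts, and the tab names, `KEY_COLUMN`, and `REQUIRED_DATA_COLUMNS` are all hypothetical placeholders, since the exact names are not spelled out in this thread.

```python
from typing import Dict, List

Row = Dict[str, object]
Workbook = Dict[str, List[Row]]

# Hypothetical names: the thread does not give exact tab or column names.
DICTIONARY_TAB = "Data Dictionary"
DATA_TAB = "Data"
KEY_COLUMN = "column_name"  # dictionary column that names each variable
REQUIRED_DATA_COLUMNS = {"sample_id", "field_id"}


def validate_workbook(workbook: Workbook) -> List[str]:
    """Return a list of problems; an empty list means the upload passed."""
    # Rule 1: both tabs must exist. Nothing else can be checked without them.
    missing_tabs = [t for t in (DICTIONARY_TAB, DATA_TAB) if t not in workbook]
    if missing_tabs:
        return [f"Missing tab: {t}" for t in missing_tabs]

    problems: List[str] = []
    dictionary, data = workbook[DICTIONARY_TAB], workbook[DATA_TAB]

    # Rule 3: the dictionary needs at least one record.
    if not dictionary:
        problems.append("Data Dictionary has no records")

    # Rule 4 (presence only; data type checks are left out of this sketch).
    data_columns = set().union(*(row.keys() for row in data))
    for col in sorted(REQUIRED_DATA_COLUMNS - data_columns):
        problems.append(f"Data tab is missing required column: {col}")

    # Rule 5: at least one variable column beyond the required fields.
    variables = data_columns - REQUIRED_DATA_COLUMNS
    if not variables:
        problems.append("Data tab has no variable columns")

    # Rule 6: variable columns and dictionary keys must match both ways.
    keys = {str(r[KEY_COLUMN]) for r in dictionary if r.get(KEY_COLUMN) is not None}
    for col in sorted(variables - keys):
        problems.append(f"Column not in Data Dictionary: {col}")
    for key in sorted(keys - variables):
        problems.append(f"Dictionary key not in Data tab: {key}")

    return problems
```

Returning every problem at once, rather than stopping at the first, would let the user fix the template in a single pass.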

tashapiro commented 1 month ago

@jadeynryan, thoughts/questions:

2nd Validation Rule. I think we discussed removing order as a required field. To simplify it further, we could also omit abbr_unit and generate it in the back end by combining abbr and unit with an HTML line break. What do you think?
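
To make the abbr_unit idea concrete, here is a minimal sketch in Python (illustrative only, not the app's code). The function name `make_abbr_unit`, the `<br>` form of the break, and the fallback to the bare abbreviation when the unit is empty are all assumptions.

```python
from typing import Optional


def make_abbr_unit(abbr: str, unit: Optional[str] = None) -> str:
    """Join an abbreviation and its unit with an HTML line break."""
    abbr = abbr.strip()
    unit = (unit or "").strip()
    # Assumption: a unitless measurement shows only its abbreviation.
    return f"{abbr}<br>{unit}" if unit else abbr
```

For example, `make_abbr_unit("SOM", "%")` gives `SOM<br>%`, so the template would only need the abbr and unit columns.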

7th Validation Rule. Will need some input from you here: which columns are a hard stop vs. a warning? E.g., if they are missing lat/long for one record, should that bar them from generating the report?

jadeynryan commented 3 weeks ago

2nd rule: I think I can simplify by removing measurement_group_label, order, and abbr_unit. I'll need to refactor some of the .qmd code and {soils} functions.

4th rule: field_id doesn't have to be numeric.

7th rule: I did some refactoring today so that the map will just not appear, rather than have report generation halt in error, if the lat/long are missing.
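
One way the hard stop vs. warning split might look on the validation side, sketched in Python for illustration only. The column lists are hypothetical stand-ins: `latitude` and `longitude` are assumed names for the lat/long fields, and `HARD_STOP_COLUMNS` is a placeholder rather than the agreed list.

```python
from typing import Dict, List, Tuple

# Hypothetical column lists; the real ones would come from the template.
HARD_STOP_COLUMNS = ("sample_id", "field_id")
WARNING_COLUMNS = ("latitude", "longitude")


def is_blank(value: object) -> bool:
    """Treat None, NaN, and empty or whitespace-only strings as missing."""
    if value is None:
        return True
    if isinstance(value, float) and value != value:  # NaN is not equal to itself
        return True
    return isinstance(value, str) and not value.strip()


def check_missing(rows: List[Dict[str, object]]) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings); only errors should block the report."""
    errors: List[str] = []
    warnings: List[str] = []
    for i, row in enumerate(rows, start=1):
        for col in HARD_STOP_COLUMNS:
            if is_blank(row.get(col)):
                errors.append(f"Row {i}: {col} is blank")
        for col in WARNING_COLUMNS:
            if is_blank(row.get(col)):
                warnings.append(f"Row {i}: {col} is blank; the map will be omitted")
    return errors, warnings
```

The app could then show warnings next to the upload but disable report generation only when `errors` is non-empty.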

The columns that must not be blank are:

Another rule to consider: sample_id must be unique.
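
The uniqueness rule is cheap to check. A minimal Python sketch (illustrative only; the helper name `duplicate_sample_ids` is made up) that compares IDs as trimmed strings, which is an assumption so that `101` and `"101 "` count as the same ID:

```python
from collections import Counter
from typing import Dict, List


def duplicate_sample_ids(rows: List[Dict[str, object]]) -> List[str]:
    """Return the sample_id values that appear on more than one row."""
    counts = Counter(
        str(row["sample_id"]).strip()
        for row in rows
        if row.get("sample_id") is not None  # blanks belong to the missing-value rule
    )
    return sorted(sid for sid, n in counts.items() if n > 1)
```

Reporting the offending IDs, rather than a plain pass/fail, would make the duplicates easier to find in the spreadsheet.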

jadeynryan commented 2 weeks ago

In this commit to {soils}, I simplified the data dictionary and refactored the report code so that the reports still render correctly.