zargot closed 1 year ago
I should probably specify how valid tables are reported as valid in the summary report, unless that's something only the validation tool needs? (#149)
The module-functions.qmd file looks good, but I'm looking for more from the specifications. Looking for something similar to the specifications for the rules:
- Text explaining why the function is needed
- Explaining how it will work with examples that can be used as test cases.
I made an attempt at addressing this now.
Also, this was not in the task but I think we discussed allowing the user to summarize by tables, rows, and/or columns.
I'm not sure about how we would implement this and how useful it would be. We are currently summarizing by tables, which is fine and logical. Summarizing by rows sounds like it would undo the summarization, since aggregating rows is its main feature. I'm not sure about columns. I suppose we could prefix them with table names, so it becomes like 'sites.siteID:
The way I see it, the function would summarize by table by default, but the user would also have the ability to further summarize by rows or columns (but not both). For example, if the addresses table has the following 4 errors,
By default the summary would be,
# Addresses
5 errors
greater_than_max_length: 2
missing_value_found: 1
duplicate_entries_found: 2
If the user wants to summarize also by rows:
# Addresses
## Row 1
3 errors
greater_than_max_length: 1
duplicate_entries_found: 1
missing_value_found: 1
## Row 2
2 errors
duplicate_entries_found: 1
greater_than_max_length: 1
and if the user wants to also summarize by column:
# Addresses
## addressID
3 errors
greater_than_max_length: 2
duplicate_entries_found: 1
## addL1
1 error
missing_value_found: 1
I think this is useful for users who want it, especially if a table is very big and they want to drill down into whether the errors are localized in particular columns or rows.
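To make the proposal above concrete, here is a minimal sketch of how the stratified summary could be computed. It is only an illustration: the error record layout (the `table`, `row`, `column` and `type` keys), the sample records, and the `summarize` and `render` names are all assumptions, not the actual validation report format or spec.

```python
from collections import Counter, defaultdict

# Hypothetical error records; the key names are assumptions for illustration.
errors = [
    {"table": "addresses", "row": 1, "column": "addressID", "type": "greater_than_max_length"},
    {"table": "addresses", "row": 2, "column": "addressID", "type": "greater_than_max_length"},
    {"table": "addresses", "row": 1, "column": "addL1", "type": "missing_value_found"},
    {"table": "addresses", "row": 1, "column": "addressID", "type": "duplicate_entries_found"},
]


def summarize(errors, by=None):
    """Count error types per table, optionally stratified by 'row' or 'column'."""
    if by not in (None, "row", "column"):
        raise ValueError("by must be None, 'row' or 'column'")
    # table -> group label (None when not stratified) -> error type -> count
    summary = defaultdict(lambda: defaultdict(Counter))
    for e in errors:
        if by == "row":
            group = f"Row {e['row']}"
        elif by == "column":
            group = e["column"]
        else:
            group = None
        summary[e["table"]][group][e["type"]] += 1
    # Convert to plain dicts so the result is easy to compare and serialize.
    return {t: {g: dict(c) for g, c in groups.items()} for t, groups in summary.items()}


def render(summary):
    """Format a summary in the plain-text layout used in the examples above."""
    lines = []
    for table, groups in summary.items():
        lines.append(f"# {table}")
        for group, counts in groups.items():
            if group is not None:
                lines.append(f"## {group}")
            total = sum(counts.values())
            lines.append(f"{total} error" + ("" if total == 1 else "s"))
            lines.extend(f"{name}: {n}" for name, n in counts.items())
    return "\n".join(lines)


print(render(summarize(errors, by="column")))
```

Restricting `by` to a single value mirrors the "rows or columns, but not both" constraint; calling `summarize(errors)` with no `by` gives the default per-table summary.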
As for implementation, I'm happy to talk about it if you can't visualize it; I have more experience with these kinds of stratifications from other projects.
Also, I took a brief look at your latest commit. Can you also put in examples like those in the validation rules, so I can see what the output looks like and know what you will be testing? Specifically,
@yulric. Sounds good, I've made new changes.
@zargot I pushed a commit to the high-level spec to make things clearer. Feel free to rebase things to make it all clean.
fixes #74
adds spec: