tonyfischetti / assertr

Assertive programming for R analysis pipelines
https://docs.ropensci.org/assertr

Set assertions in/write assertions to human readable file? #130

Open pschloss opened 9 months ago

pschloss commented 9 months ago

I am wondering whether you all had previously considered reading in assert/verify statements from a file.

In my explorations of TDD with data analysis, I came across the nifty tdda Python package. It appears to have a fair amount of overlap with what is doable in assertr. My sense from their white papers is that tdda can algorithmically create constraints by summarizing the columns in a data frame and writing those to a file or some other data structure. Those constraints can then be modified to fine-tune them and re-used in subsequent assertions when new data comes in.
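To make the discover-then-verify idea concrete, here is a rough sketch in plain Python. It is not tdda's actual API or file format: the function names `discover_constraints` and `verify`, the constraint keys, and the sample data are all made up for illustration, and only the standard library is used.

```python
import json


def discover_constraints(rows):
    """Summarize each column of a list-of-dicts table into simple constraints."""
    constraints = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        # Only record min/max for columns that are entirely numeric.
        numeric = bool(present) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
        )
        constraints[col] = {
            "type": type(present[0]).__name__ if present else None,
            "min": min(present) if numeric else None,
            "max": max(present) if numeric else None,
            "allow_null": len(present) < len(values),
        }
    return constraints


def verify(rows, constraints):
    """Return (row_index, column, problem) for every constraint violation."""
    failures = []
    for i, row in enumerate(rows):
        for col, c in constraints.items():
            value = row.get(col)
            if value is None:
                if not c["allow_null"]:
                    failures.append((i, col, "unexpected null"))
                continue
            if c["type"] is not None and type(value).__name__ != c["type"]:
                failures.append((i, col, "wrong type"))
                continue
            if c["min"] is not None and value < c["min"]:
                failures.append((i, col, "below min"))
            if c["max"] is not None and value > c["max"]:
                failures.append((i, col, "above max"))
    return failures


# Hypothetical "known good" data used to generate the constraints.
known_good = [{"sample": "a", "ph": 6.8}, {"sample": "b", "ph": 7.4}]
constraints = discover_constraints(known_good)

# Round-trip through JSON text, standing in for a constraints file on disk.
constraints_text = json.dumps(constraints, indent=2)
reloaded = json.loads(constraints_text)

# Hypothetical new data checked against the stored constraints.
new_data = [{"sample": "c", "ph": 7.0}, {"sample": "d", "ph": 9.1}]
failures = verify(new_data, reloaded)
print(failures)
```

Because the constraints live in ordinary JSON text, someone could hand-edit them (for example, widening the `max` for `ph`) before they are re-used on the next batch of data.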

I think it would be pretty powerful to have a YAML or JSON file that specifies the assertions for a data file, which assertr could load and apply to a data frame within a pipeline. A benefit of a file-based approach is that the file could also serve as a type of data dictionary that would be more readable than assertr code.
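As a sketch of what such a file might look like and how it could be applied, here is a small stdlib-only Python example. The JSON layout is purely hypothetical, not anything assertr or tdda defines. The rule names `within_bounds`, `in_set`, and `not_na` are borrowed from assertr's predicates, but the checks here are simplified Python stand-ins, and the column names and data are invented.

```python
import json

# Hypothetical spec: each column carries a human-readable description
# (the "data dictionary" part) alongside the rules to enforce.
SPEC_TEXT = """
{
  "ph": {
    "description": "Soil pH at time of sampling",
    "within_bounds": [0, 14],
    "not_na": true
  },
  "site": {
    "description": "Sampling site code",
    "in_set": ["A", "B", "C"]
  }
}
"""

# Map each rule name to a check taking (value, rule_argument).
# Missing values only fail the explicit not_na rule.
CHECKS = {
    "within_bounds": lambda v, arg: v is None or arg[0] <= v <= arg[1],
    "in_set": lambda v, arg: v is None or v in arg,
    "not_na": lambda v, arg: (v is not None) if arg else True,
}


def apply_spec(rows, spec):
    """Return (row_index, column, rule_name) for every failed rule."""
    errors = []
    for i, row in enumerate(rows):
        for col, rules in spec.items():
            for name, arg in rules.items():
                if name == "description":
                    continue  # documentation only, nothing to check
                if not CHECKS[name](row.get(col), arg):
                    errors.append((i, col, name))
    return errors


spec = json.loads(SPEC_TEXT)
rows = [
    {"ph": 6.5, "site": "A"},
    {"ph": 15.2, "site": "D"},
    {"ph": None, "site": "B"},
]
errors = apply_spec(rows, spec)
print(errors)
```

Keeping the `description` next to the rules is what lets one file act as both the assertion list and the data dictionary.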

I suspect this might be a fair amount of effort to implement and was curious if it was something that's already on your roadmap or if you would be interested in contributions along these lines.

tonyfischetti commented 8 months ago

That's so funny you mention that. About a week back I realized I would have to implement something like this for the work I'm doing now. I'm not sure I want to have assertions written purely in JSON/YAML or any other markup language. But maybe an assertr chain can be expressed in a separate file and then included/sourced by the analysis script.

I'll have to do some thinking about the best way to implement it (I want assertr to do the right thing). I'd love to hear more about your experience with tdda and anything else you were thinking regarding this idea.