ZaxR / bulwark

Bulwark is a package for convenient property-based testing of pandas dataframes.
GNU Lesser General Public License v3.0

Are you open to me adding "advice for testing your dataframes"? #46

Open ianozsvald opened 4 years ago

ianozsvald commented 4 years ago

Hello again. Prompted by a recent exchange on twitter: https://twitter.com/ianozsvald/status/1157181430595837952 I'm revisiting my approach to testing in R&D Jupyter Notebooks.

I plan to rework many of my assert statements to use bulwark. Would you be open to me writing several paragraphs (possibly 1 page) of advice for the README to help others spot how & where they might want to test their code?

I'm thinking of both data loading (specifying some expectations about the resulting dataframe) and data manipulation (e.g. checking for a unique index after a join...which prompted the above tweet, also to check for the introduction of NaNs etc).
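To make the two cases concrete, here is a minimal sketch in plain pandas, with no Bulwark calls. The frames and column names (`orders`, `customers`, `region`) are made up for illustration and do not come from this thread. It states a couple of expectations after "loading", then checks for a unique index and for newly introduced NaNs after a join.

```python
import pandas as pd

# Hypothetical example data; these names are illustrative only.
orders = pd.DataFrame(
    {"order_id": [1, 2, 3], "customer_id": [10, 11, 10]}
).set_index("order_id")
customers = pd.DataFrame({"customer_id": [10, 11], "region": ["north", "south"]})

# Data loading: state expectations about the frame you just read.
assert list(orders.columns) == ["customer_id"]
assert orders.index.is_unique

# Data manipulation: a left join that should neither add rows nor introduce NaNs.
# validate="many_to_one" makes pandas raise if `customers` has duplicate keys.
joined = (
    orders.reset_index()
    .merge(customers, on="customer_id", how="left", validate="many_to_one")
    .set_index("order_id")
)

assert joined.index.is_unique            # no duplicated rows after the join
assert not joined.isna().any().any()     # the join introduced no NaNs
assert len(joined) == len(orders)        # row count is unchanged
```

Each bare `assert` here is the kind of statement that a named check could replace, which tends to give a clearer failure message.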

I talk this advice through in one of my classes and I could turn it into a blog post, but adding it here will be more useful to users and ought to help gain visibility for bulwark.

I'd welcome any thoughts you had on it of course, I'm just checking that you're open to the idea before I start putting some notes together.

ZaxR commented 4 years ago

That sounds great! The readthedocs currently has an Examples page that needs to be written (see #33), and I'm also open to a "Best Practices" page as well. Would really appreciate a PR for either/both of those.

ianozsvald commented 4 years ago

This I can tackle, probably as a single markdown page first for critique and then we could migrate it to a more suitable destination once it matures. More on this later.

ZaxR commented 4 years ago

Sounds good. The sphinx docs for Bulwark are now actually set up to take markdown files, so if you want, you can work on this file in the docs folder on Bulwark and we can collaborate via a PR.

ianozsvald commented 4 years ago

Hey there. Here's a first draft: https://github.com/ianozsvald/bulwark/blob/master/docs/advice.md

Thoughts?

ZaxR commented 4 years ago

Absolutely love it so far - I appreciate you selecting probably some of the most common use cases for each of the checks.

I view a lot of the value of bulwark as being in the decorator form, since often transformations are more complex than one-liners, so I'd like to have some detailed examples for those as well. I'd also like to showcase the use of multiple checks (the join example would also be a case where you might use has_no_nans for example).
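As a rough illustration of the decorator idea with multiple checks, the sketch below uses a hand-rolled decorator rather than Bulwark's own. `check_result`, `no_nans`, `unique_index` and `add_region` are hypothetical names written for this example and should not be read as Bulwark's API. The decorator runs every check on the frame the function returns.

```python
import functools

import pandas as pd


def check_result(*checks):
    """Hypothetical stand-in for a check decorator (not Bulwark's API).

    Runs each check on the DataFrame returned by the wrapped function.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            for check in checks:
                check(result)
            return result
        return wrapper
    return decorator


def no_nans(df):
    # Fails if any cell in the frame is NaN.
    if df.isna().any().any():
        raise AssertionError("NaNs found in result")


def unique_index(df):
    # Fails if any index label is repeated.
    if not df.index.is_unique:
        raise AssertionError("index is not unique")


@check_result(unique_index, no_nans)
def add_region(orders, customers):
    # A multi-step transformation could go here; the checks run on whatever it returns.
    return orders.merge(customers, on="customer_id", how="left")
```

Calling `add_region` with a customer id that is missing from `customers` would raise an `AssertionError` from `no_nans`, because the left join leaves `region` empty for that row.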

Thinking about where this would fit into the docs, I think your first paragraph and the discussion about asserts belongs in some kind of "philosophy" page that appears before the "Design" page. The examples would fit well both within function docstrings and on the "Examples" page or even a new docs page titled like you had it, but a little shorter: "Common Uses".

How do you feel about all that?

mlisovyi commented 4 years ago

Was there any progress on this? It would be super helpful for beginners to have the Examples page populated (or Common Use-Cases, whatever you fancy as the name).

ianozsvald commented 4 years ago

@mlisovyi hey there. So, I stalled on this. My way of using bulwark is different from Zax's; I meant to try decorating functions, but I ended up doing nothing related to data prep/validation for many months. You are very welcome to edit/chop/change my suggestions if you'd like to pick this up, and I'd be happy to try to leave useful feedback if you make progress. I'll note that I'm also unlikely to make any useful contributions beyond the odd comment: my wife & I are due to give birth in 2 months, life is changing, lockdown is weird and I've got clients + community conference talks ahead in the next couple of months, so please forgive me if I'm quiet.

mlisovyi commented 4 years ago

OK, makes sense. I can give it a try over the next weekend. @ZaxR would you mind specifying what you would see as a minimal set of changes:

P.S. @ianozsvald congrats on the upcoming family extension :)