0:37 Speaker Introduction
4:08 Example scenario and typical pandas-based processing techniques
6:38 Chaining
7:58 Creating a function to simplify your chain
8:58 Debugging your chain with .pipe and custom functions like size() and store()
11:27 Debugging with Jupyter: set_trace() and the IPython debugger
13:30 Debugging errors after the fact with %debug
14:33 Using ‘??’ to get source code for a method or function
14:58 Testing your code with pytest and sample input data
19:01 ipytest supports native use of pytest within Jupyter notebooks
19:47 Using pandera and hypothesis libraries to test assertions about data and code
22:15 Using great_expectations library to make assertions about data and schema
26:20 Thanks and resources
26:26 Q&A — would you make an expectation for each function?
27:29 Q&A — how do you choose between pandera and great_expectations?
27:48 Q&A — can we access the notebook?
28:00 Q&A — can hypothesis save schemas so that the input data is not necessary to run tests?
28:24 Q&A — what is the appropriate amount of testing?
28:47 Q&A — how good is pandera at generating data for hypothesis?
29:19 Q&A — do you have recommendations for testing Jupyter notebooks?
Video URL: https://www.youtube.com/watch?v=Kj1WwpPFr-I