cfpb / HMDA_Data_Science_Kit

Creative Commons Zero v1.0 Universal

Create new time series example that supports filtering #42

Open Kibrael opened 6 years ago

Kibrael commented 6 years ago

In addition to the example notebook showing filtering of TS data, create a notebook showing filtering of LAR data.

Allow filtering by:

The filter code should not restrict the product type; that will be handled in a separate example.

The code should teach the user how to save a pipe-delimited file of a LAR subset to their computer, and should explain how to use a different delimiter. This functionality should be explained and coded so that a user understands how to generate multiple flat files from a single code block.
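As a rough sketch of what "multiple flat files from a single code block" could look like, the snippet below loops over a list of values and writes one delimited file per value using only the standard library. The sample rows, the column names (`activity_year`, `lei`, `state_code`, `loan_amount`), the `write_subsets` helper, and the output file naming are all assumptions made for illustration; they are not taken from the repository or from the actual LAR schema.

```python
import csv
from pathlib import Path

# Made-up stand-in for LAR rows; column names are illustrative assumptions.
LAR_ROWS = [
    {"activity_year": "2018", "lei": "LEI_A", "state_code": "CA", "loan_amount": "250000"},
    {"activity_year": "2018", "lei": "LEI_B", "state_code": "TX", "loan_amount": "180000"},
    {"activity_year": "2018", "lei": "LEI_A", "state_code": "TX", "loan_amount": "320000"},
]


def write_subsets(rows, column, values, out_dir, delimiter="|"):
    """Write one delimited file per value of `column`; return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    fieldnames = list(rows[0].keys())
    paths = []
    for value in values:
        # Keep only the rows whose `column` equals the current value.
        subset = [row for row in rows if row[column] == value]
        path = out_dir / f"lar_{column}_{value}.txt"
        # newline="" lets the csv module control line endings itself.
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter=delimiter)
            writer.writeheader()
            writer.writerows(subset)
        paths.append(path)
    return paths
```

Calling `write_subsets(LAR_ROWS, "state_code", ["CA", "TX"], "lar_subsets")` would write two pipe-delimited files, one per state. Passing `delimiter=","` or `delimiter="\t"` switches to comma- or tab-separated output without changing anything else.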

Many HMDA data users are interested in subsets of the data that are small enough to be loaded into Excel. The filtering logic of this code should allow for multiple filters of multiple types to be passed and the resulting datasets returned.
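One possible shape for "multiple filters of multiple types" is a dictionary of named filter sets, where each filter maps a column to either a single value or a list of allowed values. The sketch below uses pandas; the tiny DataFrame, its column names, and the `apply_filters` helper are hypothetical and only meant to show the idea.

```python
import pandas as pd

# Made-up stand-in for LAR data; column names are illustrative assumptions.
lar = pd.DataFrame(
    {
        "activity_year": [2017, 2018, 2018, 2018],
        "lei": ["LEI_A", "LEI_A", "LEI_B", "LEI_C"],
        "state_code": ["CA", "CA", "TX", "CA"],
        "loan_amount": [200000, 250000, 180000, 320000],
    }
)


def apply_filters(df, filters):
    """Keep rows that satisfy every filter in `filters`.

    Each key is a column name; each value is either a single allowed
    value or a list/tuple/set of allowed values.
    """
    mask = pd.Series(True, index=df.index)
    for column, allowed in filters.items():
        if isinstance(allowed, (list, tuple, set)):
            mask &= df[column].isin(allowed)
        else:
            mask &= df[column] == allowed
    return df[mask]


# Several named filter sets, each producing its own subset.
filter_sets = {
    "ca_2018": {"state_code": "CA", "activity_year": 2018},
    "two_lenders": {"lei": ["LEI_A", "LEI_B"]},
}
subsets = {name: apply_filters(lar, f) for name, f in filter_sets.items()}
```

Each entry in `subsets` is an ordinary DataFrame, so it can be written out with `DataFrame.to_csv(path, sep="|", index=False)` to get one flat file per filter set.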

Additionally, an example of basic Pandas usage for analysis and graphing should be provided. This code should load one of the flat files written to disk in the previous step. The analysis should take two different subsets, such as two MSAs or two institutions, and compare an aggregate measure such as total application volume. Extending the example to cover multiple years, to show trends in the comparison, would be ideal.
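A minimal sketch of the comparison described here might look like the following. It reads pipe-delimited data, keeps two MSAs, and counts applications per year for each. The in-memory `SAMPLE` text stands in for a flat file on disk, and the MSA codes, column names (`activity_year`, `msa_md`, `lei`), and helper names are made up for illustration.

```python
import io

import pandas as pd

# In-memory stand-in for a pipe-delimited flat file; the values are made up.
SAMPLE = io.StringIO(
    "activity_year|msa_md|lei\n"
    "2017|11111|LEI_A\n"
    "2017|11111|LEI_B\n"
    "2017|22222|LEI_A\n"
    "2018|11111|LEI_A\n"
    "2018|22222|LEI_A\n"
    "2018|22222|LEI_B\n"
    "2018|22222|LEI_C\n"
    "2018|33333|LEI_C\n"
)


def application_counts(source, msas):
    """Return a year-by-MSA table of application counts for the given MSAs."""
    # Read MSA codes as strings so leading zeros, if any, are preserved.
    lar = pd.read_csv(source, sep="|", dtype={"msa_md": str})
    subset = lar[lar["msa_md"].isin(msas)]
    # One row per application, so the group size is the application volume.
    return subset.groupby(["activity_year", "msa_md"]).size().unstack(fill_value=0)


def plot_counts(counts):
    """Draw a grouped bar chart of the counts table (requires matplotlib)."""
    ax = counts.plot(kind="bar")
    ax.set_xlabel("Year")
    ax.set_ylabel("Applications")
    return ax


counts = application_counts(SAMPLE, ["11111", "22222"])
```

With a real file, `application_counts("path/to/subset.txt", [...])` would work the same way, since `pd.read_csv` accepts either a path or a file-like object. Calling `plot_counts(counts)` in a notebook should show one bar per MSA for each year, which makes the trend comparison visible at a glance.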