Closed RicardoJSandoval closed 10 months ago
Hi @jenno-verdonck! Thanks for the comments and catching those typos. I'm also not entirely sure which is the best way to proceed with testing, so I'll leave it up to the others to decide and I'm happy to make any changes.
Hi @RicardoJSandoval - I was curious if you'd still like to move ahead with this pull request. I recall that the goal is to add a SIPP data source and this is an intermediate step.
If we're going ahead with this, could you update the version number to '0.13' in setup.py
and folktables/__init__.py
?
On another note, is this update fully backwards compatible with the existing ACS code?
Thanks!
In this pull request I'm introducing two main things: (1) a refactored
ACSDataSource
, (2) tests for theACSDataSource
.Refactored
ACSDataSource
The idea behind refactoring the
ACSDataSource
object is to encapsulate the functionalities and dependencies specific to the ACS datasets into one file & object. The refactoredACSDataSource
now uses the utility functions I had previously created. With this, we now have a pattern to follow in case we want to extendfolktables
to support other datasets.Tests for the
ACSDataSource
I created some tests that check that the
ACSDataSource
(down)loads different datasets as expected. I created some artificial datasets that contain the same columns as the corresponding, original, datasets, but contain less rows and random values. This smaller, artificial, datasets reduce the overhead of running the tests. Furthermore, I've written this tests such that they don't make an HTTP request to the Census Bureaus' website every time they're run, but rather we capture the requests and return the artificial datasets. The only exception to this is the testtest_definitions_download
which I directly copied from a previous merge (this test does make an HTTP request to the actual website).