TheDataRideAlongs / ProjectDomino

Scaling COVID public behavior change and anti-misinformation
Apache License 2.0

pytest for fh transforms #2

Open lmeyerov opened 4 years ago

lmeyerov commented 4 years ago

For some sample tweets, test especially:

Less clear: diff search jobs

007vasy commented 4 years ago

Just to clarify: the project has transformation functions, and you want tests that verify they work as intended?

lmeyerov commented 4 years ago

A few pieces:

  1. Just having pytest set up and hooked into a CI system would already be a step up and provide a foundation for others :)

  2. For each of those, we currently have methods in https://github.com/TheDataRideAlongs/ProjectDomino/blob/master/modules/FirehoseJob.py that handle the above conversions, for Twitter in particular (cc @bechbd)

  3. We're starting to have other notebooks too, such as for extracting URLs and blockchain addresses, that would benefit from this. My guess is we'd find bugs and get the code cleaner and more modular anyway as part of moving from notebook prototypes to Python modules that get plugged into Prefect.io pipelines.
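
As a rough illustration of the pieces above, here is a minimal sketch of what a first pytest file could look like. Everything in it is an assumption for illustration: `extract_urls`, `URL_PATTERN`, and `SAMPLE_TWEETS` are hypothetical names and are not taken from `FirehoseJob.py` or the notebooks, and the sample tweets are made up. The only pytest behavior it relies on is that pytest collects functions named `test_*` and reports failing `assert` statements.

```python
import re

# Hypothetical URL matcher, for illustration only (not the project's regex).
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(tweet):
    """Return the URLs found in a tweet's text, in order of appearance.

    Hypothetical transform: assumes the tweet is a dict whose text lives
    under "full_text" or "text".
    """
    text = tweet.get("full_text") or tweet.get("text") or ""
    return URL_PATTERN.findall(text)


# Made-up sample tweets paired with the output we expect from the transform.
SAMPLE_TWEETS = [
    ({"id": 1, "text": "no links here"}, []),
    ({"id": 2, "text": "see https://example.com/a now"}, ["https://example.com/a"]),
    (
        {"id": 3, "full_text": "http://a.example https://b.example/x?y=1"},
        ["http://a.example", "https://b.example/x?y=1"],
    ),
    ({"id": 4}, []),  # tweet with no text field at all
]


def test_extract_urls_on_sample_tweets():
    # pytest discovers this function by its test_ prefix.
    for tweet, expected in SAMPLE_TWEETS:
        assert extract_urls(tweet) == expected
```

Saved as, say, `test_transforms.py` (a hypothetical file name), this would run with a plain `pytest` invocation, and the same pattern of sample input plus expected output could be repeated for each transform once it lives in an importable module.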

007vasy commented 4 years ago

Do you have any CI system in mind?