counterdata-network / story-processor

Story discovery engine for the Counterdata Network. Grabs relevant stories from various APIs, runs them against bespoke classifier models, and posts results to a central server.
Apache License 2.0

design integration test approach #72

Open rahulbot opened 4 months ago

rahulbot commented 4 months ago

Debugging full pipeline runs is too difficult and error-prone right now. The pathway we need to exercise in a testing environment goes from fetcher -> queue -> classification worker -> above-threshold results. A few thoughts about how we could do this:

A test outline I'm imagining would look something like:

  1. create an empty SQLite database and use that as DATABASE_URL
  2. create an empty queue and use that as BROKER_URL
  3. call the appropriate queue_servicename_stories.py main method (with stubbed models, project file, and search-results method)
  4. verify the expected number of story entries are created in the DB and that they look right
  5. verify the expected number of entries are in the queue
  6. start a queue worker pointing at no-op (or throwaway) models
  7. verify an updated number of story entries in the DB, indicating which stories passed the models
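The outline above can be sketched end-to-end with only the standard library. Everything here is a hypothetical stand-in for the real code: an in-memory SQLite database stands in for DATABASE_URL, a `queue.Queue` stands in for BROKER_URL, and `fetch_stories` / `noop_classifier` are invented placeholders for the fetcher's search-results method and a throwaway model.

```python
import sqlite3
import queue

def fetch_stories():
    # stand-in for a queue_servicename_stories.py search-results method
    return [{"id": 1, "title": "a"}, {"id": 2, "title": "b"}]

def noop_classifier(story):
    # throwaway model: every story scores above threshold
    return 1.0

def run_fetcher(db, broker):
    # stand-in for the queue_servicename_stories.py main method:
    # record each fetched story in the DB and enqueue it for classification
    for story in fetch_stories():
        db.execute("INSERT INTO stories (id, passed) VALUES (?, 0)", (story["id"],))
        broker.put(story)

def run_worker(db, broker, model, threshold=0.5):
    # stand-in for the classification worker: drain the queue and mark
    # stories whose model score clears the threshold
    while not broker.empty():
        story = broker.get()
        if model(story) >= threshold:
            db.execute("UPDATE stories SET passed = 1 WHERE id = ?", (story["id"],))

def test_pipeline():
    db = sqlite3.connect(":memory:")                       # step 1: empty database
    db.execute("CREATE TABLE stories (id INTEGER PRIMARY KEY, passed INTEGER)")
    broker = queue.Queue()                                 # step 2: empty queue
    run_fetcher(db, broker)                                # step 3: fetch and enqueue
    assert db.execute("SELECT COUNT(*) FROM stories").fetchone()[0] == 2   # step 4
    assert broker.qsize() == 2                             # step 5
    run_worker(db, broker, noop_classifier)                # step 6: no-op model
    assert db.execute(                                     # step 7
        "SELECT COUNT(*) FROM stories WHERE passed = 1").fetchone()[0] == 2

test_pipeline()
```

In the real setup the two stand-in runners would be replaced by invoking the actual fetcher script and worker against the temporary DATABASE_URL and BROKER_URL, but the assertions at steps 4, 5, and 7 stay the same.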

This could be used to test the fetching logic in each queue_servicename_stories.py, cover edge cases like empty story lists, and give us confidence that we aren't breaking the integrated flow of the pipeline. How should we approach this?