ooni / data

OONI Data CLI and Pipeline v5
https://docs.ooni.org/data

Consider splitting ooni/data into separate sub-packages in a monorepo structure #57

Closed hellais closed 4 months ago

hellais commented 6 months ago

At the moment, OONI Data serves 3 main purposes:

  1. Exposing an end user CLI tool installable via pip install oonidata
  2. Implementing the next gen data processing pipeline
  3. Exposing a REST API for the observation based measurements

Moreover, as part of all of the above, there is a fair amount of utility code that could be used as part of the data pipeline without necessarily needing all the extra dependencies.

It would be worth considering splitting this up into at least 3 packages that can be released and imported independently. We should look into the current best practice for this kind of Python monorepo setup.

From a quick search, people seem to mention hatch or pants as possible solutions.

If we do this, we should probably ditch poetry in favor of whatever else we pick.

hellais commented 4 months ago

This is done in #60