This could also be named "remove Koza's assumption that only one ingest will run at a time." Koza expects to have only one source file loaded (which contains the reader), and a single writer associated with that source.
This regularly causes odd test results: when a single ingest fails, the failure causes strange downstream side effects. It also prevents running multiple ingests in parallel within the same running Python process.
This might also take us down the road towards parallelizing within an ingest, which would be a huge improvement, but that isn't what this ticket is about.
Acceptance Criteria
An integration test that successfully runs two of the example ingests in parallel, by some method (possibly Dask.delayed with Dask as a dev dependency? maybe base Python multiprocessing?).
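A rough sketch of the shape such a test could take. It uses the standard library's concurrent.futures thread pool rather than Dask or multiprocessing, only because threads share one Python process and so would expose shared-singleton cross-talk. run_ingest is a hypothetical stand-in for invoking an example ingest, not a real Koza function, and the ingest names and rows are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def run_ingest(ingest_name: str, rows: list[dict]) -> list[dict]:
    """Hypothetical stand-in for running one Koza ingest end to end."""
    # Tag every row with the ingest that produced it so cross-talk is detectable.
    return [{**row, "ingest": ingest_name} for row in rows]


def test_two_ingests_in_parallel() -> None:
    # Made-up inputs standing in for two of the example ingests.
    inputs = {
        "ingest_a": [{"id": "A:1"}, {"id": "A:2"}],
        "ingest_b": [{"id": "B:1"}],
    }

    # Submit both ingests at once, then collect each result by ingest name.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            name: pool.submit(run_ingest, name, rows)
            for name, rows in inputs.items()
        }
        results = {name: future.result() for name, future in futures.items()}

    # Each ingest should see only its own rows, and all of them.
    for name, rows in results.items():
        assert len(rows) == len(inputs[name])
        assert all(row["ingest"] == name for row in rows)
```

In a real test, run_ingest would be replaced by whatever entry point actually runs an example ingest, and the assertions would check that ingest's output files.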
Implementation thoughts
self.source will need to become a dictionary that maps a source name to the source instance
self.writer will also need to become a dictionary, mapping a source name to the writer instance
Within ingests, reading and writing may need to pass the ingest name, and possibly koza_app.next_row will need it as well.
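The points above could look roughly like the following. This is a minimal sketch and not Koza's actual class: MultiSourceApp, register, and write are hypothetical names, a plain iterator stands in for a source, and a list stands in for a writer.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


class MultiSourceApp:
    """Hypothetical sketch of an app that keeps per-ingest state in dictionaries."""

    def __init__(self) -> None:
        # Source name -> source (an iterator of rows stands in for a reader).
        self.source: Dict[str, Iterator[Row]] = {}
        # Source name -> writer (a list of written rows stands in for a writer).
        self.writer: Dict[str, List[Row]] = {}

    def register(self, ingest_name: str, rows: List[Row]) -> None:
        """Set up an independent source and writer for one ingest."""
        self.source[ingest_name] = iter(rows)
        self.writer[ingest_name] = []

    def next_row(self, ingest_name: str) -> Row:
        """Return the next row for the named ingest only."""
        return next(self.source[ingest_name])

    def write(self, ingest_name: str, row: Row) -> None:
        """Send a row to the named ingest's writer only."""
        self.writer[ingest_name].append(row)


app = MultiSourceApp()
app.register("ingest_a", [{"id": "A:1"}])
app.register("ingest_b", [{"id": "B:1"}])

# Each ingest passes its own name when reading and writing.
app.write("ingest_a", app.next_row("ingest_a"))
app.write("ingest_b", app.next_row("ingest_b"))
```

The cost of this approach is that every existing ingest would have to start passing its name on each call.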
A totally different implementation might be to handle this at a higher level, so that rather than a pure singleton, there is a koza_app for each ingest. That would mean cli_runner no longer has a single global koza_app, but instead holds a dictionary of koza_apps, one for each source. It's possible that this would be the cleanest for backwards compatibility? Ingests that don't care about being parallel safe could keep importing koza_app, for example, while parallel-safe ingests would instead use a koza_apps singleton that is a dictionary mapping each ingest name to the koza_app instance that belongs to that ingest.
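One way the higher-level alternative might be sketched, assuming a module-level registry. KozaAppSketch and get_koza_app are hypothetical names used only for illustration; only koza_app and koza_apps come from the description above.

```python
from typing import Any, Dict, List


class KozaAppSketch:
    """Hypothetical stand-in for a per-ingest koza_app instance."""

    def __init__(self, ingest_name: str) -> None:
        self.ingest_name = ingest_name
        # Per-instance state, so nothing is shared between ingests.
        self.rows_written: List[Dict[str, Any]] = []


# Registry mapping each ingest name to the app instance that belongs to it.
koza_apps: Dict[str, KozaAppSketch] = {}


def get_koza_app(ingest_name: str) -> KozaAppSketch:
    """Return the app for this ingest, creating it on first use."""
    if ingest_name not in koza_apps:
        koza_apps[ingest_name] = KozaAppSketch(ingest_name)
    return koza_apps[ingest_name]


# Assumption: ingests that don't care about parallel safety keep importing a
# single module-level koza_app, which here simply points at one registry entry.
koza_app = get_koza_app("legacy_ingest")
```

A parallel-safe ingest would call get_koza_app with its own name, so existing ingests that import koza_app would not need to change.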