godber closed this 1 month ago
After further discussion, I don't think this is what we need. What is needed is to annotate jobs that have upstream dependencies with the job_id of the upstream job so that it can be paused. There may be cases where this label would be useful, but it would be a low-impact change, so I am just going to close this.
We have this notion of "Stateful Workers" as described here:
https://terascope.github.io/teraslice/docs/configuration/clustering-k8s/#stateful-workers
In Teraslice jobs you can set the "stateful": true property, and the workers will be slightly different (see docs). I've realized that this is actually a property of the asset in use, and it should probably be set automatically simply by using a "stateful" asset. In general this is the stateful processor being used:
https://github.com/terascope/elasticsearch-assets/tree/master/asset/src/elasticsearch_state_storage
This probably needs an internal API added to Teraslice that the processor will use to indicate that it is stateful, perhaps a property on the processor that Teraslice checks.
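As a rough sketch of what that could look like (nothing here is an existing Teraslice API: `OperationModule`, its `stateful` flag, `JobConfig`, and `resolveStateful` are all hypothetical names), Teraslice could look up each operation in the job and turn on `stateful` when any of them declares itself stateful:

```typescript
// Hypothetical shapes for illustration only; not part of the Teraslice API.
interface OperationModule {
    name: string;
    // Assumed marker that a processor (or its asset) would expose.
    stateful?: boolean;
}

// Simplified job shape; only the fields needed for this sketch.
interface JobConfig {
    name: string;
    stateful?: boolean;
    operations: { _op: string }[];
}

// Returns a copy of the job with `stateful` derived from its operations.
// An explicit "stateful": true on the job is still honored.
function resolveStateful(
    job: JobConfig,
    registry: Map<string, OperationModule>
): JobConfig {
    const usesStatefulOp = job.operations.some(
        (op) => registry.get(op._op)?.stateful === true
    );
    return { ...job, stateful: job.stateful === true || usesStatefulOp };
}
```

With something like this in place, a job that uses a stateful processor would get stateful workers without the author having to remember the job-level setting, and existing jobs that set it explicitly would keep working.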