alexec opened 3 years ago
This would be fantastic to lower the operational costs of running a workflow archive since you either have to DIY it, or use managed postgres/mysql instances which can be relatively expensive.
Some initial thoughts since I was researching/thinking about it this morning:
I don't know enough about GCP buckets, but I don't think S3 alone is sufficient for what we would need to implement an alternate archive location. If the requirement were simply "give me the results of this workflow", we could use the workflow name as the object key and it would be relatively straightforward. However, we have other access patterns, such as listing archived workflows by namespace, by label selector, or by time range, and any combination of those. Implementing all of that with the LIST operation is likely to be slow and expensive for anything beyond experimental setups, and it gets worse if you have a long TTL on your archived workflows.
There's precedent for this in other projects (see Cortex, Loki) for storing the bulk of data in an object store while maintaining a separate index that contains metadata to direct you to the correct objects. In the case of the projects listed above, they support options that include DynamoDB, Bigtable, and Cassandra.
Loki has also recently shipped an index store called boltdb-shipper, which uses BoltDB as the index store and syncs the data to S3, eliminating the requirement for a separate service like DynamoDB. While this is pretty neat, it may also imply the need for persistence and/or multiple replicas in order to avoid data loss in the event that a workflow controller is lost or crashes.
As an interim solution, what would you think about implementing something like an HTTP persistence layer? i.e.
```yaml
persistence:
  archive: false
  archiveTTL: 180d
  http:
    url: https://my-experimental-archive-service
  archiveLabelSelector:
    matchLabels:
      workflows.argoproj.io/archive-strategy: "always"
```
The idea behind that is rather than having the archive implementation built into Argo, we enable people to implement their own solutions by implementing a well-known API. That would allow us to iterate on solutions outside of the project itself, then we could eventually merge one or more solutions back in once they've been proven. The actual implementation of "official" solutions could then be moved to labs projects. One issue to consider here would be how to secure those new implementations; it could be something as simple as a shared secret initially.
See the PR for testing instructions.
Summary
Archived workflows could be written to an artifact repository, e.g. an S3 or GCP bucket
Use Cases
When would you use this?
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.