thoth-station / thoth-application

Thoth-Station ArgoCD Applications

[Epic] Managing data across deployments #2513

Open fridex opened 2 years ago

fridex commented 2 years ago

Is your feature request related to a problem? Please describe.

As a Thoth operator, I would like to make sure data are properly managed across deployments. As of now, we compute data in the staging environment (given the resources available) and propagate a database dump to the prod environments. This has proven not to be scalable long-term: production can write into the database as well, so the environments easily get out of sync, and overwriting the prod database with staging data can lose information.

One of the proposed solutions discussed was to do per-table updates. This would introduce overhead and possible inconsistencies we should avoid (e.g. package entries created in the database by solvers could be overwritten by packages detected during container image analyses).


Another solution is to handle the syncs at the application level. In other words, we could keep running the background job that copies documents from the staging environment to production (document-sync-job). That job places the documents on Ceph in the production environment, and a subsequent graph-sync job then syncs them into the database so that they are available in prod even though they were computed in the staging environment. This approach seems to be scalable and should require less maintenance.
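For illustration only, a minimal sketch of how the two jobs could be chained in an Argo Workflow; the step names and image references are assumptions, not the actual deployment values:

```yaml
# Hypothetical sketch of the application-level sync described above:
# document-sync-job copies documents from the stage Ceph bucket into the
# prod bucket, then graph-sync ingests them into the prod database.
# Image references are placeholders, not deployment values.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: data-propagation-
spec:
  entrypoint: propagate
  templates:
    - name: propagate
      steps:
        - - name: document-sync        # stage Ceph -> prod Ceph
            template: document-sync-job
        - - name: graph-sync           # prod Ceph -> prod database
            template: graph-sync-job
    - name: document-sync-job
      container:
        image: quay.io/thoth-station/document-sync-job:latest  # assumed image
    - name: graph-sync-job
      container:
        image: quay.io/thoth-station/graph-sync-job:latest     # assumed image
```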

Additional Info: Epic: https://github.com/thoth-station/thoth-application/issues/2216

fridex commented 2 years ago

CC @Gregory-Pereira @harshad16

harshad16 commented 2 years ago

Thank you @fridex for the issue with all the details.

harshad16 commented 2 years ago

/label sig-devsecops

sesheta commented 2 years ago

@harshad16: The label(s) /label sig-devsecops cannot be applied. These labels are supported: community/discussion, community/group-programming, community/maintenance, community/question, deployment_name/ocp4-stage, deployment_name/ocp4-test, deployment_name/moc-prod, hacktoberfest, hacktoberfest-accepted, kind/cleanup, kind/demo, kind/deprecation, kind/documentation, kind/question, sig/advisor, sig/build, sig/cyborgs, sig/devops, sig/documentation, sig/indicators, sig/investigator, sig/knowledge-graph, sig/slo, sig/solvers, thoth/group-programming, thoth/human-intervention-required, thoth/potential-observation, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, triage/accepted, triage/duplicate, triage/needs-information, triage/not-reproducible, triage/unresolved, lifecycle/submission-accepted, lifecycle/submission-rejected

In response to [this](https://github.com/thoth-station/thoth-application/issues/2513#issuecomment-1097055547):

> /label sig-devsecops

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

harshad16 commented 2 years ago

/label sig/devops
/triage accepted

sesheta commented 2 years ago

@harshad16: The label(s) sig/devops cannot be applied, because the repository doesn't have them.

In response to [this](https://github.com/thoth-station/thoth-application/issues/2513#issuecomment-1097056112):

> /label sig/devops
> /triage accepted

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

Gregory-Pereira commented 2 years ago

* [ ]  make sure the sync-job uses the adjusted method and we can turn the job into a cronworkflow that periodically syncs data into the database (multiple sync jobs can be part of the cronworkflow to support parallel syncs)

I think Maya's current PR should update storages in such a way that the only changes needed in the sync-job are to bump the storages version once it goes through and to construct a CronWorkflow from the CronJob template in openshift.yaml.
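To make that last point concrete, here is a rough CronWorkflow sketch, assuming the container spec from the existing CronJob template in openshift.yaml can be embedded as a workflow template; the schedule, step names, and image are placeholders:

```yaml
# Hypothetical CronWorkflow for periodic graph syncs; the two steps in the
# same group run in parallel, illustrating multiple sync jobs per run.
# Schedule, names, and image are placeholders, not values from openshift.yaml.
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: graph-sync
spec:
  schedule: "0 */6 * * *"       # placeholder schedule
  concurrencyPolicy: Forbid     # do not start a new run while one is still active
  workflowSpec:
    entrypoint: sync
    templates:
      - name: sync
        steps:
          - - name: sync-solver-documents     # parallel sync jobs in one step
              template: graph-sync-job
            - name: sync-analysis-documents
              template: graph-sync-job
      - name: graph-sync-job
        container:
          image: quay.io/thoth-station/graph-sync-job:latest  # assumed image
```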