Data persistence solution - btrfs snapshots to S3

GoodDollar / GoodGunReceiver

Example usage of gun-receiver

Data persistence solution - btrfs snapshots to S3 #17

Open AntonSemenenko opened 5 years ago

AntonSemenenko commented 5 years ago

This should be solved with backups. The scaling group provides redundancy, while each member periodically backs up its content to S3.

Backup options

Dump LMDB to a file periodically, then use an incremental backup tool (e.g. restic) to push it to S3.
Preferred option: write through to S3 directly from GunDB. Performance needs to be checked.
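
A minimal sketch of the first option, assuming the LMDB command-line tool mdb_dump and restic are installed on the instance. The paths, the bucket name, and the file naming scheme are hypothetical; restic additionally expects its repository password and the AWS credentials in the environment. The functions only build the command lines, so they can be reviewed before a cron job or timer runs them.

```python
from datetime import datetime, timezone


def dump_command(lmdb_dir, dump_dir, now):
    """Build an mdb_dump call that writes the LMDB environment to a timestamped file."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # File name pattern is an assumption made for this sketch.
    return ["mdb_dump", "-f", f"{dump_dir}/gun-{stamp}.dump", lmdb_dir]


def restic_backup_command(bucket, dump_dir):
    """Build a restic call that backs up the dump directory to an S3 repository.

    restic deduplicates data, so repeated runs only upload changed chunks.
    """
    return ["restic", "-r", f"s3:s3.amazonaws.com/{bucket}", "backup", dump_dir]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical paths and bucket; pass each list to subprocess.run(..., check=True).
    print(dump_command("/var/lib/gun/lmdb", "/var/backups/gun", now))
    print(restic_backup_command("example-gun-backups", "/var/backups/gun"))
```

Running the dump and the restic call back to back from a periodic job keeps the S3 repository at most one interval behind the live database.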

AntonSemenenko commented 5 years ago

For data persistence: we create an Elastic Block Store (EBS) volume to store the DB data and configure automatic snapshots for it. Then we create one EC2 instance separate from Elastic Beanstalk, attach the EBS volume to it, and deploy a DB there that is used for syncing with the other DBs and for backups. After that, instances created by Elastic Beanstalk sync with the backup instance through peer discovery. (Requires approval.)

AntonSemenenko commented 5 years ago

The persistence workflow:

1) Create Elastic Block Store volume for storing DBs data and configure the automatic snapshots for this volume.
2) Create one EC2 instance separately from Elastic Beanstalk, connect it to the EBS volume.
3) Deploy DB on the separate instance that will be utilized for the sync with the rest DBs and data backup purposes.
4) Configure Travis for auto-deployment.
5) Instances created by Elastic Beanstalk will sync through peer discovery with a backup instance.
6) Check whether the sync between instances works well.
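
The volume and snapshot steps could be sketched roughly as below, assuming a boto3 EC2 client is passed in (for example boto3.client("ec2")). The volume size, volume type, and device name are placeholders, and snapshot_volume takes a single snapshot that a scheduler would have to call repeatedly; it is not a full automatic snapshot policy.

```python
def provision_backup_volume(ec2, instance_id, availability_zone,
                            size_gib=20, device="/dev/sdf"):
    """Create an EBS volume and attach it to the dedicated backup instance.

    `ec2` is expected to behave like a boto3 EC2 client.
    Size, type, and device name are assumptions for this sketch.
    """
    volume = ec2.create_volume(
        AvailabilityZone=availability_zone,
        Size=size_gib,
        VolumeType="gp3",
    )
    volume_id = volume["VolumeId"]
    # A volume must reach the "available" state before it can be attached.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(Device=device, InstanceId=instance_id, VolumeId=volume_id)
    return volume_id


def snapshot_volume(ec2, volume_id, description="gun db backup"):
    """Take one point-in-time snapshot of the volume and return its id."""
    snapshot = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    return snapshot["SnapshotId"]
```

The volume has to be created in the same availability zone as the instance it is attached to.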

Approximate implementation time: 10-12h. Depending on how the sync behaves after all the steps, additional work may be required.

sirpy commented 5 years ago

Need clearer explanation on how new instances use the backup. Estimate per subtask

AntonSemenenko commented 5 years ago

Need clearer explanation on how new instances use the backup. Estimate per subtask

1) Create Elastic Block Store volume for storing DBs data and configure the automatic snapshots for this volume. - 2h
2) Create one EC2 instance separately from Elastic Beanstalk, connect it to the EBS volume. - 2h
3) Configure Travis for auto-deployment to new instance. - 4-6h
4) Check whether the sync between instances works well. - 2h

We will tag the backup instance, so that every instance in the autoscaling group can find it through the AWS SDK, connect to it after startup, and sync with it as a GunDB peer.
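
The tag lookup described above might look like the following sketch, assuming a boto3 EC2 client is passed in. The tag key and value (Role=gun-backup), the port 8765, and the /gun path are assumptions chosen for illustration; the thread does not specify them.

```python
def find_backup_peers(ec2, tag_key="Role", tag_value="gun-backup", port=8765):
    """Return peer URLs for running instances that carry the backup tag.

    `ec2` is expected to behave like a boto3 EC2 client.
    Tag name, tag value, port, and URL path are hypothetical.
    """
    response = ec2.describe_instances(
        Filters=[
            {"Name": f"tag:{tag_key}", "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    peers = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            ip = instance.get("PrivateIpAddress")
            if ip:  # instances without a private address are skipped
                peers.append(f"http://{ip}:{port}/gun")
    return peers
```

Each new instance would call this once at startup and hand the resulting list to its GunDB peer configuration. The sketch reads only the first page of results, which is enough when a single backup instance is tagged.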

As for the per-subtask estimates: they are not very accurate. We decompose the work step by step to make the workflow clear, but it is hard to predict exactly how long each action will take, especially when the total is only 12 hours. That is why we estimate the task as a whole.

sirpy commented 5 years ago

Putting on hold. The solution needs to be based on:

using instance local storage
formatting instance storage with a snapshot-capable filesystem or volume manager such as btrfs/LVM
taking hourly incremental snapshots
a new instance being able to pick any available snapshot
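
A rough sketch of the btrfs variant of these requirements, assuming the database lives in its own btrfs subvolume. The paths and the timestamp naming scheme are hypothetical. The functions only build the btrfs command lines; streaming the output of btrfs send to S3 and restoring it on a new instance with btrfs receive are left out.

```python
from datetime import datetime

# Timestamps in this format sort lexicographically in chronological order.
STAMP_FORMAT = "%Y%m%dT%H%M"


def snapshot_path(snap_dir, now):
    """Name a snapshot after the time it was taken (naming is an assumption)."""
    return f"{snap_dir}/gun-{now.strftime(STAMP_FORMAT)}"


def snapshot_command(subvolume, snap_dir, now):
    """Build the command for a read-only snapshot; btrfs send requires read-only."""
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume,
            snapshot_path(snap_dir, now)]


def send_command(snapshot, parent=None):
    """Build a btrfs send command; with a parent the stream is incremental."""
    cmd = ["btrfs", "send"]
    if parent:
        cmd += ["-p", parent]
    return cmd + [snapshot]


def latest_snapshot(names):
    """Pick the newest snapshot name, or None if there are no snapshots."""
    return max(names) if names else None
```

An hourly job would take a snapshot and send it with the previous snapshot as parent. Note that an incremental stream can only be applied on top of its parent, so a new instance restoring a given snapshot also needs the full stream and every increment leading up to it.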

AntonSemenenko commented 5 years ago

Please specify what you mean by local storage, to avoid misunderstandings.
https://www.palepurple.co.uk/filesystem-magic-aws-ebs-volumes-btrfs
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html
A new instance will provision a new volume and use it for its own needs. If you mean restoring the system state from a snapshot, we would need to investigate that; most likely it will require manual actions and a restart of the whole cluster.