hasadna / Open-Knesset

A project aimed at making the Israeli Knesset more transparent. Python and Django based
http://oknesset.org/
BSD 3-Clause "New" or "Revised" License
106 stars 175 forks source link

Continuous Deployment #527

Open OriHoch opened 8 years ago

OriHoch commented 8 years ago

Each release of Open Knesset has a tag on GitHub, for example: https://github.com/hasadna/Open-Knesset/releases/tag/v2.9.0

I would like to be able to deploy a specific version to an environment.

Currently, we have only one environment (production), but I would like to support any number of different environments.

Each environment should have all its settings provided externally (e.g. using environment variables), so that the instance is generic (for example, an AMI file) and gets all its settings from outside (e.g. from Amazon Launch Configuration User Data).
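As a rough illustration of externally provided settings, here is a minimal Python 3 sketch that builds a settings dict purely from environment variables. The variable names (`OKNESSET_ENV`, `OKNESSET_DEBUG`, `OKNESSET_ALLOWED_HOSTS`, `OKNESSET_SECRET_KEY`) are hypothetical and do not come from the Open Knesset codebase.

```python
import os


def env_settings(environ=None):
    """Build a settings dict from environment variables.

    All variable names here are hypothetical, chosen for illustration only.
    """
    environ = os.environ if environ is None else environ
    hosts = environ.get("OKNESSET_ALLOWED_HOSTS", "")
    return {
        # Which environment this generic instance is running as.
        "ENVIRONMENT": environ.get("OKNESSET_ENV", "development"),
        # Treat "1", "true" and "yes" (any case) as enabled.
        "DEBUG": environ.get("OKNESSET_DEBUG", "false").lower() in ("1", "true", "yes"),
        # Comma-separated list; empty entries are dropped.
        "ALLOWED_HOSTS": [h.strip() for h in hosts.split(",") if h.strip()],
        # Required: raises KeyError if the variable is missing.
        "SECRET_KEY": environ["OKNESSET_SECRET_KEY"],
    }
```

A Django settings module could then assign these values at import time, so the same image runs unchanged in every environment.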

Ideally we would also have a dashboard that lets us manage all the instances, run the deployments, and see the monitoring and logs of all instances, etc.

OriHoch commented 8 years ago

Each environment should have different roles

Currently, we have 2 known roles:

OriHoch commented 8 years ago

The DB and any other external resources should not be part of this instance, they should be connected using the instance parameters.
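One common way to connect an external DB through instance parameters is a single connection URL. The sketch below (Python 3) turns such a URL into a Django-style `DATABASES` entry. The `DATABASE_URL` variable name and the example URL are assumptions for illustration, and the `django.db.backends.postgresql` engine name applies to Django 1.9 and later.

```python
import os
from urllib.parse import unquote, urlparse


def database_from_url(url):
    """Convert a postgres:// URL into a Django DATABASES entry."""
    parsed = urlparse(url)
    if parsed.scheme not in ("postgres", "postgresql"):
        raise ValueError("unsupported database scheme: %r" % parsed.scheme)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parsed.path.lstrip("/"),
        # Credentials may be percent-encoded inside the URL.
        "USER": unquote(parsed.username or ""),
        "PASSWORD": unquote(parsed.password or ""),
        "HOST": parsed.hostname or "",
        "PORT": str(parsed.port or ""),
    }


if __name__ == "__main__":
    # DATABASE_URL is a hypothetical variable name; the default is a placeholder.
    url = os.environ.get("DATABASE_URL", "postgres://user:secret@localhost:5432/oknesset")
    print(database_from_url(url))
```

With this approach the instance itself holds no database, only the address of one.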

alonisser commented 8 years ago

Is Oknesset using AWS? Then db/memcache/redis should use Amazon services: RDS for Postgres, ElastiCache for redis/memcache, etc. Using user data is actually an anti-pattern (mostly manual in AWS). I believe (and practice) that it's better to deploy an .envs file as part of the deployment process, source it in .bashrc, and put the environment variables there and/or in supervisord.conf (which should also be part of the deployment).

OriHoch commented 8 years ago

Currently Open Knesset is using Amazon EC2 instances + ELB (load balancer).

I'm used to working with user data and launch configurations, but I'm open to any other implementation..

Also, if you want to migrate away from AWS, that's also an option

habeanf commented 8 years ago

We're on AWS but we don't use RDS. The primary DB server also runs the cron jobs, including the presence code which needs to be rewritten at some point.

While it was free, we used ElastiCache, but then we moved back to having memcached on the DB server.

We don't use supervisor because we didn't feel we need it yet, but it would be great to have it if you'd like to work on it. Ideally we'd dockerise the web server, which might make it easier to hack on oknesset.


habeanf commented 8 years ago

I just noticed dockerisation is the issue. Anyhow, we started working on a dockerised DB server at one of the hackathons. I dropped it when I couldn't figure out how to give postgres a non-docker directory for the DB files.

alonisser commented 8 years ago

I'm for using Amazon wherever possible. While dockerising Postgres is possible, I would not do it in the production environment; I'd rather have an optimized DB instance and configuration than hack together a Postgres instance myself. Supervisor is easy; I can write a supervisor file if preferred.


habeanf commented 8 years ago

If you've got a solution for the cron jobs you'll have my vote to move to RDS.

alonisser commented 8 years ago

They can run on a micro instance, or use [Data Pipeline](http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/what-is-datapipeline.html). I haven't tried that myself, but it looks like the right direction. I can research it.


habeanf commented 8 years ago

I suggest you take a look at what the cron jobs do. I'll add the current cron from production as a comment


OriHoch commented 8 years ago

Updated crontab is at https://github.com/hasadna/Open-Knesset/blob/master/deploy/crontab.txt

alonisser commented 8 years ago

@habeanf looks like what I suggested could work. working example using ecs + data pipelines here: working example

habeanf commented 8 years ago

Also, AWS Data Pipeline looks interesting, but take the following into consideration:

a. The sadna in general would like to move to Google Cloud Platform, so we wouldn't want to be tied to AWS-specific infrastructure.

b. It looks like we have to pay for all these jobs, of which we have ~10 "low frequency" and 4-5 "high frequency". That will add another ~$10 a month.

alonisser commented 8 years ago

So a micro instance + RDS is also out of consideration (since RDS is a specific AWS offering) ?


habeanf commented 8 years ago

RDS is a problem in that sense, yes. Also, RDS is more expensive, especially for Multi-AZ deployments. In fairness, we're currently using an m1.small for the master DB and a t2.small for the slave, but replication broke and was never reset. Ideally we'd have something like 2x m1.small. I think I checked and the price difference is almost 2x. We're in the EU Ireland region without reserved instances. Moving to Google should reduce our costs due to their pricing model, without requiring up-front payments.

habeanf commented 8 years ago

I checked prices: 2x m1.small would be $0.094/h while RDS would be $0.128/h (Postgres, Multi-AZ, EU Ireland). That works out to a difference of about $25 a month. Also, the primary DB does the cron jobs, so there's no need for a micro either. But I do agree it's quite ad hoc and a systematic solution is better; case in point being that replication is not working at the moment.
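The monthly figure can be checked with a short calculation, using the two hourly prices quoted in this comment. The 730 hours per month (365 × 24 / 12) is an assumption; the comment does not say which month length was used.

```python
HOURS_PER_MONTH = 730  # assumption: 365 * 24 / 12

EC2_HOURLY = 0.094  # 2x m1.small, as quoted in the comment
RDS_HOURLY = 0.128  # RDS Postgres, Multi-AZ, EU Ireland, as quoted


def monthly_difference(cheaper, pricier, hours=HOURS_PER_MONTH):
    """Return the extra monthly cost of the pricier hourly rate, in dollars."""
    return round((pricier - cheaper) * hours, 2)


if __name__ == "__main__":
    print(monthly_difference(EC2_HOURLY, RDS_HOURLY))
```

Under that assumption the gap comes to a little under $25 a month, consistent with the figure above.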

alonisser commented 8 years ago

No replication 😲 ... OK, so RDS is currently not an option. But we can set up a DB in Google Cloud and just use it from AWS.


habeanf commented 8 years ago

Well, once the web server is dockerised we can do everything from Google Cloud.

habeanf commented 8 years ago

And no, no replication. There used to be replication, but it broke and I never reset it 😳 However, we do have a daily backup to S3 which I check and use periodically, so it's bad but not that bad.

habeanf commented 8 years ago

Re: using a micro instance to run the jobs. One interesting approach could be to have that instance start up automatically on demand, like in the link you provided. It might require more than meets the eye, because it's important that we get the logs from the runs, especially if scraping is failing.
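To make the logging concern concrete, here is a minimal Python 3 sketch of a wrapper that runs a job and appends its output and exit code to a log file. The `run_job` helper and the log path are hypothetical; this is not how the production cron jobs currently run.

```python
import subprocess
import sys
from datetime import datetime, timezone


def run_job(command, log_path):
    """Run a command, append its combined output and exit code to log_path.

    Returns the exit code so the caller can alert on failures.
    """
    started = datetime.now(timezone.utc).isoformat()
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        universal_newlines=True,
    )
    with open(log_path, "a") as log:
        log.write("[%s] $ %s\n" % (started, " ".join(command)))
        log.write(result.stdout)
        if result.stdout and not result.stdout.endswith("\n"):
            log.write("\n")
        log.write("[exit code %d]\n" % result.returncode)
    return result.returncode


if __name__ == "__main__":
    # Usage: python run_job.py <log file> <command> [args...]
    sys.exit(run_job(sys.argv[2:], sys.argv[1]))
```

On an instance that shuts down after each run, the log file would also have to be copied somewhere durable before shutdown, which is the part that may take more work than it first appears.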