Open OriHoch opened 8 years ago
Each environment should have different roles
Currently, we have 2 known roles:
The DB and any other external resources should not be part of this instance, they should be connected using the instance parameters.
Is Oknesset using AWS? If so, db/memcache/redis should use Amazon services: RDS for Postgres, ElastiCache for redis/memcache, etc. Using user data is actually an anti-pattern (it's mostly manual in AWS). I believe (and practice) that it's better to deploy an .envs file as part of the deployment process, source it in .bashrc, and put the environment variables there and/or in supervisord.conf (which should also be part of the deployment).
Currently Open Knesset is using Amazon EC2 instances + ELB (load balancer).
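To make the .envs idea concrete, here is a minimal sketch, assuming the file holds plain `KEY=VALUE` lines (optionally prefixed with `export` so it can also be sourced from .bashrc). It parses the file and renders the same values as a supervisord `environment=` value. The function names and the sample keys are hypothetical, not something that exists in the repo.

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Allow "export KEY=VALUE" so the same file can be sourced by a shell.
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment, ignore it
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def supervisor_environment(settings):
    """Render settings as a supervisord `environment=` value."""
    # supervisord uses % for expansions in its config, so literal % is doubled.
    pairs = [
        '%s="%s"' % (key, value.replace("%", "%%"))
        for key, value in sorted(settings.items())
    ]
    return ",".join(pairs)


# Hypothetical sample content for an .envs file.
SAMPLE = """
# deployed together with the code
export APP_ENV=staging
DATABASE_URL="postgres://user:secret@db.example.org:5432/oknesset"
"""

if __name__ == "__main__":
    print("environment=" + supervisor_environment(parse_env_file(SAMPLE)))
```

The point of keeping one file is that the shell and supervisord see the same values, and the file ships with each deployment instead of being typed into user data by hand.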
I'm used to working with user data and launch configurations, but I'm open to any other implementation..
Also, if you want to migrate away from AWS, that's also an option
We're on AWS but we don't use RDS. The primary DB server also runs the cron jobs, including the presence code which needs to be rewritten at some point.
While it was free, we used ElastiCache, but then we moved back to having memcached on the DB server.
We don't use supervisor because we didn't feel we need it yet, but it would be great to have it if you'd like to work on it. Ideally we'd dockerise the web server, which might make it easier to hack on oknesset.
I just noticed dockerisation is the issue. Anyhow, we started working on a dockerised DB server at one of the hackathons. I dropped it when I couldn't figure out how to give postgres a non-docker directory for the DB files.
I'm for using Amazon wherever possible. While dockerisation of Postgres is possible, I would not do it in the production environment; I'd rather have an optimized DB instance and configuration than hack together a Postgres instance myself. Supervisor is easy; I can write a supervisor file if preferred.
Twitter:@alonisser https://twitter.com/alonisser LinkedIn Profile http://www.linkedin.com/in/alonisser Facebook https://www.facebook.com/alonisser _Tech blog:_4p-tech.co.il/blog _Personal Blog:_degeladom.wordpress.com Tel:972-54-6734469
If you've got a solution for the cron jobs you'll have my vote to move to RDS.
They can run on a micro instance, or use [Data Pipeline](http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/what-is-datapipeline.html). I haven't tried that myself, but it looks like the right direction; I can research it.
I suggest you take a look at what the cron jobs do. I'll add the current cron from production as a comment
Updated crontab is at https://github.com/hasadna/Open-Knesset/blob/master/deploy/crontab.txt
@habeanf, it looks like what I suggested could work. There is a working example using ECS + Data Pipeline here: working example
Also, AWS Data Pipeline looks interesting, but take the following into consideration:
a. The sadna in general would like to move to Google Cloud Platform, so we wouldn't want to be tied to AWS-specific infrastructure.
b. It looks like we have to pay for all these jobs, of which we have ~10 "low frequency" and 4-5 "high frequency". That will add another ~$10 a month.
So is a micro instance + RDS also out of consideration (since RDS is an AWS-specific offering)?
RDS is a problem in that sense, yes. Also, RDS is more expensive, especially for Multi-AZ deployments. In fairness, we're currently using an m1.small for the master DB and a t2.small for the slave, but replication broke and it was never reset. Ideally we'd have something like 2x m1.small. I think I checked and the price difference is almost 2x. We're in the EU Ireland region without reserved instances. Moving to Google should reduce our costs thanks to their pricing model, without requiring up-front payments.
I checked prices: 2x m1.small would be 0.094 / h while RDS would be 0.128 / h (Postgres, Multi-AZ, EU Ireland). That works out to a difference of about $25 a month. Also, the primary DB does the cron jobs, so there's no need for a micro either. But I do agree it's quite ad hoc and a systematic solution is better; case in point, replication is not working at the moment.
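For anyone checking the arithmetic, a quick sketch using the two hourly figures quoted above. The 730 hours per month is an assumption (8760 hours in a year divided by 12); everything else comes from the comment.

```python
HOURS_PER_MONTH = 730  # assumption: 8760 hours per year / 12

EC2_HOURLY = 0.094  # 2x m1.small, as quoted above
RDS_HOURLY = 0.128  # RDS Postgres, Multi-AZ, EU Ireland, as quoted above


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Cost of running continuously for one month at the given hourly rate."""
    return hourly_rate * hours


difference = monthly_cost(RDS_HOURLY) - monthly_cost(EC2_HOURLY)

if __name__ == "__main__":
    print("EC2: %.2f  RDS: %.2f  difference: %.2f" % (
        monthly_cost(EC2_HOURLY), monthly_cost(RDS_HOURLY), difference))
```

So the "$25 a month" is the extra cost of RDS over the two self-managed instances, not the total bill for either option.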
No replication 😲 ... OK, so RDS is currently not an option. But we can set up a DB in Google Cloud and just use it from AWS.
Well once the web server is dockerised we can do everything from google cloud..
And no, no replication. There used to be replication but it broke and I never reset it :flushed: However, we do have a daily backup to S3 which I check and use periodically, so it's bad but not that bad.
Re: using a micro instance to run the jobs. One interesting approach could be to have that instance start up automatically on demand, like in the link you provided. It might require more than meets the eye, because it's important that we get the logs from the runs, especially if scraping is failing.
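On the logging concern, a rough sketch of what a job wrapper on such an on-demand instance could look like: it runs one command, writes stdout and stderr to a timestamped log file, and returns the exit code so a failed scrape is visible. The `run_job` name and the log layout are hypothetical, and shipping the log file somewhere durable before the instance shuts down is left out.

```python
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path


def run_job(name, command, log_dir):
    """Run one scheduled job and keep its combined output.

    Returns (exit_code, log_path). A non-zero exit code means the job failed.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    log_path = log_dir / f"{name}-{stamp}.log"
    with open(log_path, "w") as log:
        # stderr is merged into stdout so tracebacks land in the same file.
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode, log_path
```

Each crontab entry would call something like `run_job("some-job", [...], "/some/log/dir")` with its own command, and the instance would only stop after the logs have been copied off it.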
Each release of Open Knesset has a tag on GitHub, for example: https://github.com/hasadna/Open-Knesset/releases/tag/v2.9.0
I would like to be able to deploy a specific version to an environment.
Currently, we have only one environment, production, but I would like to support any number of different environments.
Each environment should have all its settings provided externally (e.g. using environment variables), so that the instance is generic (for example, an AMI) and gets all its settings from outside (e.g. from Amazon Launch Configuration User Data).
Ideally we would also have a dashboard that lets us manage all the instances, do the deployments, and see the monitoring and logs of all instances, etc.
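A minimal sketch of what "generic instance, settings from outside" could look like on the application side, assuming the settings arrive as environment variables. The variable names (`OKNESSET_ENV`, `OKNESSET_ROLE`, `DATABASE_URL`, `CACHE_URL`) and the `InstanceSettings` class are hypothetical and only illustrate the idea.

```python
import os
from dataclasses import dataclass

# Hypothetical names; the instance refuses to start without these.
REQUIRED = ("OKNESSET_ENV", "OKNESSET_ROLE", "DATABASE_URL")


@dataclass(frozen=True)
class InstanceSettings:
    environment: str   # e.g. "production" or "staging"
    role: str          # which role this instance plays
    database_url: str  # external DB, not part of the instance
    cache_url: str     # optional external cache, empty if unset


def load_settings(env=None):
    """Build settings from environment variables, failing fast if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if key not in env]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return InstanceSettings(
        environment=env["OKNESSET_ENV"],
        role=env["OKNESSET_ROLE"],
        database_url=env["DATABASE_URL"],
        cache_url=env.get("CACHE_URL", ""),
    )
```

With this shape the same image can serve any environment or role; only the variables handed to it at launch differ, whichever mechanism ends up providing them.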