okfn-brasil / jarbas

🎩 API for information and suspicions about reimbursements by Brazilian congresspeople
https://jarbas.serenata.ai/
296 stars · 61 forks

Automate deploy (and, eventually, provision) #12

Closed · cuducos closed this issue 7 years ago

cuducos commented 7 years ago

The Digital Ocean droplet was set up manually, and the only piece of software that makes deploys easier is a git hook. Also, the nginx and gunicorn processes are owned by root — that's not cool.

Something like Ansible might help.


Personally I don't have the devops know-how to fix the users and processes issues, and I've never used Ansible — so any help is appreciated. Surely I can share the application-specific logic and help anyone willing to close this issue ; )
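For illustration, a minimal Ansible sketch of the kind of fix described above: an unprivileged user that could own the gunicorn process. The host group, user name, and package list are assumptions, not taken from the actual droplet.

```yaml
# Hypothetical playbook; the "jarbas" host group and "deploy" user are made up.
- hosts: jarbas
  become: yes
  tasks:
    - name: create an unprivileged user to own the app processes
      user:
        name: deploy
        shell: /bin/bash

    - name: install nginx from the distro packages
      apt:
        name: nginx
        state: present
```

gunicorn would then run under `deploy` via a process manager (supervisor, systemd, or similar) instead of as root.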

luiz-simples commented 7 years ago

Hey, I can help you set up deploys for complex environments using Docker. It's possible to run nginx and other processes that need root inside Docker containers, without needing superuser access to the DigitalOcean VPS.

Here is a flow similar to what I've been building: http://blog-assets.risingstack.com/2015/01/codeship_ansible_docker.png

cuducos commented 7 years ago

Great! I have little experience with Docker, but with a tutor I guess I can use it properly. When are you available to talk about it?



luiz-simples commented 7 years ago

Send me your phone number or call me: (48) 8810-8161. The idea is to Dockerize your project and automate the deploy via containers, doing the build and creating a container with the final production version of your project, whether it's full-stack or an SPA. It's even possible to version those containers in the cloud, making deploy rollbacks easy when needed. Cheers.
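The build/version/rollback flow described above can be sketched with plain docker commands; the registry address and tags below are invented for illustration, not real infrastructure.

```shell
# Hypothetical registry and tags, for illustration only.
docker build -t registry.example.com/jarbas:1.2.0 .
docker push registry.example.com/jarbas:1.2.0

# Deploy: run the freshly pushed tag.
docker run -d --name jarbas -p 8000:8000 registry.example.com/jarbas:1.2.0

# Rollback: remove the new container and run the previous tag.
docker rm -f jarbas
docker run -d --name jarbas -p 8000:8000 registry.example.com/jarbas:1.1.0
```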

cuducos commented 7 years ago

Sure thing — I'm gonna drop you a line at your Gmail. I've played around with Docker, so I get the idea; I just don't know how to move from theory to practice. Anyway, thanks for the help; I'll follow up via email.

vitallan commented 7 years ago

How is this one going? Is the idea to create a Dockerfile and use Travis to build the image and deploy the application?

gwmoura commented 7 years ago

@vitallan, I think so. The idea is to deliver the app in a container: we can use any CI service to build an image, save it to a registry, and then deploy the app.

gwmoura commented 7 years ago

@cuducos can I help you too? My Gmail is gwmoura@gmail.com

cuducos commented 7 years ago

Many thanks, @vitallan and @gwmoura for jumping in!

This Tuesday I was in a call with @luiz-simples and he's probably working on a branch to fix this issue.

Taking baby steps, we'll probably start with Docker just for a dev environment, later integrate it with CI (Travis is already in use for tests here, so we'll probably stick with Travis), and then automate the deploy.

Feel free to talk to @luiz-simples so you guys can find the best way to help him.

I'm gonna introduce you all via email. And count on me for questions about the repo/app, for pairing, etc.
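Since Travis is already in place for tests, building the image in CI could eventually be a small addition to .travis.yml. A hedged sketch (the image name is an assumption, and Travis's Docker support of that era required `sudo: required`):

```yaml
# Sketch only, not the project's actual .travis.yml.
sudo: required
services:
  - docker
script:
  - docker build -t jarbas .
  - docker run jarbas python manage.py test
```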

ayr-ton commented 7 years ago

@vitallan @gwmoura @cuducos May I help? (: I could help with the provisioning scripts for the Droplet, creating the Ansible scripts to deploy the Docker containers Luiz wants to create.

cuducos commented 7 years ago

I could help with the provisioning scripts for the Droplet, creating the Ansible scripts to deploy the Docker containers Luiz wants to create.

Looks good to me. I'd love to learn bits of Ansible — it's on my TODO list anyway…

pedrommone commented 7 years ago

What should this architecture look like? Something like nginx, python-cgi, and some database?

We don't need to build all the stuff ourselves; that's why Docker is magic! :)

pedrommone commented 7 years ago

@luiz-simples you'll need something to orchestrate the containers; maintaining it hard-coded is a waste of time.

In my opinion, something like Rancher (hosted) or AWS EC2 (PaaS) is a lot better (I can help with both).

cuducos commented 7 years ago

@pedrommone many thanks!

What should this architecture look like?

I have almost no experience with Docker, but I'll try to help; please follow up if what I'm saying is nonsense. The install steps (i.e. everything we need to do to get stuff running) are detailed in the README.md, so what I imagine is:

Is this of any help?

In my opinion, something like Rancher (hosted) or AWS EC2 (PaaS) is a lot better (I can help with both).

We used to use a PaaS, but now we are powered (supported, sponsored) by Digital Ocean, so we are deploying to a Digital Ocean droplet. Is that ok?

gwmoura commented 7 years ago

@luiz-simples started a branch with a Docker env: https://github.com/luiz-simples/jarbas/tree/docker-environment. The branch looks good to me; we'll wait for the PR to test it in dev, and after that we can work on the deploy to Digital Ocean.

pedrommone commented 7 years ago

@cuducos thanks for the awesome info.

About DO, it's totally fine. I can help you with that and get something working well as soon as I get some time to work on the containers.

I will lay out the following steps:

What do you guys think?

cuducos commented 7 years ago

Two minor doubts:

  1. As a newbie to the Docker world I hadn't heard of Rancher before — what problem does it target? I watched the welcome video, but I probably miss the limitations of the environment needed to understand the solution Rancher offers (I mean, from my naïve point of view, docker-compose.yml would wire up the infrastructure; so maybe the idea is to simplify deploying to different environments, but then… again… I have no idea how it's done without Rancher because I'm a newbie hahaha).
  2. What do you all think is better: to split backend and front-end before we work on the Docker stuff, to split while we create this new Docker infrastructure, or to get it going with Docker and then split?
pedrommone commented 7 years ago

@cuducos here we go:

  1. Rancher lets you orchestrate your containers; each project is called a stack, and it lets you handle everything via its API. It just rocks! You can see more here.
  2. The promise of Docker is to separate each process into its own container: one for Postgres, one for nginx, one for Python.
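The one-process-per-container idea can be sketched as a docker-compose.yml. Service names, images, and ports here are illustrative assumptions, not the project's actual configuration:

```yaml
# Illustrative compose file (version 2 era); names and ports are made up.
version: "2"
services:
  postgres:
    image: postgres:9.6
  django:
    build: .
    command: gunicorn jarbas.wsgi --bind 0.0.0.0:8000
    depends_on:
      - postgres
  nginx:
    image: nginx:1.11
    ports:
      - "80:80"
    depends_on:
      - django
```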
pedrommone commented 7 years ago

Here is an example of two services:

Rancher Example

gwmoura commented 7 years ago

@pedrommone these steps are the same; for me it's a good option. I haven't worked with Rancher, but I hear good opinions about it :smile:. @luiz-simples is working on creating a base image for the project and images for dev and production; take a look at the branch when possible.

@cuducos

What do you all think is better: to split backend and front-end before we work on the Docker stuff, to split while we create this new Docker infrastructure, or to get it going with Docker and then split?

I think it's better to start by creating a Docker env for development, to help more people use and contribute to the project, and after that we split the project into backend and front-end.

pedrommone commented 7 years ago

@gwmoura you can easily bring everything up with docker-compose.yml. In my opinion, making separate dev and prod environments is a waste of time, since everything needs to evolve together.

gwmoura commented 7 years ago

I think that's better too; I prefer to create everything in docker-compose.yml. As 12factor.net says, prod and dev should be the same. @luiz-simples is using docker-compose.yml and created some scripts to help automate the deploy.

I am a developer, so I don't know if creating a single simple container can negatively impact production… we can work together and create a simple and efficient environment.

pedrommone commented 7 years ago

Simple containers are the "same" as VMs. If you split the services into containers, you can benefit from all the official Docker images. I'll write something and show you.

gwmoura commented 7 years ago

@pedrommone I know, you don't need to write it :smile:. Look at the branch — https://github.com/luiz-simples/jarbas/tree/docker-environment — and comment on what you think we need to remove.

pedrommone commented 7 years ago

@cuducos which Python and Django versions are we using?

cuducos commented 7 years ago

Python 3.5.2; Django is the latest version (1.10.something, specified in requirements-dev.txt).

ayr-ton commented 7 years ago

Now that the Dockerfile is built, is the next step to create the Ansible role (or something similar to Ansible) to automatically provision the droplet? Is that correct?

cuducos commented 7 years ago

I must say I'm kind of lost here. I guess @pedrommone got stuck at the issue with the CI services — but honestly I could not understand this part because my knowledge of the Docker environment is null.

As my idea was to have a working Dockerfile and docker-compose.yml locally, I gathered that having Elm (NodeJS) called from within Django was an issue, so I fixed it in a separate branch. I also volunteered to swap Travis CI for a different service, but no one could point me to a specific service yet.

@pedrommone can you update the checkboxes and clarify what's holding us up? Just asking because maybe more people are willing to jump in and help here — please don't get me wrong, no pressure, ok?

pedrommone commented 7 years ago

@cuducos don't worry :).

I got stuck on dependencies: we can't have NodeJS and Python in the same image (that's not the Docker way). When I get some time I'll work on that.

gomex commented 7 years ago

Hey!

I wanna help too! I know a little bit about Docker and Ansible. I will study the infra of datasciencebr.

IMHO we don't need to think about Rancher or anything like that yet. We can use some service to provide this infra for us: https://hyper.sh/ (for example! I haven't tested it yet!).

cuducos commented 7 years ago

Hi @gomex — I was about to read your book to try to fix that haha… well, I'm gonna read it anyway. The point is that we do not necessarily need Docker here, although I'm pretty sure Docker could help.

I'm mostly a developer, and I'm pretty familiar with Jarbas — count on me if you have any doubts about the stack (BTW, talking about Docker, it might be interesting to use the extract-nodejs branch — there, compiling assets is done via the NodeJS CLI, not within the Django pipeline; I'll update that branch now because it was left behind a couple of weeks ago). Unfortunately I'm a newbie in devops and Docker. Therefore I do appreciate your help ; )

gomex commented 7 years ago

@cuducos first thing: is it possible to pass the username and password to this command?

python manage.py createsuperuser

Can we specify it using environment variable?

cuducos commented 7 years ago

Hi @gomex — basically we can skip that. It's optional, and we don't have a lot of heavy stuff to do in the Django Admin. People who would use it will find their way to do it. Yet there's no way to pass the password (the best we can do is create a superuser without a password using --noinput — which I wouldn't recommend; too unsafe).

gomex commented 7 years ago

I am trying to use SQLite to avoid Postgres configuration for now and I got this error:

ValueError: SQLite backend does not support timezone-aware datetimes when USE_TZ is False.

I tried to add USE_TZ=True inside .env and didn't change that error :(
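For context, the message means the records being loaded carry timezone-aware datetimes while the settings have USE_TZ disabled. A quick standalone illustration of the naive vs. aware distinction (plain Python, no Django involved):

```python
from datetime import datetime, timezone

# A "naive" datetime carries no timezone information...
naive = datetime(1970, 1, 1)

# ...while an "aware" one has a tzinfo object attached.
aware = datetime(1970, 1, 1, tzinfo=timezone.utc)

print(naive.tzinfo)  # None
print(aware.tzinfo)  # UTC
```

With USE_TZ = True, Django works with aware datetimes end to end, which is what the fix later in this thread enables.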

cuducos commented 7 years ago

I'm sorry about that. Uncaught error, probably. Gonna take a look later today and update you here.


gomex commented 7 years ago

You can check using my Dockerfile on that repos:

https://github.com/gomex/jarbas

You can reproduce using this command:

docker build -t serenata-jarbas .

PS: Remember to use the same branch, "extract-nodejs".

Thanks!

cuducos commented 7 years ago

Hi @gomex, bug caught and squashed in 17d1f7f (within the extract-nodejs branch) — I tested it here with SQLite and it seems to be working.

gomex commented 7 years ago

Is there any way to build a "small" version of this data? When I run "python manage.py loaddatasets" and "python manage.py loadsuppliers" it takes too long.

cuducos commented 7 years ago

Not in a seamless way. I can create and share with you a small version of the datasets, and you save it locally. Once it's there, you call these commands with --source path/to/small-version/directory — does that work?

gomex commented 7 years ago

@cuducos let's try :)

cuducos commented 7 years ago

Here it is: https://www.dropbox.com/sh/cxuizj33rhb4c1n/AAAoT0qk0GGO1Fef-bf2Jj0aa?dl=0 — download the files and point both scripts to that folder with -s.

gomex commented 7 years ago

I tried that command:

python manage.py loaddatasets --source ./datasets

I got this error:

Starting with 0 documents
Loading ./datasets/2016-08-08-current-year.xz
Traceback (most recent call last):
  File "manage.py", line 22, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.5/site-packages/django/core/management/__init__.py", line 367, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.5/site-packages/django/core/management/__init__.py", line 359, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.5/site-packages/django/core/management/base.py", line 294, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.5/site-packages/django/core/management/base.py", line 345, in execute
    output = self.handle(*args, **options)
  File "/code/jarbas/core/management/commands/loaddatasets.py", line 30, in handle
    self.bulk_create_by(documents, options['batch_size'])
  File "/code/jarbas/core/management/commands/loaddatasets.py", line 87, in bulk_create_by
    for document in documents:
  File "/code/jarbas/core/management/commands/loaddatasets.py", line 40, in documents_from
    yield Document(**self.serialize(row))
  File "/usr/local/lib/python3.5/site-packages/django/db/models/base.py", line 555, in __init__
    raise TypeError("'%s' is an invalid keyword argument for this function" % list(kwargs)[0])
TypeError: '' is an invalid keyword argument for this function
cuducos commented 7 years ago

I'm sorry about that, @gomex. I couldn't get back to you earlier.

Anyway, I was pondering: these commands are basically a provisioning step, seeding the database. To test if the environment is set up properly and the app is working, you can just skip them and run the tests ($ python manage.py test). If they pass and the equivalent of http://localhost gives you an HTTP 200, I guess you're good to go.

Meanwhile I'll try to understand why the smaller datasets haven't worked and will update you all soon (I hope).

cuducos commented 7 years ago

@gomex, a few comments after testing your Dockerfile — probably newbie doubts, but anyway:

cuducos commented 7 years ago

I forgot: we need the --noinput argument to make collectstatic skip the confirmation prompt (i.e. $ python manage.py collectstatic --noinput).

gomex commented 7 years ago

@cuducos about your questions:

This branch (extract-nodejs) compiles Elm files (to .js) outside Django, thus instead of $ python manage.py assets build we need a $ npm run assets — and this might need a different container I guess (but all in all, if we go back to master and use $ python manage.py assets build Django will look for NodeJS bin anyway). Is it time to move to Docker Compose already?

I was trying to start from Python, but the idea is to use another image to build these Elm assets. The proposal is to build them inside one specific folder that would be mounted as a volume in the Python container too.

The Python container will access that folder and can see the Elm assets there.

Will the Python container show me something without the Elm assets? If not, what folders should I mount in both containers?

The Python package psycopg2 will fail (i.e. $ python -m pip install -r requirements-dev.txt); it requires a Postgres client installed on the machine (e.g. apt-get install postgresql postgresql-contrib). If we're following a baby-steps strategy, we can comment it out for a while, since you told me we're using SQLite to get started. Not sure how to fix that issue with a container mindset. Any ideas?

We can install PostgreSQL inside the Python container for now.
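A note on the psycopg2 part: on a Debian-based python image, pip compiling psycopg2 typically needs the libpq headers and a C compiler rather than a full Postgres server. A hedged sketch of what the install line could be (not what the thread actually used):

```dockerfile
# Sketch: build dependencies for psycopg2 on python:3.5 (Debian-based).
# libpq-dev ships the client headers; the postgresql server package
# should not be needed just to pip-install the driver.
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc libpq-dev \
    && rm -rf /var/lib/apt/lists/*
```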

For now I'm just skipping $ python manage.py loaddatasets and $ python manage.py loadsuppliers — once I can reach Django in a container, I'll move forward to seeding the database. Does that make sense?

Makes sense to me :)

gomex commented 7 years ago

Hey @cuducos

I commented out these commands:

RUN python manage.py loaddatasets --source ./datasets

RUN python manage.py loadsuppliers

RUN python manage.py ceapdatasets

RUN python manage.py assets build

RUN python manage.py collectstatic

And tried to run "python manage.py test", and I got this error:

Creating test database for alias 'default'...
/usr/local/lib/python3.5/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Document.issue_date received a naive datetime (1970-01-01 00:00:00) while time zone support is active.
  RuntimeWarning)
......./usr/local/lib/python3.5/site-packages/django/db/models/fields/__init__.py:1430: RuntimeWarning: DateTimeField Supplier.last_updated received a naive datetime (2016-11-11 01:17:01.431206) while time zone support is active.
  RuntimeWarning)
.................................FF............
======================================================================
FAIL: test_serializer (jarbas.core.tests.test_loadsuppliers_command.TestSerializer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/code/jarbas/core/tests/test_loadsuppliers_command.py", line 51, in test_serializer
    self.assertEqual(self.command.serialize(supplier), expected)
AssertionError: {'lon[37 chars]te': datetime.date(1969, 12, 31), 'opening': d[104 chars] 31)} != {'lon[37 chars]te': '1969-12-31', 'opening': '1969-12-31', 'l[59 chars]-31'}
  {'email': None,
   'latitude': 3.1415,
   'longitude': -42.0,
-  'opening': datetime.date(1969, 12, 31),
-  'situation_date': datetime.date(1969, 12, 31),
+  'opening': '1969-12-31',
+  'situation_date': '1969-12-31',
-  'special_situation_date': datetime.date(1969, 12, 31)}
?                            ^^^^^^^^^^^^^^    ^^  ^^  ^

+  'special_situation_date': '1969-12-31'}
?                            ^    ^  ^  ^

======================================================================
FAIL: test_to_date (jarbas.core.tests.test_loadsuppliers_command.TestSerializer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/code/jarbas/core/tests/test_loadsuppliers_command.py", line 30, in test_to_date
    self.assertEqual(self.command.to_date('22/7/91'), expected)
AssertionError: datetime.date(1991, 7, 22) != '1991-07-22'

----------------------------------------------------------------------
Ran 54 tests in 0.497s

FAILED (failures=2)
Destroying test database for alias 'default'...
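For what it's worth, the test_to_date failure pins down the expected contract: the serializer should return ISO-formatted date strings rather than datetime.date objects. A standalone sketch of a to_date that would satisfy that assertion (the command's real implementation may differ):

```python
from datetime import datetime

def to_date(value):
    """Parse a d/m/yy date string into an ISO-formatted string.

    strptime's %y maps two-digit years 69-99 to 1969-1999.
    """
    return datetime.strptime(value, "%d/%m/%y").date().isoformat()

print(to_date("22/7/91"))  # 1991-07-22
```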
cuducos commented 7 years ago

Indeed. When we changed USE_TZ to True, that started to happen. Fixed in 67069d7. I'm fully available today, @gomex — if you want to pair to speed things up, I'm up for it ; )

Addressing your questions (let me know if I accidentally leave any of them behind):

Python container will show me something without elm assets?

Running the app without the NodeJS part will not show anything in the browser except the proper title (i.e. <title>Jarbas | Serenata de Amor</title>) within the HTML <head>. If we get to that point, we're doing things right Python-wise. Added some tests for that in 1a083ca.

If not, what folders should I mount in both container?

Django collects all static files into an external directory (useful to serve them via nginx, for example). I believe the Elm files could compile to that same dir. But this is just an idea. We could use a different dir and add some lines to the nginx conf (i.e. having /static/ for Django-collected static files and /elm/ for Elm assets).
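The two-directories idea could look roughly like this in the nginx config; all paths and the upstream port here are assumptions, not the droplet's actual setup:

```nginx
server {
    listen 80;

    location /static/ {
        alias /srv/jarbas/staticfiles/;   # Django collectstatic output
    }

    location /elm/ {
        alias /srv/jarbas/elm/;           # compiled Elm assets
    }

    location / {
        proxy_pass http://127.0.0.1:8000; # gunicorn
    }
}
```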

We can install postgresql inside python container for now.

I added RUN apt-get install -y postgresql postgresql-contrib but got:

E: Unable to locate package postgresql
E: Unable to locate package postgresql-contrib

Not sure what to do… any clues?

gomex commented 7 years ago

@cuducos we can pair, but I am hospitalized right now. I'm having surgery on my collarbone today. Can we try to do that in a few days?

Thanks!

gomex commented 7 years ago

About your questions:

I added RUN apt-get install -y postgresql postgresql-contrib but got:

E: Unable to locate package postgresql
E: Unable to locate package postgresql-contrib

You need to run apt-get update before installing:

RUN apt-get update && apt-get install -y postgresql postgresql-contrib

Try that!

cuducos commented 7 years ago

OMG, I had no clue you were in the hospital. No worries, mate! I'm going through surgery later this month too. Get well and we'll talk later ; )

However, let me tell you the good news:

$ docker run jarbas python manage.py test
.........................................................
----------------------------------------------------------------------
Ran 57 tests in 1.024s

OK

Yay 🎉

Here is the Dockerfile I used:

FROM python:3.5
# Install Python dependencies first so this layer caches across builds
COPY requirements-dev.txt /requirements-dev.txt
RUN python -m pip install -r requirements-dev.txt
COPY ./ /code
WORKDIR /code
# Postgres packages, as suggested above
RUN apt-get update && apt-get install -y postgresql postgresql-contrib
RUN python manage.py migrate
# Database seeding and asset steps, skipped for now:
# RUN python manage.py createsuperuser
# RUN python manage.py loaddatasets
# RUN python manage.py loadsuppliers
# RUN python manage.py ceapdatasets
# RUN python manage.py assets build
# RUN python manage.py collectstatic