TaruDesigns / turinsights_backend

TurInsights

A CRUD project that pulls data from the UiPath Orchestrator API and saves it in a local DB (webhooks planned).

This project depends on services as part of the full stack. This repo ONLY includes the backend+celeryworker code

Backend Requirements

Backend local development

docker-compose build backend

Frontend, built with Docker, with routes handled based on the path: http://localhost

Backend, JSON based web API based on OpenAPI: http://localhost/api/

Automatic interactive documentation with Swagger UI (from the OpenAPI backend): http://localhost/docs

Alternative automatic documentation with ReDoc (from the OpenAPI backend): http://localhost/redoc

PGAdmin, PostgreSQL web administration: http://localhost:5050

Flower, administration of Celery tasks: http://localhost:5555

Traefik UI, to see how the routes are being handled by the proxy: http://localhost:8090

Note: The first time you start your stack, it might take a minute for it to be ready, while the backend waits for the database to be ready and configures everything. You can check the logs to monitor it.

If your Docker is not running in localhost (so the URLs above don't work), check the sections below on Development with Docker Toolbox and Development with a custom IP.

Backend local development, additional details

General workflow

By default, the dependencies are managed with Poetry, so install it first.

From ./app/ you can install all the dependencies with:

$ poetry install

Then you can start a shell session with the new environment with:

$ poetry shell

Next, open your editor at ./app/ (instead of the project root: ./), so that you see an ./app/ directory with your code inside. That way, your editor will be able to find all the imports, etc. Make sure your editor uses the environment you just created with Poetry.

Modify or add SQLAlchemy models in ./app/app/models/, Pydantic schemas in ./app/app/schemas/, API endpoints in ./app/app/api/, CRUD (Create, Read, Update, Delete) utils in ./app/app/crud/. The easiest might be to copy the ones for Items (models, endpoints, and CRUD utils) and update them to your needs.

Add and modify tasks to the Celery worker in ./app/app/worker.py.

If you need to install any additional package to the worker, add it to the file ./app/celeryworker.dockerfile.

Docker Compose Override

During development, you can change Docker Compose settings that will only affect the local development environment, in the file docker-compose.override.yml.

The changes to that file only affect the local development environment, not the production environment. So, you can add "temporary" changes that help the development workflow.

For example, the directory with the backend code is mounted as a Docker "host volume", mapping the code you change live to the directory inside the container. That allows you to test your changes right away, without having to build the Docker image again. This should only be done during development; for production, you should build the Docker image with a recent version of the backend code. But during development, it allows you to iterate very fast. Keep in mind that if you save a Python file with a syntax error, the backend process will break and exit, and the container will stop. After that, you can restart the container by fixing the error and running again:

$ docker-compose up -d

There is also a commented-out command override; you can uncomment it and comment out the default one. It makes the backend container run a process that does "nothing", but keeps the container alive. That allows you to get inside your running container and execute commands there, for example a Python interpreter to test installed dependencies, the development server that reloads when it detects changes, or a Jupyter Notebook session.

To get inside the container with a bash session you can start the stack with:

$ docker-compose up -d

and then exec inside the running container:

$ docker-compose exec backend bash

You should see an output like:

root@7f2607af31c3:/app#

that means that you are in a bash session inside your container, as a root user, under the /app directory.

Backend tests

To test the backend run:

$ DOMAIN=backend sh ./scripts/test.sh

The file ./scripts/test.sh has the commands to generate a testing docker-stack.yml file, start the stack and test it.

The tests run with Pytest. Modify and add tests in ./app/app/tests/.

If you use GitLab CI the tests will run automatically.

Local tests

Start the stack with this command:

DOMAIN=backend sh ./scripts/test-local.sh

The ./app directory is mounted as a "host volume" inside the docker container (set in the file docker-compose.dev.volumes.yml). You can rerun the test on live code:

docker-compose exec backend /app/tests-start.sh

Test running stack

If your stack is already up and you just want to run the tests, you can use:

docker-compose exec backend /app/tests-start.sh

That /app/tests-start.sh script just calls pytest after making sure that the rest of the stack is running. If you need to pass extra arguments to pytest, you can pass them to that command and they will be forwarded.

For example, to stop on first error:

docker-compose exec backend bash /app/tests-start.sh -x

Test Coverage

Because the test scripts forward arguments to pytest, you can enable test coverage HTML report generation by passing --cov-report=html.

To run the local tests with coverage HTML reports:

DOMAIN=backend sh ./scripts/test-local.sh --cov-report=html

To run the tests in a running stack with coverage HTML reports:

docker-compose exec backend bash /app/tests-start.sh --cov-report=html
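If you prefer to drive the tests from a script, the argument forwarding described above can be sketched in Python. This is a minimal illustration and not part of this repository: the helper name tests_command is hypothetical, and it only builds the same docker-compose exec command line shown above with any extra pytest arguments appended.

```python
import shlex
from typing import List


def tests_command(*pytest_args: str) -> List[str]:
    """Build the command that runs the tests in an already running stack.

    Extra arguments are appended unchanged, mirroring how
    /app/tests-start.sh forwards them to pytest.
    """
    base = ["docker-compose", "exec", "backend", "bash", "/app/tests-start.sh"]
    return base + list(pytest_args)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Docker.
    print(shlex.join(tests_command("-x", "--cov-report=html")))
```

To actually run the tests, you could pass the returned list to subprocess.run(..., check=True) from the project root.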

Live development with Python Jupyter Notebooks

If you know about Python Jupyter Notebooks, you can take advantage of them during local development.

The docker-compose.override.yml file sends a variable env with a value dev to the build process of the Docker image (during local development) and the Dockerfile has steps to then install and configure Jupyter inside your Docker container.

So, you can enter into the running Docker container:

docker-compose exec backend bash

And use the environment variable $JUPYTER to run a Jupyter Notebook with everything configured to listen on the public port (so that you can use it from your browser).

It will output something like:

root@73e0ec1f1ae6:/app# $JUPYTER
[I 12:02:09.975 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[I 12:02:10.317 NotebookApp] Serving notebooks from local directory: /app
[I 12:02:10.317 NotebookApp] The Jupyter Notebook is running at:
[I 12:02:10.317 NotebookApp] http://(73e0ec1f1ae6 or 127.0.0.1):8888/?token=f20939a41524d021fbfc62b31be8ea4dd9232913476f4397
[I 12:02:10.317 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 12:02:10.317 NotebookApp] No web browser found: could not locate runnable browser.
[C 12:02:10.317 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://(73e0ec1f1ae6 or 127.0.0.1):8888/?token=f20939a41524d021fbfc62b31be8ea4dd9232913476f4397

You can copy that URL and modify the "host" to be localhost or the domain you are using for development (e.g. local.dockertoolbox.tiangolo.com). In the case above, it would be, e.g.:

http://localhost:8888/?token=f20939a41524d021fbfc62b31be8ea4dd9232913476f4397

and then open it in your browser.
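If you do this often, the host replacement can be scripted. The following is a small sketch under the assumption that Jupyter prints the host as "(<container id> or 127.0.0.1)", as in the log above; the function name browser_url is hypothetical.

```python
import re

# Matches the host placeholder Jupyter prints, e.g. "(73e0ec1f1ae6 or 127.0.0.1)".
_HOST_PLACEHOLDER = re.compile(r"\([^()]+ or 127\.0\.0\.1\)")


def browser_url(jupyter_url: str, host: str = "localhost") -> str:
    """Replace Jupyter's "(container or 127.0.0.1)" host with a usable one."""
    # A function is used as the replacement so `host` is inserted literally.
    rewritten, count = _HOST_PLACEHOLDER.subn(lambda _match: host, jupyter_url, count=1)
    if count == 0:
        raise ValueError("no '(<container> or 127.0.0.1)' host found in URL")
    return rewritten


if __name__ == "__main__":
    logged = "http://(73e0ec1f1ae6 or 127.0.0.1):8888/?token=f20939a41524d021fbfc62b31be8ea4dd9232913476f4397"
    print(browser_url(logged))
```

Pass host="local.dockertoolbox.tiangolo.com" (or your own development domain) if your Docker is not running in localhost.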

You will have a full Jupyter Notebook running inside your container that has direct access to your database by the container name (db), etc. So, you can just run sections of your backend code directly, for example with VS Code Python Jupyter Interactive Window or Hydrogen.

Migrations

During local development your app directory is mounted as a volume inside the container, so you can also run the migrations with alembic commands inside the container and the migration code will be in your app directory (instead of being only inside the container). That way you can add it to your git repository.

Make sure you create a "revision" of your models and that you "upgrade" your database with that revision every time you change them, as this is what will update the tables in your database. Otherwise, your application will have errors.

$ docker-compose exec backend bash
$ alembic revision --autogenerate -m "Add column last_name to User model"
$ alembic upgrade head

If you don't want to use migrations at all, uncomment the line in the file at ./app/app/db/init_db.py with:

Base.metadata.create_all(bind=engine)

and comment the line in the file prestart.sh that contains:

$ alembic upgrade head

If you don't want to start with the default models and want to remove them / modify them, from the beginning, without having any previous revision, you can remove the revision files (.py Python files) under ./app/alembic/versions/. And then create a first migration as described above.

Development with Docker Toolbox

If you are using Docker Toolbox in Windows or macOS instead of Docker for Windows or Docker for Mac, Docker will be running in a VirtualBox Virtual Machine, and it will have a local IP different than 127.0.0.1, which is the IP address for localhost in your machine.

The address of your Docker Toolbox virtual machine would probably be 192.168.99.100 (that is the default).

As this is a common case, the domain local.dockertoolbox.tiangolo.com points to that (private) IP, just to help with development (actually dockertoolbox.tiangolo.com and all its subdomains point to that IP). That way, you can start the stack in Docker Toolbox, and use that domain for development. You will be able to open that URL in Chrome and it will communicate with your local Docker Toolbox directly as if it was a cloud server, including CORS (Cross Origin Resource Sharing).

If you used the default CORS enabled domains while generating the project, local.dockertoolbox.tiangolo.com was configured to be allowed. If you didn't, you will need to add it to the list in the variable BACKEND_CORS_ORIGINS in the .env file.
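As an illustration of adding a domain to that list, here is a minimal Python sketch. It rests on an assumption this document does not confirm: that the value of BACKEND_CORS_ORIGINS is a JSON array of origin strings. Check your own .env file before relying on it. The function name add_cors_origin is hypothetical.

```python
import json
from typing import List


def add_cors_origin(current_value: str, origin: str) -> str:
    """Return the BACKEND_CORS_ORIGINS value with `origin` appended once.

    ASSUMPTION: the value is a JSON array of origin strings.
    """
    origins: List[str] = json.loads(current_value)
    if origin not in origins:
        origins.append(origin)
    return json.dumps(origins)


if __name__ == "__main__":
    print(add_cors_origin('["http://localhost"]', "http://local.dockertoolbox.tiangolo.com"))
```

The function is idempotent, so running it again with the same origin leaves the list unchanged.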

To configure it in your stack, follow the section Change the development "domain" below, using the domain local.dockertoolbox.tiangolo.com.

After performing those steps you should be able to open: http://local.dockertoolbox.tiangolo.com and it will be served by your stack in your Docker Toolbox virtual machine.

Check all the corresponding available URLs in the section at the end.

Development in localhost with a custom domain

You might want to use something different than localhost as the domain. For example, if you are having problems with cookies that need a subdomain, and Chrome is not allowing you to use localhost.

In that case, you have two options: you could modify your system hosts file following the instructions below in Development with a custom IP, or you can just use localhost.tiangolo.com, which is set up to point to localhost (to the IP 127.0.0.1), as are all its subdomains. And as it is an actual domain, the browsers will store the cookies you set during development, etc.

If you used the default CORS enabled domains while generating the project, localhost.tiangolo.com was configured to be allowed. If you didn't, you will need to add it to the list in the variable BACKEND_CORS_ORIGINS in the .env file.

To configure it in your stack, follow the section Change the development "domain" below, using the domain localhost.tiangolo.com.

After performing those steps you should be able to open: http://localhost.tiangolo.com and it will be served by your stack in localhost.

Check all the corresponding available URLs in the section at the end.

Development with a custom IP

If you are running Docker in an IP address different than 127.0.0.1 (localhost) and 192.168.99.100 (the default of Docker Toolbox), you will need to perform some additional steps. That will be the case if you are running a custom Virtual Machine, a secondary Docker Toolbox or your Docker is located in a different machine in your network.

In that case, you will need to use a fake local domain (dev.uipathmanager.com) and make your computer think that the domain is served by the custom IP (e.g. 192.168.99.150).

If you used the default CORS enabled domains, dev.uipathmanager.com was configured to be allowed. If you want a custom one, you need to add it to the list in the variable BACKEND_CORS_ORIGINS in the .env file.

The new line in your system hosts file might look like:

192.168.99.150    dev.uipathmanager.com

That will make your computer think that the fake local domain is served by that custom IP. When you open that URL in your browser, it will talk directly to your locally running server when it is asked to go to dev.uipathmanager.com, and think that it is a remote server while it is actually running in your computer.
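The hosts file edit can be sketched as a small Python helper. This is only an illustration: the function name with_hosts_entry is hypothetical, it works on the text of the file rather than the file itself, and writing the result back to your real hosts file would need administrative privileges.

```python
def with_hosts_entry(hosts_text: str, ip: str, domain: str) -> str:
    """Return hosts-file text that maps `domain` to `ip`, adding a line if needed."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and fields[0] == ip and domain in fields[1:]:
            return hosts_text  # mapping already present, nothing to add
    # Make sure the new entry starts on its own line.
    separator = "" if hosts_text.endswith("\n") or not hosts_text else "\n"
    return f"{hosts_text}{separator}{ip}    {domain}\n"


if __name__ == "__main__":
    current = "127.0.0.1    localhost\n"
    print(with_hosts_entry(current, "192.168.99.150", "dev.uipathmanager.com"), end="")
```

Because an existing mapping is detected first, calling it twice does not duplicate the line.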

To configure it in your stack, follow the section Change the development "domain" below, using the domain dev.uipathmanager.com.

After performing those steps you should be able to open: http://dev.uipathmanager.com and it will be served by your stack in localhost.

Check all the corresponding available URLs in the section at the end.

Change the development "domain"

If you need to use your local stack with a different domain than localhost, you need to make sure the domain you use points to the IP where your stack is set up. See the different ways to achieve that in the sections above (i.e. using Docker Toolbox with local.dockertoolbox.tiangolo.com, using localhost.tiangolo.com or using dev.uipathmanager.com).

To simplify your Docker Compose setup, for example so that the API docs (Swagger UI) know where your API is, you should let it know you are using that domain for development. You will need to edit 1 line in each of 2 files.

In the first file, change the line:

DOMAIN=localhost

to:

DOMAIN=localhost.tiangolo.com

That variable will be used by the Docker Compose files.

In the second file, change the line:

VUE_APP_DOMAIN_DEV=localhost

to:

VUE_APP_DOMAIN_DEV=localhost.tiangolo.com

That variable will make your frontend communicate with that domain when interacting with your backend API, when the other variable VUE_APP_ENV is set to development.
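If you switch domains often, the one-line change can be scripted. The sketch below is an assumption-laden illustration rather than something shipped with this project: the function name set_env_var is hypothetical, and it assumes plain KEY=VALUE lines with no quoting or export prefixes.

```python
def set_env_var(env_text: str, key: str, value: str) -> str:
    """Return .env-style text with the `key=` line set to `value`.

    Lines are assumed to be plain KEY=VALUE; if the key is absent,
    a new line is appended at the end.
    """
    lines = env_text.splitlines()
    for index, line in enumerate(lines):
        if line.startswith(f"{key}="):
            lines[index] = f"{key}={value}"
            break
    else:
        # The loop finished without finding the key.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(set_env_var("DOMAIN=localhost\n", "DOMAIN", "localhost.tiangolo.com"), end="")
```

The same helper works for both variables, DOMAIN and VUE_APP_DOMAIN_DEV, as long as you apply it to the text of the right file.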

After changing the two lines, you can re-start your stack with:

docker-compose up -d

and check all the corresponding available URLs in the section at the end.

Frontend development

See the frontend README for instructions.

Removing the frontend

If you are developing an API-only app and want to remove the frontend, you can do it easily:

Done, you have a frontend-less (api-only) app. 🔥 🚀


If you want, you can also remove the FRONTEND environment variables from:

But it would be only to clean them up, leaving them won't really have any effect either way.

Deployment

You can deploy the stack to a Docker Swarm mode cluster with a main Traefik proxy, set up using the ideas from DockerSwarm.rocks, to get automatic HTTPS certificates, etc.

And you can use CI (continuous integration) systems to do it automatically.

But you have to configure a couple of things first.

Traefik network

This stack expects the public Traefik network to be named traefik-public, just as in the tutorials in DockerSwarm.rocks.

If you need to use a different Traefik public network name, update it in the docker-compose.yml files, in the section:

networks:
  traefik-public:
    external: true

Change traefik-public to the name of the used Traefik network. And then update it in the file .env:

TRAEFIK_PUBLIC_NETWORK=traefik-public

Persisting Docker named volumes

You need to make sure that each service (Docker container) that uses a volume is always deployed to the same Docker "node" in the cluster, that way it will preserve the data. Otherwise, it could be deployed to a different node each time, and each time the volume would be created in that new node before starting the service. As a result, it would look like your service was starting from scratch every time, losing all the previous data.

That's especially important for a service running a database. But the same problem would apply if you were saving files in your main backend service (for example, if those files were uploaded by your users, or if they were created by your system).

To solve that, you can put constraints in the services that use one or more data volumes (like databases) to make them be deployed to a Docker node with a specific label. And of course, you need to have that label assigned to one (only one) of your nodes.

Adding services with volumes

For each service that uses a volume (databases, services with uploaded files, etc) you should have a label constraint in your docker-compose.yml file.

To make sure that your labels are unique per volume per stack (for example, that they are not the same for prod and stag) you should prefix them with the name of your stack and then use the same name of the volume.

Then you need to have those constraints in your docker-compose.yml file for the services that need to be fixed with each volume.

To be able to use different environments, like prod and stag, you should pass the name of the stack as an environment variable. Like:

STACK_NAME=stag-uipathmanager-com sh ./scripts/deploy.sh

To use and expand that environment variable inside the docker-compose.yml files you can add the constraints to the services like:

version: '3'
services:
  db:
    volumes:
      - 'app-db-data:/var/lib/postgresql/data/pgdata'
    deploy:
      placement:
        constraints:
          - node.labels.${STACK_NAME?Variable not set}.app-db-data == true

note the ${STACK_NAME?Variable not set}. In the script ./scripts/deploy.sh, the docker-compose.yml would be converted, and saved to a file docker-stack.yml containing:

version: '3'
services:
  db:
    volumes:
      - 'app-db-data:/var/lib/postgresql/data/pgdata'
    deploy:
      placement:
        constraints:
          - node.labels.uipathmanager-com.app-db-data == true

Note: The ${STACK_NAME?Variable not set} means "use the environment variable STACK_NAME, but if it is not set, show an error Variable not set".
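To make that behavior concrete, here is a minimal Python sketch of the same semantics. It is not how Docker Compose implements interpolation, only an illustration of the rule in the note above; the function name expand_required is hypothetical.

```python
import re
from typing import Mapping

# Matches ${NAME?message}; NAME is a shell-style variable name.
_REQUIRED_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\?([^}]*)\}")


def expand_required(text: str, env: Mapping[str, str]) -> str:
    """Expand every ${NAME?message}, failing with `message` if NAME is not set."""

    def replace(match: "re.Match[str]") -> str:
        name, message = match.group(1), match.group(2)
        if name not in env:
            raise KeyError(f"{name}: {message}")
        return env[name]

    return _REQUIRED_VAR.sub(replace, text)


if __name__ == "__main__":
    constraint = "node.labels.${STACK_NAME?Variable not set}.app-db-data == true"
    print(expand_required(constraint, {"STACK_NAME": "uipathmanager-com"}))
```

With STACK_NAME set to uipathmanager-com this produces the constraint shown in the converted file above; with an empty mapping it raises an error carrying the message "Variable not set".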

If you add more volumes to your stack, you need to make sure you add the corresponding constraints to the services that use that named volume.

Then you have to create those labels in some nodes in your Docker Swarm mode cluster. You can use docker-auto-labels to do it automatically.

docker-auto-labels

You can use docker-auto-labels to automatically read the placement constraint labels in your Docker stack (Docker Compose file) and assign them to a random Docker node in your Swarm mode cluster if those labels don't exist yet.

To do that, you can install docker-auto-labels:

pip install docker-auto-labels

And then run it passing your docker-stack.yml file as a parameter:

docker-auto-labels docker-stack.yml

You can run that command every time you deploy, right before deploying, as it doesn't modify anything if the required labels already exist.

(Optionally) adding labels manually

If you don't want to use docker-auto-labels or for any reason you want to manually assign the constraint labels to specific nodes in your Docker Swarm mode cluster, you can do the following:

$ docker node ls

// you would see an output like:

ID                            HOSTNAME               STATUS              AVAILABILITY        MANAGER STATUS
nfa3d4df2df34as2fd34230rm *   dog.example.com        Ready               Active              Reachable
2c2sd2342asdfasd42342304e     cat.example.com        Ready               Active              Leader
c4sdf2342asdfasd4234234ii     snake.example.com      Ready               Active              Reachable

Then choose a node from the list. For example, dog.example.com.

docker node update --label-add uipathmanager-com.app-db-data=true dog.example.com
docker node update --label-add stag-uipathmanager-com.app-db-data=true cat.example.com
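If you have several constraints, the commands above can be generated instead of typed. The following Python sketch is an illustration only and is not how docker-auto-labels works internally; the function name label_commands is hypothetical, and it assumes constraints written in the "node.labels.<label> == <value>" form used in this document.

```python
import re
from typing import List

# Matches constraints such as "node.labels.uipathmanager-com.app-db-data == true".
_LABEL_CONSTRAINT = re.compile(r"^node\.labels\.(\S+)\s*==\s*(\S+)$")


def label_commands(constraints: List[str], node: str) -> List[str]:
    """Build one `docker node update --label-add` command per label constraint."""
    commands = []
    for constraint in constraints:
        match = _LABEL_CONSTRAINT.match(constraint.strip())
        if match is None:
            continue  # not a node-label constraint (e.g. node.role == manager)
        label, value = match.groups()
        commands.append(f"docker node update --label-add {label}={value} {node}")
    return commands


if __name__ == "__main__":
    for command in label_commands(
        ["node.labels.uipathmanager-com.app-db-data == true"], "dog.example.com"
    ):
        print(command)
```

The sketch only prints the commands, so you can review them before running them on a manager node.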

Deploy to a Docker Swarm mode cluster

There are 3 steps:

  1. Build your app images
  2. Optionally, push your custom images to a Docker Registry
  3. Deploy your stack

Here are the steps in detail:

  1. Build your app images

TAG=prod FRONTEND_ENV=production bash ./scripts/build.sh

  2. Optionally, push your images to a Docker Registry

Note: if the deployment Docker Swarm mode "cluster" has more than one server, you will have to push the images to a registry or build the images in each server, so that when each of the servers in your cluster tries to start the containers it can get the Docker images for them, either by pulling them from a Docker Registry or because it has them already built locally.

If you are using a registry and pushing your images, you can skip the previous script and use this one instead, in a single shot:

TAG=prod FRONTEND_ENV=production bash ./scripts/build-push.sh

  3. Deploy your stack

DOMAIN=uipathmanager.com \
TRAEFIK_TAG=uipathmanager.com \
STACK_NAME=uipathmanager-com \
TAG=prod \
bash ./scripts/deploy.sh

If you change your mind and, for example, want to deploy everything to a different domain, you only have to change the DOMAIN environment variable in the previous commands. If you wanted to add a different version / environment of your stack, like "preproduction", you would only have to set TAG=preproduction in your command and update these other environment variables accordingly. And it would all work, that way you could have different environments and deployments of the same app in the same cluster.

Deployment Technical Details

Building and pushing is done with the docker-compose.yml file, using the docker-compose command. The file docker-compose.yml uses the file .env with default environment variables. And the scripts set some additional environment variables as well.

The deployment requires using docker stack instead of docker-compose, and docker stack can't read environment variables or .env files. Because of that, the deploy.sh script generates a file docker-stack.yml with the configurations from docker-compose.yml, injecting the environment variables into it, and then uses that file to deploy the stack.

You can do the process by hand based on those same scripts if you wanted. The general structure is like this:

# Use the environment variables passed to this script, such as TAG and FRONTEND_ENV,
# and re-create them as environment variables for the docker-compose command.
# FRONTEND_ENV falls back to "production" if nothing else was passed.
# "-f docker-compose.yml" sets the file explicitly, which avoids the default of
# also using docker-compose.override.yml.
# The "config" sub-command prints the resolved contents of that file, and ">"
# writes them to docker-stack.yml.
# (Comments are kept above the command: a comment between the continuation
# lines would cut the command short.)
TAG=${TAG?Variable not set} \
FRONTEND_ENV=${FRONTEND_ENV-production} \
docker-compose \
-f docker-compose.yml \
config > docker-stack.yml

# The previous only generated a docker-stack.yml file,
# but didn't do anything with it yet

# docker-auto-labels makes sure the labels used for constraints exist in the cluster
docker-auto-labels docker-stack.yml

# Now this command uses that same file to deploy it
docker stack deploy -c docker-stack.yml --with-registry-auth "${STACK_NAME?Variable not set}"

Continuous Integration / Continuous Delivery

If you use GitLab CI, the included .gitlab-ci.yml can automatically deploy it. You may need to update it according to your GitLab configurations.

If you use any other CI / CD provider, you can base your deployment on that .gitlab-ci.yml file, as all the actual script steps are performed in bash scripts that you can easily re-use.

GitLab CI is configured assuming 2 environments following GitLab flow: staging (stag), from the branch master, and production (prod), from the branch production.

If you need to add more environments, for example, you could imagine using a client-approved preprod branch, you can just copy the configurations in .gitlab-ci.yml for stag and rename the corresponding variables. The Docker Compose file and environment variables are configured to support as many environments as you need, so that you only need to modify .gitlab-ci.yml (or whichever CI system configuration you are using).

Docker Compose files and env vars

There is a main docker-compose.yml file with all the configurations that apply to the whole stack; it is used automatically by docker-compose.

And there's also a docker-compose.override.yml with overrides for development, for example to mount the source code as a volume. It is used automatically by docker-compose to apply overrides on top of docker-compose.yml.

These Docker Compose files use the .env file containing configurations to be injected as environment variables in the containers.

They also use some additional configurations taken from environment variables set in the scripts before calling the docker-compose command.

It is all designed to support several "stages", like development, building, testing, and deployment, and to allow deployment to different environments like staging and production (and you can add more environments very easily).

They are designed to have the minimum repetition of code and configurations, so that if you need to change something, you have to change it in as few places as possible. That's why the files use environment variables that get auto-expanded. That way, if for example you want to use a different domain, you can call the docker-compose command with a different DOMAIN environment variable instead of having to change the domain in several places inside the Docker Compose files.

Also, if you want to have another deployment environment, say preprod, you just have to change environment variables, but you can keep using the same Docker Compose files.

The .env file

The .env file is the one that contains all your configurations, generated keys and passwords, etc.

Depending on your workflow, you might want to exclude it from Git, for example if your project is public. In that case, you would have to make sure to set up a way for your CI tools to obtain it while building or deploying your project.

One way to do it could be to add each environment variable to your CI/CD system, and update the docker-compose.yml file to read that specific env var instead of reading the .env file.

URLs

These are the URLs that will be used and generated by the project.

Production URLs

Production URLs, from the branch production.

Frontend: https://uipathmanager.com

Backend: https://uipathmanager.com/api/

Automatic Interactive Docs (Swagger UI): https://uipathmanager.com/docs

Automatic Alternative Docs (ReDoc): https://uipathmanager.com/redoc

PGAdmin: https://pgadmin.uipathmanager.com

Flower: https://flower.uipathmanager.com

Staging URLs

Staging URLs, from the branch master.

Frontend: https://stag.uipathmanager.com

Backend: https://stag.uipathmanager.com/api/

Automatic Interactive Docs (Swagger UI): https://stag.uipathmanager.com/docs

Automatic Alternative Docs (ReDoc): https://stag.uipathmanager.com/redoc

PGAdmin: https://pgadmin.stag.uipathmanager.com

Flower: https://flower.stag.uipathmanager.com

Development URLs

Development URLs, for local development.

Frontend: http://localhost

Backend: http://localhost/api/

Automatic Interactive Docs (Swagger UI): http://localhost/docs

Automatic Alternative Docs (ReDoc): http://localhost/redoc

PGAdmin: http://localhost:5050

Flower: http://localhost:5555

Traefik UI: http://localhost:8090

Development with Docker Toolbox URLs

Development URLs, for local development.

Frontend: http://local.dockertoolbox.tiangolo.com

Backend: http://local.dockertoolbox.tiangolo.com/api/

Automatic Interactive Docs (Swagger UI): http://local.dockertoolbox.tiangolo.com/docs

Automatic Alternative Docs (ReDoc): http://local.dockertoolbox.tiangolo.com/redoc

PGAdmin: http://local.dockertoolbox.tiangolo.com:5050

Flower: http://local.dockertoolbox.tiangolo.com:5555

Traefik UI: http://local.dockertoolbox.tiangolo.com:8090

Development with a custom IP URLs

Development URLs, for local development.

Frontend: http://dev.uipathmanager.com

Backend: http://dev.uipathmanager.com/api/

Automatic Interactive Docs (Swagger UI): http://dev.uipathmanager.com/docs

Automatic Alternative Docs (ReDoc): http://dev.uipathmanager.com/redoc

PGAdmin: http://dev.uipathmanager.com:5050

Flower: http://dev.uipathmanager.com:5555

Traefik UI: http://dev.uipathmanager.com:8090

Development in localhost with a custom domain URLs

Development URLs, for local development.

Frontend: http://localhost.tiangolo.com

Backend: http://localhost.tiangolo.com/api/

Automatic Interactive Docs (Swagger UI): http://localhost.tiangolo.com/docs

Automatic Alternative Docs (ReDoc): http://localhost.tiangolo.com/redoc

PGAdmin: http://localhost.tiangolo.com:5050

Flower: http://localhost.tiangolo.com:5555

Traefik UI: http://localhost.tiangolo.com:8090
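The four development URL lists above differ only in the domain, so they can be derived from it. This is a small Python sketch of that pattern; the function name development_urls is hypothetical, and it uses plain http for every entry, matching the list of local URLs at the top of this README.

```python
from typing import Dict


def development_urls(domain: str = "localhost") -> Dict[str, str]:
    """Return the development URLs for `domain`, following the lists above."""
    base = f"http://{domain}"
    return {
        "Frontend": base,
        "Backend": f"{base}/api/",
        "Swagger UI": f"{base}/docs",
        "ReDoc": f"{base}/redoc",
        "PGAdmin": f"{base}:5050",
        "Flower": f"{base}:5555",
        "Traefik UI": f"{base}:8090",
    }


if __name__ == "__main__":
    for name, url in development_urls("localhost.tiangolo.com").items():
        print(f"{name}: {url}")
```

Call it with local.dockertoolbox.tiangolo.com, dev.uipathmanager.com or localhost.tiangolo.com to get the corresponding list.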

Project generation and updating, or re-generating

This project was generated using https://github.com/whythawk/full-stack-fastapi-postgresql with:

pip install cookiecutter
cookiecutter https://github.com/whythawk/full-stack-fastapi-postgresql

You can check the variables used during generation in the file cookiecutter-config-file.yml.

You can generate the project again with the same configurations used the first time.

That would be useful if, for example, the project generator (tiangolo/full-stack-fastapi-postgresql) was updated and you wanted to integrate or review the changes.

You could generate a new project with the same configurations as this one in a parallel directory. And compare the differences between the two, without having to overwrite your current code but being able to use the same variables used for your current project.

To achieve that, the generated project includes the file cookiecutter-config-file.yml with the current variables used.

You can use that file while generating a new project to reuse all those variables.

For example, run:

$ cookiecutter --config-file ./cookiecutter-config-file.yml --output-dir ../project-copy https://github.com/whythawk/full-stack-fastapi-postgresql

That will use the file cookiecutter-config-file.yml in the current directory (in this project) to generate a new project inside a sibling directory project-copy.