yakshaveinc / linux

software engineering for Linux projects
The Unlicense
1 stars 3 forks source link

Generator for database schema pictures #29

Open abitrolly opened 5 years ago

abitrolly commented 5 years ago

Nothing helps better to understand existing web app than DB schema, and the best schema is graphical and up to date. Up to date schema is automatically updated during CI builds. There a few ways to generate graphical database schemas automatically:

  1. SQLAlchemy https://github.com/sqlalchemy/sqlalchemy/wiki/SchemaDisplay
  2. SchemaSpy https://github.com/schemaspy/schemaspy/
  3. SchemaCrawler https://www.schemacrawler.com/

List of projects that could benefit from having an update picture of DB schema in their database.


List of projects where it is done.

abitrolly commented 5 years ago

SQLAchemy + SQLite

Find all databases on local machine.

$ time (find / -type f -print0 2> /dev/null | xargs -0 -n 1 -P 4 file | grep  "SQLite 3.x database" | cut -d ':' -f 1 > sqlites.find.txt)
5199.94s user 2388.60s system 252% cpu 50:10.46 total
$ wc -l sqlites.find.txt
936 sqlites.find.txt
abitrolly commented 5 years ago
abitrolly commented 4 years ago

SchemaSpy + PostgreSQL

It would be unwise to run external tool on production database. It is better to operate from DB backup dump. SchemaSpy can not operate directly on database dump (https://github.com/schemaspy/schemaspy/issues/584) and therefore a temporary PostreSQL container is needed.

  1. Prepare data

Use pg_dump to get the data from existing DB. If you got binary backup (like from Heroku), convert it with pg_restore to .sql file. The file then can be copied into /docker-entrypoint-initdb.d to create DB when container is created.

pg_restore --no-owner mydb.dump > mydb.sql

Make sure pg_restore is the same major version as pg_dump or you may get problems like ERROR: schema "public" already exists.

--no-owner is necessary to avoid problems like ERROR: role "user123" does not exist

  1. Load data into PostgreSQL

DB in container is initialized using initialization scripts. We need to run PostgreSQL container in a pod to make SchemaSpy and PostgreSQL containers to see each other.

podman pod create --name pg
# run postgres container with init db script
podman run --pod pg -v "$(pwd)"/mydb.sql:/docker-entrypoint-initdb.d/mydb.sql:Z postgres

This creates database named postgres by default.

Wait until DB in pod is started.

podman run --pod pg postgres pg_isready --timeout=30 --host localhost --port 5432

List databases in the pod.

podman run --pod pg postgres psql -h localhost --username postgres --list
  1. Run SchemaSpy

To get DB report in ./_html dir.

# run SchemaSpy in container named `spied` and retrieve report
podman run --pod pg --name spied schemaspy/schemaspy:snapshot -host localhost \
    -port 5432 -t pgsql11 -u postgres -db postgres
podman cp spied:/output ./_html
podman pod rm pg -f
abitrolly commented 2 years ago

Released https://pypi.org/project/sqlite2png GitHub https://github.com/abitrolly/sqlite2png

pip install sqlite2png
python -m sqlite2png path/to/sqlite.db