simonw / db-to-sqlite

CLI tool for exporting tables or queries from any SQL database to a SQLite file
Apache License 2.0
370 stars 29 forks source link
datasette datasette-io datasette-tool sqlalchemy sqlite

db-to-sqlite

PyPI Changelog Tests License

CLI tool for exporting tables or queries from any SQL database to a SQLite file.

Installation

Install from PyPI like so:

pip install db-to-sqlite

If you want to use it with MySQL, you can install the extra dependency like this:

pip install 'db-to-sqlite[mysql]'

Installing the mysqlclient library on OS X can be tricky - I've found this recipe to work (run that before installing db-to-sqlite).

For PostgreSQL, use this:

pip install 'db-to-sqlite[postgresql]'

Usage

Usage: db-to-sqlite [OPTIONS] CONNECTION PATH

  Load data from any database into SQLite.

  PATH is a path to the SQLite file to create, e.c. /tmp/my_database.db

  CONNECTION is a SQLAlchemy connection string, for example:

      postgresql://localhost/my_database
      postgresql://username:passwd@localhost/my_database

      mysql://root@localhost/my_database
      mysql://username:passwd@localhost/my_database

  More: https://docs.sqlalchemy.org/en/13/core/engines.html#database-urls

Options:
  --version                     Show the version and exit.
  --all                         Detect and copy all tables
  --table TEXT                  Specific tables to copy
  --skip TEXT                   When using --all skip these tables
  --redact TEXT...              (table, column) pairs to redact with ***
  --sql TEXT                    Optional SQL query to run
  --output TEXT                 Table in which to save --sql query results
  --pk TEXT                     Optional column to use as a primary key
  --index-fks / --no-index-fks  Should foreign keys have indexes? Default on
  -p, --progress                Show progress bar
  --postgres-schema TEXT        PostgreSQL schema to use
  --help                        Show this message and exit.

For example, to save the content of the blog_entry table from a PostgreSQL database to a local file called blog.db you could do this:

db-to-sqlite "postgresql://localhost/myblog" blog.db \
    --table=blog_entry

You can specify --table more than once.

You can also save the data from all of your tables, effectively creating a SQLite copy of your entire database. Any foreign key relationships will be detected and added to the SQLite database. For example:

db-to-sqlite "postgresql://localhost/myblog" blog.db \
    --all

When running --all you can specify tables to skip using --skip:

db-to-sqlite "postgresql://localhost/myblog" blog.db \
    --all \
    --skip=django_migrations

If you want to save the results of a custom SQL query, do this:

db-to-sqlite "postgresql://localhost/myblog" output.db \
    --output=query_results \
    --sql="select id, title, created from blog_entry" \
    --pk=id

The --output option specifies the table that should contain the results of the query.

Using db-to-sqlite with PostgreSQL schemas

If the tables you want to copy from your PostgreSQL database aren't in the default schema, you can specify an alternate one with the --postgres-schema option:

db-to-sqlite "postgresql://localhost/myblog" blog.db \
    --all \
    --postgres-schema my_schema

Using db-to-sqlite with MS SQL

The best way to get the connection string needed for the MS SQL connections below is to use urllib from the Standard Library as below

params = urllib.parse.quote_plus(
    "DRIVER={SQL Server Native Client 11.0};"
    "SERVER=localhost;"
    "DATABASE=my_database;"
    "Trusted_Connection=yes;"
)

The above will resolve to

DRIVER%3D%7BSQL+Server+Native+Client+11.0%7D%3B+SERVER%3Dlocalhost%3B+DATABASE%3Dmy_database%3B+Trusted_Connection%3Dyes

You can then use the string above in the odbc_connect below

mssql+pyodbc:///?odbc_connect=DRIVER%3D%7BSQL+Server+Native+Client+11.0%7D%3B+SERVER%3Dlocalhost%3B+DATABASE%3Dmy_database%3B+Trusted_Connection%3Dyes
mssql+pyodbc:///?odbc_connect=DRIVER%3D%7BSQL+Server+Native+Client+11.0%7D%3B+SERVER%3Dlocalhost%3B+DATABASE%3Dmy_database%3B+UID%3Dusername%3B+PWD%3Dpasswd

Using db-to-sqlite with Heroku Postgres

If you run an application on Heroku using their Postgres database product, you can use the heroku config command to access a compatible connection string:

$ heroku config --app myappname | grep HEROKU_POSTG
HEROKU_POSTGRESQL_OLIVE_URL: postgres://username:password@ec2-xxx-xxx-xxx-x.compute-1.amazonaws.com:5432/dbname

You can pass this to db-to-sqlite to create a local SQLite database with the data from your Heroku instance.

You can even do this using a bash one-liner:

$ db-to-sqlite $(heroku config --app myappname | grep HEROKU_POSTG | cut -d: -f 2-) \
    /tmp/heroku.db --all -p
1/23: django_migrations
...
17/23: blog_blogmark
[####################################]  100%
...

Related projects

Development

To set up this tool locally, first checkout the code. Then create a new virtual environment:

cd db-to-sqlite
python3 -m venv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest

This will skip tests against MySQL or PostgreSQL if you do not have their additional dependencies installed.

You can install those extra dependencies like so:

pip install -e '.[test_mysql,test_postgresql]'

You can alternative use pip install psycopg2-binary if you cannot install the psycopg2 dependency used by the test_postgresql extra.

See Running a MySQL server using Homebrew for tips on running the tests against MySQL on macOS, including how to install the mysqlclient dependency.

The PostgreSQL and MySQL tests default to expecting to run against servers on localhost. You can use environment variables to point them at different test database servers:

The database you indicate in the environment variable - test_db_to_sqlite by default - will be deleted and recreated on every test run.