absent1706 / sqlalchemy-mixins

Active Record, Django-like queries, nested eager load and beauty __repr__ for SQLAlchemy
MIT License
756 stars 67 forks source link
activerecord django-like eager-load eager-loading orm orm-extension sqlalchemy

example workflow PyPI version

SQLAlchemy mixins

Note: As of v1.3, only python 3.5+ is supported.

A pack of framework-agnostic, easy-to-integrate and well tested mixins for SQLAlchemy ORM.

Heavily inspired by Django ORM and Eloquent ORM

Why it's cool:

Russian readers, see related article on habrahabr.ru

Table of Contents

  1. Installation
  2. Quick Start
    1. Framework-agnostic
    2. Usage with Flask-SQLAlchemy
  3. Features
    1. Active Record
      1. CRUD
      2. Querying
    2. Eager Load
    3. Django-like queries
      1. Filter and sort by relations
      2. Automatic eager load relations
    4. All-in-one: smart_query
    5. Beauty __repr__
    6. Serialize to dict
    7. Timestamps
  4. Internal architecture notes
  5. Comparison with existing solutions
  6. Changelog

Installation

Use pip

pip install sqlalchemy_mixins

Run tests

python -m unittest discover sqlalchemy_mixins/

Quick Start

Framework-agnostic

Here's a quick demo of what our mixins can do.

bob = User.create(name='Bob')
post1 = Post.create(body='Post 1', user=bob, rating=3)
post2 = Post.create(body='long-long-long-long-long body', rating=2,
                    user=User.create(name='Bill'),
                    comments=[Comment.create(body='cool!', user=bob)])

# filter using operators like 'in' and 'contains' and relations like 'user'
# will output this beauty: <Post #1 body:'Post1' user:'Bill'>
print(Post.where(rating__in=[2, 3, 4], user___name__like='%Bi%').all())
# joinedload post and user
print(Comment.with_joined(Comment.user, Comment.post).first())
# subqueryload posts
print(User.with_subquery(User.posts).first())
# sort by rating DESC, user name ASC
print(Post.sort('-rating', 'user___name').all())
# created_at, updated_at timestamps added automatically
print("Created Bob at ", bob.created_at)   
# serialize to dict, with relationships
print(bob.to_dict(nested=True).all())

icon See full example

To interactively play with this example from CLI, install iPython and type ipython -i examples\all_features.py

Usage with Flask-SQLAlchemy

import sqlalchemy as sa
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy_mixins import AllFeaturesMixin

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite://'
db = SQLAlchemy(app)

######### Models ######### 
class BaseModel(db.Model, AllFeaturesMixin):
    __abstract__ = True
    pass

class User(BaseModel):
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)

######## Initialize ########
BaseModel.set_session(db.session)

######## Create test entity ########
db.create_all()
user = User.create(name='bob')
print(user)

Autocommit

This library relies on SQLAlchemy's autocommit flag. It needs to be set to True when initializing the session i.e:

session = scoped_session(sessionmaker(bind=engine, autocommit=True))
BaseModel.set_session(session)

or with Flask-SQLAlchemy

db = SQLAlchemy(app, session_options={'autocommit': True})

Features

Main features are

Active Record

provided by ActiveRecordMixin

SQLAlchemy's Data Mapper pattern is cool, but Active Record pattern is easiest and more DRY.

Well, we implemented it on top of Data Mapper! All we need is to just inject session into ORM class while bootstrapping our app:

BaseModel.set_session(session)
# now we have access to BaseOrmModel.session property

CRUD

We all love SQLAlchemy, but doing CRUD is a bit tricky there.

For example, creating an object needs 3 lines of code:

bob = User(name='Bobby', age=1)
session.add(bob)
session.flush()

Well, having access to session from model, we can just write

bob = User.create(name='Bobby', age=1)

that's how it's done in Django ORM and Peewee

update and delete methods are provided as well

bob.update(name='Bob', age=21)
bob.delete()

And, as in Django and Eloquent, we can quickly retrieve object by id

User.find(1) # instead of session.query(User).get(1)

and fail if such id doesn't exist

User.find_or_fail(123987) # will raise sqlalchemy_mixins.ModelNotFoundError

icon See full example and tests

Querying

As in Flask-SQLAlchemy, Peewee and Django ORM, you can quickly query some class

User.query # instead of session.query(User)

Also we can quickly retrieve first or all objects:

User.first() # instead of session.query(User).first()
User.all() # instead of session.query(User).all()

icon See full example and tests

Eager load

provided by EagerLoadMixin

Nested eager load

If you use SQLAlchemy's eager loading, you may find it not very convenient, especially when we want, say, load user, all his posts and comments to every his post in the same query.

Well, now you can easily set what ORM relations you want to eager load

User.with_({
    User.posts: {
        Post.comments: {
            Comment.user: JOINED
        }
    }
}).all()

Subquery load

Sometimes we want to load relations in separate query, i.e. do subqueryload. For example, we load posts on page like this, and for each post we want to have user and all comments (and comment authors).

To speed up query, we load comments in separate query, but, in this separate query, join user

from sqlalchemy_mixins import JOINED, SUBQUERY
Post.with_({
    Post.user: JOINED, # joinedload user
    Post.comments: (SUBQUERY, {  # load comments in separate query
        Comment.user: JOINED  # but, in this separate query, join user
    })
}).all()

Here, posts will be loaded on first query, and comments with users - in second one. See SQLAlchemy docs for explaining relationship loading techniques.

Quick eager load

For simple cases, when you want to just joinedload or subqueryload a few relations, we have easier syntax for you:

Comment.with_joined(Comment.user, Comment.post).first()
User.with_subquery(User.posts).all()

See full example and tests

Filter and sort by relations

provided by SmartQueryMixin

Django-like queries

We implement Django-like field lookups and automatic relation joins.

It means you can filter and sort dynamically by attributes defined in strings!

So, having defined Post model with Post.user relationship to User model, you can write

Post.where(rating__gt=2, user___name__like='%Bi%').all() # post rating > 2 and post user name like ...
Post.sort('-rating', 'user___name').all() # sort by rating DESC, user name ASC

(___ splits relation and attribute, __ splits attribute and operator)

If you need more flexibility, you can use low-level filter_expr method session.query(Post).filter(*Post.filter_expr(rating__gt=2, body='text')), see example.

It's like filter_by in SQLALchemy, but also allows magic operators like rating__gt.

Note: filter_expr method is very low-level and does NOT do magic Django-like joins. Use smart_query for that.

All relations used in filtering/sorting should be explicitly set, not just being a backref

In our example, Post.user relationship should be defined in Post class even if User.posts is defined too.

So, you can't type

class User(BaseModel):
    # ...
    user = sa.orm.relationship('User', backref='posts')

and skip defining Post.user relationship. You must define it anyway:

class Post(BaseModel):
    # ...
    user = sa.orm.relationship('User') # define it anyway

For DRY-ifying your code and incapsulating business logic, you can use SQLAlchemy's hybrid attributes and hybrid_methods. Using them in our filtering/sorting is straightforward (see examples and tests).

See full example and tests

Automatic eager load relations

Well, as SmartQueryMixin does auto-joins for filtering/sorting, there's a sense to tell sqlalchemy that we already joined that relation.

So that relations are automatically set to be joinedload if they were used for filtering/sorting.

So, if we write

comments = Comment.where(post___public=True, post___user___name__like='Bi%').all()

then no additional query will be executed if we will access used relations

comments[0].post
comments[0].post.user

Cool, isn't it? =)

See full example and tests

All-in-one: smart_query

Filter, sort and eager load in one smartest method.

provided by SmartQueryMixin

In real world, we want to filter, sort and also eager load some relations at once. Well, if we use the same, say, User.posts relation in filtering and sorting, it should not be joined twice.

That's why we combined filter, sort and eager load in one smartest method:

Comment.smart_query(
    filters={
        'post___public': True,
        'user__isnull': False
    },
    sort_attrs=['user___name', '-created_at'],
    schema={
        Comment.post: {
            Post.user: JOINED
        }
    }).all()

New in 0.2.3 In real world, you may need to "smartly" apply filters/sort/eagerload to any arbitrary query. And you can do this with standalone smart_query function:

smart_query(any_query, filters=...)

It's especially useful for filtering/sorting/eagerloading relations with lazy='dynamic' for pages like this:

smart_query(user.comments_, filters=...)

See this example

Experimental Additional logic (OR, AND, NOT etc) can be expressed using a nested structure for filters, with sqlalchemy operators (or any callable) as keys:

from sqlalchemy import or_
Comment.smart_query(filters={ or_: {
    'post___public': True, 
    'user__isnull': False
}})

See this example for more details

See full example and tests

Beauty __repr__

provided by ReprMixin

As developers, we need to debug things with convenience. When we play in REPL, we can see this

>>> session.query(Post).all()
[<myapp.models.Post object at 0x04287A50>, <myapp.models.Post object at 0x04287A90>]

Well, using our mixin, we can have more readable output with post IDs:

>>> session.query(Post).all()
[<Post #11>, <Post #12>]

Even more, in Post model, we can define what else (except id) we want to see:

class User(BaseModel):
    __repr_attrs__ = ['name']
    # ...

class Post(BaseModel):
    __repr_attrs__ = ['user', 'body'] # body is just column, user is relationship
    # ...

Now we have

>>> session.query(Post).all()
[<Post #11 user:<User #1 'Bill'> body:'post 11'>,
 <Post #12 user:<User #2 'Bob'> body:'post 12'>]

Long attributes will be cut:

long_post = Post(body='Post 2 long-long body', user=bob)

>>> long_post
<Post #2 body:'Post 2 long-lon...' user:<User #1 'Bob'>>

And you can customize max __repr__ length:

class Post(BaseModel):
    # ...
    __repr_max_length__ = 25
    # ...

>>> long_post
<Post #2 body:'Post 2 long-long body' user:<User #1 'Bob'>>   

See full example and tests

Serialize to dict

provided by SerializeMixin

You can convert your model to dict.

# 1. Without relationships
#
# {'id': 1, 'name': 'Bob'}
print(user.to_dict())

# 2. With relationships
#
# {'id': 1,
# 'name': 'Bob',
# 'posts': [{'body': 'Post 1', 'id': 1, 'user_id': 1},
#           {'body': 'Post 2', 'id': 2, 'user_id': 1}]}
print(user.to_dict(nested=True))

See full example

Timestamps

provided by TimestampsMixin

You can view the created and updated timestamps.

bob = User(name="Bob")
session.add(bob)
session.flush()

print("Created Bob:    ", bob.created_at)
# Created Bob:     2019-03-04 03:53:53.606765

print("Pre-update Bob: ", bob.updated_at)
# Pre-update Bob:  2019-03-04 03:53:53.606769

time.sleep(2)

bob.name = "Robert"
session.commit()

print("Updated Bob:    ", bob.updated_at)
# Updated Bob:     2019-03-04 03:53:58.613044

See full example

Internal architecture notes

Some mixins re-use the same functionality. It lives in SessionMixin (session access) and InspectionMixin (inspecting columns, relations etc.) and other mixins inherit them.

You can use these mixins standalone if you want.

Comparison with existing solutions

There're a lot of extensions for SQLAlchemy, but most of them are not so universal.

Active record

We found several implementations of this pattern.

ActiveAlchemy

Cool, but it forces you to use their own way to instantiate SQLAlchemy while to use ActiveRecordMixin you should just make you model to inherit it.

Flask-ActiveRecord

Cool, but tightly coupled with Flask.

sqlalchemy-activerecord

Framework-agnostic, but lacks of functionality (only save method is provided) and Readme.

Django-like queries

There exists sqlalchemy-django-query package which does similar things and it's really awesome.

But:

Beauty __repr__

sqlalchemy-repr already does this, but there you can't choose which columns to output. It simply prints all columns, which can lead to too big output.

Changelog

v0.2

More clear methods in EagerLoadMixin:

Other changes in EagerLoadMixin:

v0.2.1

Fix in InspectionMixin.columns property.

It didn't return columns inherited from other class. Now it works correct:

class Parent(BaseModel):
    __tablename__ = 'parent'
    id = sa.Column(sa.Integer, primary_key=True)

class Child(Parent):
    some_prop = sa.Column(sa.String)

Child.columns # before it returned ['some_prop']
              # now it returns ['id', 'some_prop'] 

v0.2.2

Fixed bug in ReprMixin: it crashed for objects without ID (newly created ones, not added yet to the session).

v0.2.3

SmartQueryMixin: decoupled smart_query function from ORM classes so now you can use it with any query like

smart_query(any_query, filters=...)

See description (at the end of paragraph) and example

v1.0.1

  1. Added SerializationMixin (thanks, jonatasleon)

  2. Added ne operator (thanks, https://github.com/sophalch), so now you can write something like

Post.where(rating__ne=2).all()

v1.2

This version contains breaking change, reverted in v1.2.1. So:

  • v1.2 was removed from PyPi to avoid confusions
  • for those who already downloaded v1.2, we hardly recommend to switch to 1.2.1.

Just use v1.2.1 instead

By mistake, v1.2 code was released on PyPi as v1.1. It has been deleted from PyPi to avoid confusion. Sorry for any inconvenience guys.

  1. Removed Python 2, Python 3.2 compatibility.

  2. Added Python 3.7, 3.8 compatibility.

  3. Added TimestampsMixin (thanks, jonatasleon).

  4. (Breaking change, fixed in v1.2.1) TimestampsMixin was included it to AllFeaturesMixin which means created_at and updated_at fields were added to all models using AllFeaturesMixin which means you need to write migrations adding these fields.

  5. Added contains operator (thanks, alexbredo).

  6. Added date comparison operators (thanks, proteusvacuum), so now you can write something like

Post.where(created_at__year_ge=2014).all()
Post.where(created_at__month_gt=10).all()
Post.where(created_at__day_le=30).all()

v1.2.1

Reverted breaking change introduced in 1.2:

removed TimestampsMixin from AllFeaturesMixin. This addition in v1.2 forced package users to write and run migration to add created_at and updated_at fields to all tables whose ORM models used AllFeaturesMixin. Now you should add TimestampsMixin separately:

   class BaseModel(Base, AllFeaturesMixin, TimestampsMixin):
       # ...

v1.3

Add support for SQLAlchemy 1.4

v2.0.0

This version contains breaking changes in multiple methods i.e methods that simplify eager loading. The use of strings while eager loading has been removed completely in SQLAlchemy 2.0. To much this behaviour, we have also removed the use of strings when eager loading

  1. Migrate to SQLAlchemy 2.0
  2. All methods in the EagerLoadMixin no longer accept strings. Note This means that you can only pass direct relationships.
  3. The schema parameter of the smart_query method/function no longer accepts string keys.
  4. Dropped Python 3.6
  5. Add Python 3.10 compatibility