pip install dbxs
.DBXS is query organizer. It puts your collection of SQL queries into a
traditional data-access layer, leveraging Python's built-in
typing.Protocol
provide type-safety and convenience.
DBXS aims to provide 5 properties to applications that use it:
DBXS is plain old SQL. Aside from interpolating placeholders based on your chosen driver's paramstyle, it does not parse or interpret the SQL statements that you provide it. It does not generate SQL. What you typed is what gets executed. Use whatever special features your database provides, with no layers in the middle.
It is difficult to accidentally write an SQL injection vulnerability with DBXS.
By requiring that SQL statements be written at import time, ahead of your
application code even executing, attacker-controlled data from your requests is
unlikely to be available. If your code is using DBXS idiomatically (i.e.: not
calling out to .execute(...)
), you can know it's safe.
Although the SQL is “raw”, your results are not. When you declare your queries with DBXS, you describe the shape of their results in terms of dataclasses, giving your application code structured, documented data objects to consume. Query parameters and results are all typed.
As you're the one generating your SQL, you're still responsible for testing that your query actually consumes and produces the types you have claimed it does, but once you've written and tested a query.
When you use DBXS, your queries are all collected into Protocol
classes that
define the interfaces to your data stores. This provides a small, clear set of
locations to inspect for database query issues, both for human readers and for
metaprogramming, rather than smearing database concerns across the entirety of
your application.
Additionally, by providing a simple async context manager for transaction
management, it is far easier to scan your code for transaction boundaries by
looking for async with transaction
than looking for calls to commit
and
rollback
.
The dbxs.testing
module provides support for testing your database interfaces
with minimal set-up. You can write a unit test that uses your database
interfaces in only a few lines of code, like so:
class MyDBXSTest(TestCase):
@immediateTest()
def test_myQuery(self, pool: MemoryPool) -> None:
async with transaction(pool) as c:
access = myAccessor(c)
await access.createFoo("1", "hello")
self.assertEqual((await access.getFoo("1")).name, "hello")
(As DBXS is still an alpha-stage project, this testing support is currently
restricted to SQLite and stdlib unittest
, but support for arbitrary database
drivers and pytest is definitely on the roadmap.)
Using it looks like this:
@dataclass
class Quote:
db: QuoteDB
id: int
contents: str
from dbxs import accessor, many, one, query, statement
class QuoteDB(Protocol):
@query(sql="SELECT id, contents FROM quote WHERE id={id}", load=one(Quote))
async def quoteByID(self, id: int) -> Quote: ...
@query(sql="SELECT id, contents FROM quote", load=many(Quote))
def allQuotes(self) -> AsyncIterable[Quote]: ...
@statement(sql="INSERT INTO quote (contents) VALUES ({text})")
async def addQuote(self, text: str) -> None: ...
from dbxs.dbapi import DBAPIConnection
def sqliteWithSchema() -> DBAPIConnection:
c = connect(":memory:")
c.execute(
"CREATE TABLE quote (contents, id INTEGER PRIMARY KEY AUTOINCREMENT)"
)
c.commit()
return c
from dbxs.adapters.dbapi_twisted import adaptSynchronousDriver
driver = adaptSynchronousDriver(sqliteWithSchema, paramstyle)
quotes = accessor(QuoteDB)
async def main() -> None:
from dbxs.async_dbapi import transaction
async with transaction(driver) as t:
quotedb: QuoteDB = quotes(t)
await quotedb.addQuote("hello, world")
async for quote in quotedb.allQuotes():
matched = (await quotedb.quoteByID(quote.id)) == quote
print(f"quote ({quote.id}) {quote.contents!r} {matched}")
When interfacing with SQL in Python, you have a few choices which predate DBXS.
You can use something low-level, specifically a DB-API 2.0
driver, directly. You call connect(...)
to get a connection, .cursor()
on that connection to get a cursor, and
.execute
on that cursor with some SQL and some parameters to tell the
database to do stuff. The benefits of this approach are clear; it's very
straightforward. If you want to do something, read the database documentation
for the SQL syntax to do the thing you want, then type in that SQL.
However, the downsides are also readily apparent, the two major ones being:
To mitigate these disadvantages, you might use a higher-level tool like the Django ORM or SQLAlchemy. These tools are very powerful, and allow for many things that DBXS does not, such as dynamically composing queries. They provide strong affordances to make the right thing (avoiding SQL injection) the easy thing, and they give you powerful higher-level types as query inputs and results.
However, those introduce new problems:
It's a big increase in complexity. Everyone who contributes to your project
needs to understand both the SQL layer of the database and your ORM or
expression layer. While DBXS comprises a handful of functions and classes
(mostly query
, one
, many
, statement
, transaction
, and accessor
,
with adaptSynchronousDriver
being the additional bit of glue most
applications will need), an ORM might include dozens or even hundreds of
additional functions and data structures.
This problem compounds, because if your database has any feature that you
want to use beyond the somewhat fuzzy boundaries of “standard” SQL, even
something as simple as INSERT…ON CONFLICT DO NOTHING
, you need to learn
about that feature from the database's SQL dialect
documentation, then learn it
again as you learn from the specialized dialect
support
of your chosen library, assuming it wraps the feature you want. Some
features may only be supported as additional libraries you need to depend
on or you may have to wait
for them to be
implemented.
Finally, there is a shortcoming that all these approaches have in common: they all encourage queries to be defined on an ad-hoc basis, in the bodies of methods and functions as the application is running. Especially when using a higher-level library where the SQL is generated rather than manually specified, this can make it very difficult to go from an entry in a database log to the location in the application code that executed it.
Does this seem familiar? Although I hadn't heard about it at the time, I have since learned that this is a parallel invention of JDBI's declarative API. If you notice that the ideas are similar but the terminology is all ranodmly different, that's why.
DBXS has (as far as I know) mostly not been used in production, and thus should be considered alpha quality. While its simple implementation, small code size, and good test coverage should make productionizing it a small amount of effort, it does still have some major limitations compared to what I would consider a “final” release:
DBXS only supports async database interfaces. Many database applications expect a synchronous interface, and [it should have one] (https://github.com/glyph/DBXS/issues/18).
DBXS only supports synchronous drivers wrapped using Twisted's threadpool. It should support asyncio, as well as some of the native database drivers for the asyncio ecosystem.