Police-Data-Accessibility-Project / data-sources-app

An API and UI for using and maintaining the Data Sources database
MIT License
3 stars 5 forks source link

Migrate backend code to SQLAlchemy #303

Open maxachis opened 5 months ago

maxachis commented 5 months ago

SQLAlchemy is an Object-Relational Mapping (ORM) tool that represents SQL database actions in Python code.

Why use SQLAlchemy?

Why not use SQLAlchemy?

My thoughts

Resources:

maxachis commented 3 months ago

@EvilDrPurple When we get further along, I'd like to look into SQLAlchemy's connection pooling functionality. #376 occurred in part because we only had one active connection being managed at a time, so having multiple connections in a connection pool might be a way to resolve the problem in a more thorough way than the one I offered for that issue.

This is not a remotely high priority at the moment, but something I wanted to keep on the radar for eventual exploration.

maxachis commented 3 months ago

So after attempting to integrate the changes @EvilDrPurple made with the ones I'm making in #318, I identified a few issues that weren't immediately apparent when just looking at the SQLAlchemy changes in isolation:

The end result of all of this is that it turns out that even a piecemeal implementation of SQLAlchemy becomes considerably more difficult, because too much logic is tightly coupled in the code base at the moment to enable easy shifts to a new way of interacting with the database, and SQLAlchemy (or at least Flask-SQLAlchemy) in particular encourages the use of global variables that make reconciling changes more challenging.

The way I see it, we have a few options here, taking into account the current limitations as well as that other, important changes like #318 are also in the pipeline:

  1. Try to reconcile the existing #318 code to SQLAlchemy. That's what I was working on this morning, but after a few hours my progress was quite slow, and spaghetti code was being produced.
  2. Rollback the code and reconfigure the SQLAlchemy changes to minimally affect the rest of the code. A lot of the most important SQLAlchemy logic can probably be preserved, but new work will be required. Auth code will be pushed forward, and SQLAlchemy changes will need to account for that.
  3. Rollback the code and table SQLAlchemy for now until a later point when things in the code base are generally less coupled, making it easier to implement SQLAlchemy logic.
  4. Rollback the code and table SQLAlchemy, possibly indefinitely.
josh-chamberlain commented 3 months ago

@maxachis your instinct about priority is right—the auth work must continue so that API changes may continue. SQLAlchemy is a good idea, but...but. It does make sense that it wouldn't play nice with Psycopg2, and it seems like Psycopg2 would be made unnecessary with SQLAlchemy?

maxachis commented 3 months ago

@maxachis your instinct about priority is right—the auth work must continue so that API changes may continue. SQLAlchemy is a good idea, but...but. It does make sense that it wouldn't play nice with Psycopg2, and it seems like Psycopg2 would be made unnecessary with SQLAlchemy?

Exactly. We only need Psycopg2 or SQLAlchemy, and the only reason for having both is because we're transitioning from one to the other. But part of that transition is that we want to minimize the friction in that transition, and have as little be upset as possible.

I do want to get @EvilDrPurple 's feedback on this, but my sense is that we will need to rollback. It's one thing if we only have to change some functions in DatabaseClient and maybe a few other limited areas, but it's another thing if changes to DatabaseClient necessitate considerable changes to app.py, the tests, and the fixtures.

On a long enough timescale, the time spent overhauling everything to SQLAlchemy right now would eventually, I believe, pay for itself. But currently our timescale is short because we want to get v2 out the door!

josh-chamberlain commented 3 months ago

ok! As the two back end devs, y'all have my permission to make the transition abrupt/friction-ful, because we're protected by the fork and dev environment. Thanks for thinking this through.

maxachis commented 3 months ago

Alright! For the moment, we are reverted, and @EvilDrPurple is going to look into a more modular approach while I work on auth!

Current plan is as follows: