GSA / notifications-api

The API powering Notify.gov
Other
10 stars 2 forks source link

Convert models.py classes to using SQLAlchemy 2.0 standards #1246

Open xlorepdarkhelm opened 3 months ago

xlorepdarkhelm commented 3 months ago

Now that we are using SQLAlchemy 2.0, we should begin the process of moving away from the deprecated SQLAlchemy 1.x method of doing things, and moving into the new SQLAlchemy 2.x form. To start, the models.py file needs some updating.

The first change for this is rewriting all of the declared columns in the models from the old syntax of

id = db.Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)

And change it to using the new mapper syntax of:

id: Mapped[UUID(as_uuid=True)] = mapped_column(primary_key=True, default=uuid.uuid4)

The important thing is that the type is now handled in the Mapped[] type definition, and that db.Column is replaced with mapped_column. This will bring the models up to SQLAlchemy 2.0 standards, and should be a direct conversion with no expected issues.

Note: flask-sqlalchemy should handle getting the db.Model configured correctly for this new syntax.

Note 2: SQLAlchemy 2.0 prefers using the standard Python types (str, int, etc) over using the SQLAlchemy column type equivalents (db.String, db.Integer, etc).

Ref: https://docs.sqlalchemy.org/en/20/orm/quickstart.html

ccostino commented 3 months ago

@xlorepdarkhelm given the amount of code in this file, does it make sense to split this work apart a bit and focus on smaller sections at a time?

I'm also assuming that the acceptance criteria ought to be something like this:

Does that look right to you? Is there anything else you can think of? Thanks!

xlorepdarkhelm commented 3 months ago

That seems like goot acceptance criteria.

And I can see this being split up. I don't believe that there would be any issue with only part of the models being converted. In theory, they can work side-by-side.

ccostino commented 3 months ago

Yeah, I think it'd have to be split up, there's too much otherwise. This may become its own epic at this level then and we can divide and conquer with the issues that roll-up into it. If you think you have a good breakdown of which groups of models to focus on, let's list them out here and plan from there. 🙂

xlorepdarkhelm commented 3 months ago

To migrate the code to SQLAlchemy 2.0 style using Flask-SQLAlchemy, several changes that align with the updated APIs and best practices of SQLAlchemy 2.0 are needed. Below are the key changes you should consider for each part of the code:

1. Use of Mapped and Type Hints

SQLAlchemy 2.0 encourages the use of type hints to clearly define column types and relationships. This involves replacing db.Column, db.relationship, and other constructs with appropriate type hints.

Example:

from sqlalchemy.orm import Mapped, mapped_column, relationship

class User(db.Model):
    __tablename__ = "users"

    id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name: Mapped[str] = mapped_column(db.String, nullable=False, index=True)
    email_address: Mapped[str] = mapped_column(db.String(255), nullable=False, index=True, unique=True)
    # Other fields and relationships...
    services: Mapped[list["Service"]] = relationship("Service", secondary="user_to_service", back_populates="users")
    organizations: Mapped[list["Organization"]] = relationship("Organization", secondary="user_to_organization", back_populates="users")

2. Relationships

Replace db.relationship and backref with explicit relationship and back_populates for better clarity.

Example:

class Service(db.Model):
    __tablename__ = "services"

    id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name: Mapped[str] = mapped_column(db.String(255), nullable=False, unique=True)
    created_by_id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), db.ForeignKey("users.id"), nullable=False)
    created_by: Mapped["User"] = relationship("User", back_populates="services")
    # Other fields and relationships...

And in the User model:

class User(db.Model):
    __tablename__ = "users"

    id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    services: Mapped[list["Service"]] = relationship("Service", back_populates="created_by")

3. Use of Mapped for Collections

For collections or lists of related items, use Mapped with type hints.

Example:

class Service(db.Model):
    __tablename__ = "services"

    service_sms_senders: Mapped[list["ServiceSmsSender"]] = relationship("ServiceSmsSender", back_populates="service")
    templates: Mapped[list["Template"]] = relationship("Template", back_populates="service")

4. Constraints and Indexes

Constraints and indexes don't change significantly, but ensure they are clearly defined.

Example:

__table_args__ = (
    db.UniqueConstraint("user_id", "service_id", name="uix_user_to_service"),
)

5. Enum and Other Columns

Enum columns and other specialized columns should remain mostly the same but ensure they're well-defined with Mapped.

Example:

auth_type: Mapped[AuthType] = mapped_column(
    db.Enum(AuthType, name="auth_types"),
    index=True,
    nullable=False,
    default=AuthType.SMS
)

6. Class Methods and Properties

SQLAlchemy 2.0 encourages clear class methods, so ensure all methods align with the 2.0 syntax.

Example:

@validates("mobile_number")
def validate_mobile_number(self, key: str, number: str) -> str:
    try:
        if number is not None:
            return validate_phone_number(number, international=True)
    except InvalidPhoneError as err:
        raise ValueError(str(err)) from err

7. Sessions

When dealing with sessions, SQLAlchemy 2.0 prefers direct usage of Session. Flask-SQLAlchemy abstracts this for you, so if you're using db.session, you're generally fine.

Summary of Key Changes:

By systematically applying these changes, you'll gradually upgrade your models to SQLAlchemy 2.0 style while maintaining compatibility with Flask-SQLAlchemy.

xlorepdarkhelm commented 3 months ago

Stage 1: Foundational Models

  1. User:
    • It's central and many other models depend on it.
    • Transitioning User will impact several related models.
  2. Organization:
    • It’s another foundational model that is used by others like Service and EmailBranding.

Stage 2: Primary Functional Models

  1. Service:
    • A key model in the application, closely related to User and Organization.
  2. ServiceUser:
    • Dependent on both User and Service.
    • Logical to update after both User and Service are updated.
  3. ServicePermission:
    • Related to Service and important for service functionality.

Stage 3: Secondary Functional Models

  1. TemplateBase, Template, TemplateHistory:
    • These are closely related and handle template management.
    • They depend on Service, which should be updated first.
  2. ApiKey:
    • Used for managing API keys in services.
    • Should be updated after Service.
  3. Job:
    • A core part of the functionality that also relates to Template.
    • Should follow after templates and services.

Stage 4: Notification System

  1. Notification, NotificationHistory:
    • Handles notifications, which depend on Job, Template, and Service.
  2. ServiceInboundApi, ServiceCallbackApi:
    • Directly related to notifications and the service layer.

Stage 5: Administrative and Supplementary Models

  1. EmailBranding:
    • Linked with Organization and Service.
  2. Agreement:
    • Handles agreements, related to Organization.
  3. InboundNumber, ServiceSmsSender:
    • Related to messaging functionality, which connects to services.

Stage 6: User and Organization Invitations

  1. InvitedUser, InvitedOrganizationUser:
    • Dependent on User and Organization.

Stage 7: Ancillary Models

  1. VerifyCode:
    • Relates to user verification and is less connected to other core models.
  2. WebauthnCredential:
    • Connected to User for handling WebAuthn credentials.
  3. Event:
    • A generic model that can be updated after most others are stabilized.

Stage 8: Reporting and Historical Models

  1. FactBilling, FactNotificationStatus, FactProcessingTime:
    • Used for reporting and analytics, dependent on Notification and Service.
  2. Complaint:
    • Relates to notifications and should be updated once the notification system is stable.
  3. ServiceDataRetention:
    • Related to data retention policies, ties into Service.

Stage 9: Final Cleanups

  1. Rate:
    • A simpler model with fewer dependencies, can be handled last or earlier in parallel with others.
  2. Miscellaneous Updates:
    • Any remaining utility models or functions that were left behind.

Summary

The logical progression is to start with foundational models (User, Organization), followed by primary functional models (Service, ServiceUser, ServicePermission), then move on to secondary functionality (Template, ApiKey, Job). After that, update the notification system, followed by administrative models and finally ancillary and reporting models.

xlorepdarkhelm commented 3 months ago

I made the separate issues (above) for the breakdown for which models to do. Due to the nature of this kind of change, these need to be done in synchronous order, as many of them are relying on the previous step(s) to be completed. I have noted each as such in their descriptions.

ccostino commented 3 months ago

Awesome, thank you for this write-up @xlorepdarkhelm, including the examples and summary! Thank you for splitting the issues apart as well. Given that, I'm going to convert this issue as the parent to an epic and then we'll pull the other issues in to the board as things get prioritized/planned for with future sprints.

The only thing I'd amend in these is making sure the issues also have a Security Considerations section in them after the Acceptance Criteria. I imagine in most cases this is lightweight as there likely aren't any, but we should be explicit about that regardless and make sure we're thinking things through. 🙂

xlorepdarkhelm commented 3 months ago

The only thing I'd amend in these is making sure the issues also have a Security Considerations section in them after the Acceptance Criteria. I imagine in most cases this is lightweight as there likely aren't any, but we should be explicit about that regardless and make sure we're thinking things through. 🙂

I added security considerations to each of the sub-issues for this epic.