cunybpl / aiodal

async data access layer and related tools for sqlalchemy core

`reflect` not correctly working with `schema` #28

Closed bsnacks000 closed 7 months ago

bsnacks000 commented 7 months ago

This is something we glossed over since we mostly use separate databases with a single public schema, so there is usually only one MetaData instance relevant to any session.

For some projects, though, we are using multiple schemas, and the way we currently have reflection configured does not work.

class CustomDataAccessLayer(dal.DataAccessLayer):
    async def reflect(
        self,
        engine: AsyncEngine,
        metadata: Optional[sa.MetaData] = None,
        schema: str | None = None,
        views: bool = True,
        only: Sequence[str] | None = None,
        extend_existing: bool = False,
        autoload_replace: bool = True,
        resolve_fks: bool = True,
        **dialect_kwargs: Any,
    ) -> None:

        if not metadata:
            metadata = sa.MetaData()

        def _reflect(con: sa.Connection) -> sa.Inspector:
            assert metadata is not None

            metadata.reflect(   # <-----------this can be called with multiple schemas it turns out 
                bind=con,
                views=views,
                schema=schema,
                only=only,
                extend_existing=extend_existing,
                autoload_replace=autoload_replace,
                resolve_fks=resolve_fks,
                **dialect_kwargs,
            )
            inspector = sa.inspect(con)
            tablenames = [t.name for t in metadata.sorted_tables]

            for t in tablenames:
                ucs = inspector.get_unique_constraints(
                    t, schema=metadata.schema   # <-------- had to add this here  
                )  
                for uc in ucs:
                    self._constraints[t] = uc["column_names"]
            return inspector

        async with engine.connect() as conn:
            self._inspector: sa.Inspector = await conn.run_sync(_reflect)

        self._engine = engine
        self._metadata = metadata

This post explains the behavior better than the actual docs: https://stackoverflow.com/questions/47803343/reflecting-every-schema-from-postgres-db-using-sqlalchemy#:~:text=In%20order%20to%20reflect%20other,reflect()%20for%20each%20schema.&text=Any%20conflicting%20table%20names%20will,names%20with%20the%20schema%20name.

So what I think we want is to modify this method's signature so that schema is str | Sequence[str] | None, and introspect the argument. We should then refactor so that cataloging unique constraints behaves the same for tables across multiple schemas.

bsnacks000 commented 7 months ago

As per discussion:

Maybe remove this code from _reflect:

            tablenames = [t.name for t in metadata.sorted_tables]
            for t in tablenames:
                ucs = inspector.get_unique_constraints(t)  # list[dict]
                for uc in ucs:
                    self._constraints[t] = uc["column_names"]

instead, the _reflect callback's responsibility is just to use the connection object to perform the necessary reflections based on user input and return an inspector for the db ...

Then we rewrite this one:

    def get_unique_constraint(self, name: str) -> List[str]:
        """Get a unique constraint by key name. This works well if you name your unique
        constraints, which you really should.

        Args:
            name (str): The name of the constraint

        Returns:
            List[str]: A list of the column names associated with that constraint.
        """
        return self._constraints[name]

to use self._inspector dynamically to get unique constraints when needed... and since we proxy all calls through TransactionManager, we should not get a public API break at all...