falcony-io / sqlalchemy-searchable

Fulltext searchable models for SQLAlchemy. Only supports PostgreSQL
Other
260 stars 44 forks source link

ProgrammingError: (psycopg.errors.UndefinedFunction) function parse_websearch #154

Open a1d4r opened 3 months ago

a1d4r commented 3 months ago

When I try to execute a search query, I get the following error:

ProgrammingError: (psycopg.errors.UndefinedFunction) function parse_websearch(character varying, 
character varying) does not exist
LINE 4: ...cts.active IS true AND (products.search_vector @@ parse_webs...
                                                             ^
HINT:  No function matches the given name and argument types. You might need to add explicit type 
casts.
[SQL: SELECT count(*) AS count_1 
FROM (SELECT products.name AS name, products.category AS category, products.brand AS brand, 
products.article AS article, products.ean_code AS ean_code, products.search_vector AS search_vector,
products.id AS id, products.sa_orm_sentinel AS sa_orm_sentinel, products.created_at AS created_at, 
products.updated_at AS updated_at, products.deleted_at AS deleted_at, products.active AS active, 
products.external_id AS external_id, products.import_id AS import_id, products.extra_data AS 
extra_data 
FROM products 
WHERE products.active IS true AND (products.search_vector @@ 
parse_websearch(%(parse_websearch_1)s::VARCHAR, %(parse_websearch_2)s::VARCHAR))) AS anon_1]
[parameters: {'parse_websearch_1': 'pg_catalog.english', 'parse_websearch_2': 'example'}]
(Background on this error at: https://sqlalche.me/e/20/f405)

I checked the functions defined in the database:

create function parse_websearch(config regconfig, search_query text) returns tsquery
create function parse_websearch(search_query text) returns tsquery

It seems like the first argument type (regconfig) is wrong, and the query fails.

I managed to execute the query manually, removing the cast to VARCHAR:

parse_websearch('pg_catalog.english'::VARCHAR, 'example'::VARCHAR))) AS anon_1]

to

parse_websearch('pg_catalog.english', 'example'::VARCHAR))) AS anon_1]

Am I missing something in SQLAlchemy set up? I defined the base class Base, and then run: make_searchable(Base.metadata). After describing all SQLAlchemy models, I run: configure_mappers(). In addition, because I use alembic, and I added these command to the migration: op.execute(sql_expressions) and sync_trigger

I use: PostgreSQL 16.0 psycopg 3.1.18 sqlalchemy 2.0.29 sqlalchemy-searchable 2.1.0

liamvdv commented 1 month ago

Solution at https://stackoverflow.com/a/78402972

liamvdv commented 1 month ago

The issue it that engine.create_all is patched by this library to register additional functions. Alembic never calls engine.create_all, thus the relevant functions are never used.

Also, autogenerated alembic migrations no not nearly detect all relevant changes, thus a complete example below.

from sqlalchemy_utils.types import TSVectorType

class Base(DeclarativeBase):
     pass

make_searchable(Base.metadata)
engine = ....
# NOTE(liamvdv): needed for sqlalchemy_searchable
# https://sqlalchemy-searchable.readthedocs.io/en/latest/quickstart.html#create-and-populate-the-tables
configure_mappers()

class MessageLog(Base):
     __tablename__ = "message_log"
     question: Mapped[str] = mapped_column(String)
     answer: Mapped[str] = mapped_column(String)
     search_vector: Mapped[TSVectorType] = mapped_column(
        TSVectorType("question", "answer", regconfig="pg_catalog.simple", weights={"question": "A", "answer": "B"})
    )

in alembic:

def upgrade() -> None:
    conn = op.get_bind()
    # https://stackoverflow.com/a/78402972
    # Direct use of SQLAlchemy-Seearchable's sql_expressions to avoid the need to import the model
    op.execute(sql_expressions.statement)

    # ### commands auto generated by Alembic - please adjust! ###
    op.add_column(
        "message_log", sa.Column("search_vector", sqlalchemy_utils.types.ts_vector.TSVectorType(), nullable=True)
    )
    op.create_index(
        "ix_message_log_search_vector", "message_log", ["search_vector"], unique=False, postgresql_using="gin"
    )
    sync_trigger(
        conn,
        "message_log",
        "search_vector",
        ["question", "answer"],
        # Only use this if you have non-english, multilingual text in your column. Else prefer the given language for even better stemming. Do not use weights if not needed by your app.  
        options={"regconfig": "pg_catalog.simple", "weights": {"question": "A", "answer": "B"}},
    )

    # ### end Alembic commands ###

def downgrade() -> None:
    conn = op.get_bind()
    drop_trigger(conn, "message_log", "search_vector")

    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_index("ix_message_log_search_vector", table_name="message_log", postgresql_using="gin")
    op.drop_column("message_log", "search_vector")

    # The database schema ("public.") is useless, because
    #   1) The schema name may not be always be "public"; and
    #   2) The connection already knows which DB we're working with
    op.execute("DROP FUNCTION parse_websearch(regconfig, text);")
    op.execute("DROP FUNCTION parse_websearch(text);")
    # ### end Alembic commands ###

Note that you can remove weights and regconfig if you do not use them in your TSVectorType config. See documentation of this library / postgres for explanation of the values.