Closed CGenie closed 2 months ago
Sorry about this taking a long time. I have a lot of other things in my queue. Feel free to ping me if you think it's taking too long.
So far we're using our own branch for this. No worries, take your time, it's how open source works, we all have life and work to do :)
Hi @CGenie, sorry again that this seems to have escaped my attention. Is this something you're still using and want to get into Opaleye?
Yes, we are still using it and yes, I think a library that advertises itself as "targeting PostgreSQL" should have postgres arrays, json and FTS, in particular.
OK, I'll get this merged. Can you please clarify whether you actually want functions of type
Field SqlText -> Field SqlTSVector
Field SqlText -> Field SqlTSQuery
or if it's sufficient to have
String -> F.Field P.SqlTSQuery
String -> F.Field P.SqlTSVector
Then I'll decide where best to put them. If the former, I think they belong in Opaleye.Operators
.
(I don't use this functionality so I don't know how it's normally used)
We use it like this:
restrict -< (_ns_search row) @@ (sqlToTSQuery (unpack q))
I guess it's easier to have that wrapped into a field as we compare fields directly and if you want to provide a raw string, just wrap it in sqlString
. I think such wrapping is simpler than unwrapping (as one should handle the null
case, no?)
Basically, full text search is similar to the "LIKE"
operator, it is more powerful though as it can tokenize text (hence the TSVector
column) which is language-specific. See https://www.postgresql.org/docs/current/textsearch-controls.html. Having the tsvector
column gives us pre-computed vectors of text tokens and then we can search large amounts of texts quite fast. So tsvector
is like a special index column for text.
I guess some info can be found here: https://stackoverflow.com/a/224726
One can also present highlighted results (https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-HEADLINE) but we don't use it at this moment.
Hopefully I gave some insight into this feature :)
Thanks! I don't see where sqlToTSQuery
is defined though. It's not in this PR.
Thanks. After playing around with it I think to expose the full set of functionality you probably want
(@@) :: Field T.SqlTSVector -> Field T.SqlTSQuery -> F.Field T.SqlBool
tsvector :: Field T.SqlString -> Field T.SqlVector
to_tsvector :: Field T.SqlString -> Field T.SqlVector
tsquery :: Field T.SqlString -> Field T.SqlVector
to_tsquery :: Field T.SqlString -> Field T.SqlVector
Would that work?
And @CGenie, when I add this, can I list you as the "owner" of this functionality, to be summoned if anyone has a question about it?
Yes, you can set me as the "owner" of this functionality.
About the functions you listed: what's the difference between tsvector
and to_tsvector
? Same for tsquery
and to_tsquery
? Types seem to be the same.
It looks like the to_
functions do normalization, as explained at https://www.postgresql.org/docs/current/datatype-textsearch.html. For example
postgres=# SELECT tsvector('Cat Fat Fat Fat');
tsvector
-------------
'Cat' 'Fat'
(1 row)
postgres=# SELECT to_tsvector('Cat Fat Fat Fat');
to_tsvector
---------------------
'cat':1 'fat':2,3,4
(1 row)
Actually there are a few more: plainto_tsquery
, phraseto_tsquery
and websearch_to_tsquery
(https://www.postgresql.org/docs/16/textsearch-controls.html) and we make use of them as well. I guess tsvector
, to_tsvector
, tsquery
, to_tsquery
, plainto_tsquery
, phraseto_tsquery
, websearto_to_tsquery
as well as (@@)
would make for a more or less complete functionality.
I see. This gets more and more complicated and I'm doubting my ability to review or maintain this functionality. I think my recommendation would be that interested parties make an opaleye-textsearch
package that exposes the functionality they want (using the OpOther
constructor of BinOp
rather than adding (:@@)
) and then incorporating it into Opaleye after it's proven itself out in practice, say after a year or so.
How does that sound?
OK, I'll try to follow this recommendation then. Thanks!
If anyone is interested, the abovementioned package is here https://hackage.haskell.org/package/opaleye-textsearch-0.1.0.0
Continuation of https://github.com/tomjaguarpaw/haskell-opaleye/pull/561 as I don't know how to change the base branch in an existing merge request.