Closed JPMoresmau closed 7 years ago
There are already org.umlg.sqlg.predicate.Text
That might cover what you are looking for already.
No, these predicates become things like LIKE in SQL, not the full text operators @@ and such. But these Text predicates are a good guide!
Ok, just looked at the link you sent, not really familiar with that syntax.
Perhaps on AbstractLabel.ensureIndexExists
we can add a IndexType.GIN
?
Then the standard Topology creation api can support GIN indexes.
Mmmmhhh, but the column clause of a Gin index is not a list of columns, but an expression. So we can't model that via a list of PropertyColumn.
Also, the index expression could be a concatenation of columns, say, and you can also search on that same expression for the index to be used. But currently I just use a predicate, so to use with has() we need to provide one column name. Ideally we should be able to provide the full search expression. I'm not aware of any mechanism in Gremlin to express a vertex-level predicate?
Here is my reading of it.
GIN index column must be of type tsvector
. We can assume that to_tsvector
will always be on a column or a list of columns. AbstractLabel.ensureIndexExists
takes a list of columns.
CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', title || ' ' || body));
can become
pgwebVertexLabel.ensureIndexExists(IndexType.GIN, Arrays.toList(titlePropertyColumn, bodyPropertyColumn))
The information that is not specified is that 'english'
bit. Not sure how to get around that. Maybe some kind of special PropertyColumn
so we can write, either with an actual real config_name column or a jippo one
//there is an actual config_name column
pgwebVertexLabel.ensureIndexExists(IndexType.GIN, Arrays.toList(configNamePropertyColumn, titlePropertyColumn, bodyPropertyColumn))
or
//a fake PropertyColumn
pgwebVertexLabel.ensureIndexExists(IndexType.GIN, Arrays.toList(PropertyColumn.ENGLISH_GIN, titlePropertyColumn, bodyPropertyColumn))
Does it make sense or am I missing something?
This is what I'm doing:
I of course need to check I haven't broken anything.
One question: when there's a query with or(__.has('name','val1'),__.has('name','val2'))
, this doesn't get generated in SQL, right? That's another big improvement that could be added.
Yes or
does not get optimized. Nor where
nor match
. or
is complicated. In fact I recall on the forums someone saying its not possible with a fluent api to have the same semantics as a declarative language like sql. I have not thought about it for a while but I what recall being difficult is to map sql's open close bracket semantics together with and
and or
s to gremlin. I think in gremlin they simulate it instead with the match
clause.
But yeah for-sure it will be a big improvement to have some kind of or
support.
Done
There is no support in Gremlin for full-text search predicates, but we could add our own predicates that would use the full text PostgreSQL operators. This would be very useful for us (mix normal Gremlin queries with full text search).
See https://www.postgresql.org/docs/current/static/textsearch-tables.html#TEXTSEARCH-TABLES-SEARCH. I suppose we should just offer predicates and hope that the full text indices have been created outside of sqlg.