jackalope / jackalope-doctrine-dbal

Doctrine DBAL transport implementation for Jackalope
http://jackalope.github.io
Other
144 stars 60 forks source link

Full text search constraint shouldn't assume "%" #212

Open dantleech opened 10 years ago

dantleech commented 10 years ago

Currently the full-text-search automatically wraps the query in "%".

In jackrabbit you need to delimit the string with * manually, it doesn't happen automatically.

We should substitute * with % in the full text search I think.

lsmith77 commented 10 years ago

sounds like a good idea to me

dbu commented 10 years ago

how exactly does it work with jackrabbit? when do CONTAINS('d.foo', 'text') i would find anything that contains the fragment "text", no? or would i need to do CONTAINS('d.foo', '*text*')?

looking at http://jackrabbit.510166.n4.nabble.com/jcr-sql2-contains-full-text-search-not-working-td4655655.html i have no clue anymore with wildcards or not wildcard or if by default, CONTAINS would assume exact word matching or what. if the semantics is correct that * denotes wildcard matching in SQL2 contains then yes, we should probably replace that by % instead of automatically making exact matches wildcard matches.

dantleech commented 10 years ago

It seems to match words by default:

PHPCRSH > ls /cms/articles/article1
+-----------------+--------+--------------------------------------+
| title           | STRING | Article 1                            |
+-----------------+--------+--------------------------------------+
PHPCRSH > select * from [nt:unstructured] where contains(title, 'Article 1');
+------------------------+---------------------------------+
| Path                   | nt:unstructured.jcr:primaryType |
+------------------------+---------------------------------+
| /cms/articles/article1 | nt:unstructured                 |
+------------------------+---------------------------------+
1 rows in set (0.01 sec)
PHPCRSH > select * from [nt:unstructured] where contains(title, 'Article');
+------------------------+---------------------------------+
| Path                   | nt:unstructured.jcr:primaryType |
+------------------------+---------------------------------+
| /cms/articles/article1 | nt:unstructured                 |
+------------------------+---------------------------------+
1 rows in set (0.01 sec)
PHPCRSH > select * from [nt:unstructured] where contains(title, 'ticle');
+------+
| Path |
+------+
0 rows in set (0.01 sec)
PHPCRSH > select * from [nt:unstructured] where contains(title, 'Arti');
+------+
| Path |
+------+
0 rows in set (0.01 sec)
PHPCRSH > select * from [nt:unstructured] where contains(title, 'Arti*');
+------------------------+---------------------------------+
| Path                   | nt:unstructured.jcr:primaryType |
+------------------------+---------------------------------+
| /cms/articles/article1 | nt:unstructured                 |
+------------------------+---------------------------------+
1 rows in set (0.01 sec)
PHPCRSH > select * from [nt:unstructured] where contains(title, '*Arti*');
+------------------------+---------------------------------+
| Path                   | nt:unstructured.jcr:primaryType |
+------------------------+---------------------------------+
| /cms/articles/article1 | nt:unstructured                 |
+------------------------+---------------------------------+
1 rows in set (0.01 sec)
PHPCRSH > select * from [nt:unstructured] where contains(title, '1');
+------------------------+---------------------------------+
| Path                   | nt:unstructured.jcr:primaryType |
+------------------------+---------------------------------+
| /cms/articles/article1 | nt:unstructured                 |
+------------------------+---------------------------------+
1 rows in set (0.01 sec)
dbu commented 10 years ago

thanks dan. yeah then lets not automatically add % ever, but replace * by %. this is a BC break for jackalope-doctrine-dbal users however, they previously had partial matching by default... if you do a PR to fix this, please don't forget to add a note to the changelog. i consider it a bugfix, but it can still surprise people...