Open asfimport opened 12 years ago
Michael Sokolov (@msokolov) (migrated from JIRA)
There is a real need for this. The surround parser in particular is very poorly documented. There are some quirks that aren't at all apparent, like: distance has a max of 99, and when distance > 99, you actually get a generic syntax error because only two digits are allowed in the token. Also: the default distance is one, but 1 is actually not allowed as an explicit distance - you can only specify it implicitly. Finally the distance is off-by-one from the slop parameter in the Span queries it ultimately generates. Distance of 0 in Spans == distance of 1 in the surround query syntax. And aside from all of these specific issues, there is just a general lack of any sort of general statement about what the allowable syntax is. The best documentation is the javacc source, of course, but that's a bit unapproachable for a lot of folks that might actually use this, I think.
Chris Male (migrated from JIRA)
You raise an interesting point, some of the error messaging from the QPs is poor. I've been in a situation where users were able to express complex queries themselves but would often be confused by the error messages they received if a query didn't parse. Some of this is related to the parsing frameworks we use, some of it is that we could just do better. I think we'll tackle this in another issue but it's definitely part of the overall goal to give the QPs a big facelift.
Would you be able to tackle the surround parser documentation? You seem to have experience using it (I haven't) and understand its quirks. Just throw up a patch and we'll iterate.
Michael Sokolov (@msokolov) (migrated from JIRA)
I attached a small documentation patch for the surround parser. It isn't comprehensive, but gives a brief summary of all the operators as far as I understand them.
Chris Male (migrated from JIRA)
Hey Mike,
I think your patch is created against an older version of trunk since it contains modules/
which we no longer have. Could you update and create it again?
Michael Sokolov (@msokolov) (migrated from JIRA)
Oops - it's been a while since I updated. Here's a patch against today's trunk.
Robert Muir (@rmuir) (migrated from JIRA)
Thanks Mike, I committed this!
In #5097 there were some changes to the fuzzy query syntax. Only the Classic QueryParser really documents its syntax, which makes it hard to know whether the changes effected other QPs. Compounding this issue there are many classes which have no javadocs at all and I found myself quite confused when I consolidated all the QPs into their module.
We should do a concerted effort to improve the documentation so that it is clear what syntax is supported by what QPs and so that at least the user facing classes have javadocs.
As part of this, I wonder whether we should give the syntax supported by the Classic QueryParser a new name (rather than just Lucene's query syntax) since other QPs can and do support other syntax, and then somehow add some typed control over this, so QPs have to declare programmatically that they support the syntax and so we can verify that by randomly plugging in implementations into tests.
Migrated from LUCENE-4040 by Chris Male, 2 votes, updated May 11 2012 Attachments: LUCENE-4040.patch (versions: 2)