apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0
2.7k stars 1.04k forks source link

Solr LocalParams for Lucene Query Parser [LUCENE-4271] #5340

Open asfimport opened 12 years ago

asfimport commented 12 years ago

The Lucene QueryParser should implement Solr's LocalParams syntax directly so that instead of

_query_:"{!geodist d=10 p=20.5,30.2}"

one could directly use

{!geodist d=10 p=20.5,30.2}

references: http://wiki.apache.org/solr/LocalParams


Migrated from LUCENE-4271 by Yonik Seeley (@yonik), updated Nov 19 2012 Attachments: LUCENE-4271.patch Linked issues:

asfimport commented 12 years ago

Chris Male (migrated from JIRA)

Will a query have a single set of params, or will each clause potentially have its own?

asfimport commented 12 years ago

Yonik Seeley (@yonik) (migrated from JIRA)

Each clause would have it's own... this is really about adding a LocalParams style query as a clause. It's really just about the ability to leave off the query magic field name and quotes in the current style of embedding one can already do in Solr.

Complex example:

+foo -bar +{!baz arg=val} word -{!raw f=myfield v="the raw term"} +anotherword
asfimport commented 12 years ago

Chris Male (migrated from JIRA)

Is the intention to mandate the !var syntax too? That seems like a pretty Solr specific thing (being able to delegate the parsing to a QParser) but I can imagine someone just wanting a map of values, e.g.

{arg_1=val_1 arg_2=val_2} word
asfimport commented 12 years ago

Yonik Seeley (@yonik) (migrated from JIRA)

Is the intention to mandate the !var syntax too?

Yes, that's what this issue is about. A lot of Solr users try to use the {!foo=bar} syntax directly in the lucene query syntax without realizing there is not direct support today (i.e. there is only the indirect _query_ hack to get it to work)

asfimport commented 12 years ago

Yonik Seeley (@yonik) (migrated from JIRA)

Since I've opened this issue, I've seen even more people try to use {!foo} syntax directly in a lucene query string.

Here's a patch that implements the syntax. For lucene, it simply calls getFieldQuery and will hence just treat the symbols as text by default. For Solr, it invokes sub-QParser logic.

Seems to work fine, but I still need to add tests.

asfimport commented 12 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

I think it's odd to add syntax to Lucene's query parser that does ... nothing?

And it's strange to make Lucene's QP aware of Solr QP's syntax if it cannot do anything with it. It seems like Solr's QP should have this logic instead ...

Since I've opened this issue, I've seen even more people try to use {!foo} syntax directly in a lucene query string.

I'm confused: how would this patch help such users? Ie if they are using Lucene's QP this patch doesn't seem to help them ... if users are trying to use Solr's syntax with Lucene's query parser, it seems like we should make it easier for them to use Solr's query parsers (factor them out)...?

Is the intention to mandate the !var syntax too? That seems like a pretty Solr specific thing (being able to delegate the parsing to a QParser) but I can imagine someone just wanting a map of values, e.g.

{arg_1=val_1 arg_2=val_2} word

Right: if we added functionality to Lucene's QP that parsed those arg/val's to an accessible Map<String,String> ... I think that would be a useful addition? I agree we really shouldn't bake in Solr specific syntax into Lucene's QP ...

asfimport commented 12 years ago

Yonik Seeley (@yonik) (migrated from JIRA)

It seems like Solr's QP should have this logic instead ...

Indeed - but it requires changes to the parser grammar, so subclassing doesn't cut it. I suppose the next best thing would be to make a QP specific to Solr.

asfimport commented 12 years ago

Jack Krupansky (migrated from JIRA)

I don't see anything wrong with supporting parameterized queries and multiple query parsers down at the lowest level of the Lucene Query Parser. I don't mean to suggest that the Lucene Query Parser should know directly about the Solr-level structures such as the Solr schema, Solr "params", and Solr Q Parser plugins, but I am suggesting that Lucene could declare and support abstractions for those sorts of interfaces, and then the Solr-level Qparser/plugin would supply all of them, and pure Lucene application could optionally use them as well, either with additional query parser methods or subclassing. This should also include full support for dismax fields and aliases as well. and things like support for full negative queries.

There are lots of useful features which are available in the Solr query parsers which are unavailable directly to Lucene apps without a lot of effort, and for no good reason.

I mean, as long as these features don't impact performance or ease of programming for simple Lucene apps, what possible legitimate objection could there be?

The current "estrangement" between the Lucene and Solr query parsers is quite a black eye for Lucene/Solr that can easily be remedied, at least from a technical perspective.

In the context of this current Jira, why not provide an additional Lucene query parser method to provide a simple keyword/value map of "parameters" that can be substituted into a query? Seems to make solid sense to me!

asfimport commented 12 years ago

Chris Male (migrated from JIRA)

I think it's odd to add syntax to Lucene's query parser that does ... nothing?

And it's strange to make Lucene's QP aware of Solr QP's syntax if it cannot do anything with it. It seems like Solr's QP should have this logic instead ...

+1

Indeed - but it requires changes to the parser grammar, so subclassing doesn't cut it. I suppose the next best thing would be to make a QP specific to Solr.

I don't think we should consider that a bad thing. Solr has different needs and the classic QP is sort of the lowest common denominator of parsers.

I don't mean to suggest that the Lucene Query Parser should know directly about the Solr-level structures such as the Solr schema, Solr "params", and Solr Q Parser plugins, but I am suggesting that Lucene could declare and support abstractions for those sorts of interfaces

I don't think we can practical extend the classic QP in every way just to meet Solr's needs.

There are lots of useful features which are available in the Solr query parsers which are unavailable directly to Lucene apps without a lot of effort, and for no good reason.

.. then the Lucene apps should use the Solr QPs or a version there of. The Classic QP was moved out of Lucene core for many reasons, but one was to combat this perspective that its 'the' QP when it is in fact just one particular implementation (an implementation which has lots of limitations). Users should be encouraged to use whatever QP meets their needs and we shouldn't make the classic QP a kitchen sink.

The current "estrangement" between the Lucene and Solr query parsers is quite a black eye for Lucene/Solr that can easily be remedied, at least from a technical perspective.

I think we should go further and fully divorce them. Solr has its needs and the handling of LocalParams clearly seems to be confusing users but it isn't something the classic QP should have to resolve. Equally, Solr development shouldn't be saddled with having to compromise its query features just so they fit into the classic QP. As I say, the classic QP is the lowest common denominator of query syntax and parsing and I would recommend to any user (Solr or not) that when they need to make a large syntactical change, that they roll their own parser.