Closed githubmanticore closed 1 year ago
➤ Sergey Nikolaev commented:
Let's think.
*
user means only non-PQ indexes (i.e. we limit *
by only non-PQ indexes). In this case for that to work properly we should also make few more assumptions:
*
user means PQ + non-PQ indexes (i.e. we don't limit). I can think of only one use case of such behavior - user is making a dump to then make a backup or smth. That's definitely a wrong way of doing that that we shouldn't encourage. BTW now it's not working at all - the response has empty matches even though both RT and PQ indexes have some records Query("query")
makes sense. Provided all the above I think what we should do is to:
*
into the first index in the list It should be a compromise between confusingness (by results from multiple heterogeneous indexes) and ease of use.
@adriannuta What do you think?
➤ Aleksey N. Vinogradov commented:
Consider also host with PQ-only indexes, where user wants full-scan (filtering) query.
➤ Sergey Nikolaev commented:
host with PQ-only indexes, where user wants full-scan (filtering) query Still I think that resolving
*
to only one index + warning (in case there're more) makes more sense than automatic resolving to all indexes (even if they're of the same type). We have distributed indexes that solve take care of proper handling of multiple indexes when the user communicates with just one. Doing the same with*=all
resolution is an anti-pattern I believe and we shouldn't encourage doing that. If you have just a single index - that's fine, just skip it in Query() and you'll be good and I find it the only valuable case here.
➤ Adrian Nuta commented:
I will expect some users to 'cry' about *
not searching anymore in all indexes (small users that use the SphinxAPI).
I say for now let's go to the easy route and resolve *
to first index + warning if case and see the reaction. If we see people want a 'search in all' (elastic still has a _all
) we can make *
to resolve to either all non-PQ or only-PQ.
➤ Sergey Nikolaev commented:
I will expect some users to 'cry' about * not searching anymore in all indexes (small users that use the SphinxAPI).
That's great, we'll then know who our users are.
I say for now let's go to the easy route and resolve to first index + warning if case and see the reaction. If we see people want a 'search in all' (elastic still has a _all) we can make to resolve to either all non-PQ or only-PQ.
OK, we'll see. That's not a priority for now BTW. Elastic is much more schemaless and can afford that. In our case if your indexes differ you will just get unpredictable results even if they're of the same type.
➤ Aleksey N. Vinogradov commented:
Well, let's think that *
is historically used as 'most wild' character in many contexts (starting from MS DOS filesystem patterns). So that using such wildchar implicitly mean 'most of the essencies'.
We can limit it to only index in the list, well. But it will break such implicity. (if I want to use only the first index, I would expect something like, say, @first as the name. If I write *
I usually mean 'everything')
Aside note: 'first from the list' - from what list? Where it come from? What is the order there?
➤ Sergey Nikolaev commented:
using such wildchar implicitly mean 'most of the essencies'
good point. In this case what we can do is:
Query("...")
- *
should resolve to the 1st index from the list, no matter what that index will be since the assumption is that the user has only one index, if there're more than one in the list - return a warning Query("...", "*")
, i.e. when *
is explicitly specified by the user we can resolve it to all indexes in the list, but if the schema is not the same in all of them return a warning Any objections?
➤ Aleksey N. Vinogradov commented:
Consider that client always send '*' if no names provided. So, the difference between absent and explicit '*' is only client's matter; daemon doesn't care about it.
So, changing behavior implies that such change has to be announced first. Otherwise any application which works with default behavior will be silently breaked by API library update - that is since the change is not about adding new functions which and old app doesn't know about and doesn't care, but about changing of behaviour of already existing API.
➤ Sergey Nikolaev commented:
How to implement the solution is another story. Let's first come to an agreement on the design. @klirichek do you have any objections against the proposed design (*
to be set explicitly to mean all
, otherwise it means 1st index from the list)?
In API we have a '*' as list of indexes, which just means 'all indexes'. That default is used when no indexes explicitly pointed, for example, as:
It may work ok if you have the only index; it even might work if you have several ones. But now we also have special indexes like 'pq', and they also part of 'all' indexes.
M.b. it have sense to review what is to do when we meet such unspecified '*' index selector - and either limit the list, either fire warning if some 'special' indexes (which doesn't support matching) are there.