nck-2 / test-rep

0 stars 0 forks source link

Review default index selection #1283

Closed githubmanticore closed 1 year ago

githubmanticore commented 1 year ago

In API we have a '*' as list of indexes, which just means 'all indexes'. That default is used when no indexes explicitly pointed, for example, as:

    var cl SphinxClient 
    foo, err := cl.Query("query") 

It may work ok if you have the only index; it even might work if you have several ones. But now we also have special indexes like 'pq', and they also part of 'all' indexes.

M.b. it have sense to review what is to do when we meet such unspecified '*' index selector - and either limit the list, either fire warning if some 'special' indexes (which doesn't support matching) are there.

githubmanticore commented 1 year ago

➤ Sergey Nikolaev commented:

Let's think.

Provided all the above I think what we should do is to:

It should be a compromise between confusingness (by results from multiple heterogeneous indexes) and ease of use.

@adriannuta What do you think?

githubmanticore commented 1 year ago

➤ Aleksey N. Vinogradov commented:

Consider also host with PQ-only indexes, where user wants full-scan (filtering) query.

githubmanticore commented 1 year ago

➤ Sergey Nikolaev commented:

host with PQ-only indexes, where user wants full-scan (filtering) query Still I think that resolving * to only one index + warning (in case there're more) makes more sense than automatic resolving to all indexes (even if they're of the same type). We have distributed indexes that solve take care of proper handling of multiple indexes when the user communicates with just one. Doing the same with *=all resolution is an anti-pattern I believe and we shouldn't encourage doing that. If you have just a single index - that's fine, just skip it in Query() and you'll be good and I find it the only valuable case here.

githubmanticore commented 1 year ago

➤ Adrian Nuta commented:

I will expect some users to 'cry' about * not searching anymore in all indexes (small users that use the SphinxAPI).

I say for now let's go to the easy route and resolve * to first index + warning if case and see the reaction. If we see people want a 'search in all' (elastic still has a _all) we can make * to resolve to either all non-PQ or only-PQ.

githubmanticore commented 1 year ago

➤ Sergey Nikolaev commented:

I will expect some users to 'cry' about * not searching anymore in all indexes (small users that use the SphinxAPI).

That's great, we'll then know who our users are.

I say for now let's go to the easy route and resolve to first index + warning if case and see the reaction. If we see people want a 'search in all' (elastic still has a _all) we can make to resolve to either all non-PQ or only-PQ.

OK, we'll see. That's not a priority for now BTW. Elastic is much more schemaless and can afford that. In our case if your indexes differ you will just get unpredictable results even if they're of the same type.

githubmanticore commented 1 year ago

➤ Aleksey N. Vinogradov commented:

Well, let's think that * is historically used as 'most wild' character in many contexts (starting from MS DOS filesystem patterns). So that using such wildchar implicitly mean 'most of the essencies'. We can limit it to only index in the list, well. But it will break such implicity. (if I want to use only the first index, I would expect something like, say, @first as the name. If I write * I usually mean 'everything')

Aside note: 'first from the list' - from what list? Where it come from? What is the order there?

githubmanticore commented 1 year ago

➤ Sergey Nikolaev commented:

using such wildchar implicitly mean 'most of the essencies'

good point. In this case what we can do is:

Any objections?

githubmanticore commented 1 year ago

➤ Aleksey N. Vinogradov commented:

Consider that client always send '*' if no names provided. So, the difference between absent and explicit '*' is only client's matter; daemon doesn't care about it.

So, changing behavior implies that such change has to be announced first. Otherwise any application which works with default behavior will be silently breaked by API library update - that is since the change is not about adding new functions which and old app doesn't know about and doesn't care, but about changing of behaviour of already existing API.

githubmanticore commented 1 year ago

➤ Sergey Nikolaev commented:

How to implement the solution is another story. Let's first come to an agreement on the design. @klirichek do you have any objections against the proposed design (* to be set explicitly to mean all, otherwise it means 1st index from the list)?