Open markomilivojevic opened 3 years ago
Please elaborate more on why it's important. Can't you just calculate count of $result->getFacets()
or did I get you wrong on what you mean by "total count of facets" ?
After experimenting with manticore json API, I realised that the issue belongs to manticoresearch
core library rather than php client.
To clarify issue:
jobs
can have 10M records, and can have 10k distinct employers.limit
and offset
, as described above.The code above will hit /json/search
endpoint, and it will produce two queries on manticore side:
SELECT * FROM jobs WHERE MATCH('{"bool":{"must":[{"match":{"employer_name":"amazon"}}]}}') LIMIT 0,0
SELECT employer_id FROM jobs WHERE MATCH('{"bool":{"must":[{"match":{"employer_name":"amazon"}}]}}') GROUP BY employer_id WITHIN GROUP ORDER BY weight() desc ORDER BY @groupby desc LIMIT 0,30
And behind the scene, it executes show meta
only for the first query, in order to get total number of found rows, which is useless in my case. What I need is show meta
from the second query which will return total number of found employers.
To answer your question: $result->getFacets()
returns limited number of facets, I set in variable $per_page
. Total count of facets needs to be returned by /json/search
endpoint.
OK, I think I see your point. Please review and confirm I understood it right:
So the minimal reproducible example is:
mysql -P9306 -h0 -e "drop table if exists t; create table t(f text, a int); insert into t(a) values(1),(2),(1)";
No "total" returned in "aggregations":
curl -sX POST http://localhost:9308/search -d '{"index":"t","limit":0,"offset":0,"aggs":{"a":{"terms":{"field":"a","size":10}}}}'|jq .
{
"took": 0,
"timed_out": false,
"hits": {
"total": 3,
"hits": []
},
"aggregations": {
"a": {
"buckets": [
{
"key": 2,
"doc_count": 1
},
{
"key": 1,
"doc_count": 2
}
]
}
}
}
(should be 2 in this case).
and when the number of facets is limited it's impossible to know how many can be returned in total:
curl -sX POST http://localhost:9308/search -d '{"index":"t","limit":0,"offset":0,"aggs":{"a":{"terms":{"field":"a","size":1}}}}'|jq .
{
"took": 0,
"timed_out": false,
"hits": {
"total": 3,
"hits": []
},
"aggregations": {
"a": {
"buckets": [
{
"key": 2,
"doc_count": 1
}
]
}
}
}
In SQL it can be seen in SHOW META
after select ... group by
:
mysql> select * from t group by a;
+---------------------+------+------+
| id | a | f |
+---------------------+------+------+
| 1514488646869188929 | 1 | |
| 1514488646869188930 | 2 | |
+---------------------+------+------+
2 rows in set (0.00 sec)
mysql> show meta;
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| total | 2 |
| total_found | 2 |
| time | 0.000 |
+---------------+-------+
3 rows in set (0.00 sec)
Since the JSON protocol doesn't return it, the php client can't return it too.
Yes, you understood it correctly.
The library doesn't have a method to get total count of facets.
Here is the code example, imagine an index
jobs
with columnsemployer_id
andemployer_name
:So, adding a method similar to this one could solve the issue: