Open deathjoin opened 2 years ago
Hey, can we get any updates on this? The bug still persists in 2.4.0.
Extra: it seems like the `*` within the index name breaks the query:
POST /_plugins/_sql
{
"query": "SELECT * FROM test.data-1",
"fetch_size": 10
}
POST /_plugins/_sql
{
"query": "SELECT * FROM test.*",
"fetch_size": 10
}
It's kind of a pain for us to use SQL with the latest updates because we split our indices by date 😢
@deathjoin Sorry for the inconvenience. Just want to confirm, does your use case require pagination (enabled by `fetch_size`)? We're considering migrating pagination support to our engine V2. Could you elaborate on your use case a little bit? Thanks!
Sure. Yes, pagination is required.
We use OpenSearch to store and continuously analyse many events from our product. Python scripts extract the data from OpenSearch via the SQL API, and then Pandas and other tools process it. A query can extract events for different periods of time, like 1 day or 2 weeks, which sometimes leads to 100k or more documents per request, so we need pagination to get them all.
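For readers unfamiliar with the workflow described here, this is a minimal sketch of cursor-based pagination against `POST /_plugins/_sql`. It assumes the response carries `datarows` and, while more pages remain, a `cursor` that is sent back on its own in the next request. The helper names `iter_sql_rows` and `make_post` and the base URL are hypothetical and do not come from the original scripts.

```python
import json
import urllib.request
from typing import Callable, Iterator


def make_post(base_url: str) -> Callable[[dict], dict]:
    """Build a function that POSTs a JSON body to the SQL endpoint.

    base_url is an assumption, e.g. "http://localhost:9200" (no auth handled here).
    """
    def post(body: dict) -> dict:
        req = urllib.request.Request(
            base_url + "/_plugins/_sql",
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return post


def iter_sql_rows(post: Callable[[dict], dict], query: str,
                  fetch_size: int = 1000) -> Iterator[list]:
    """Yield every row of a paginated SQL result, page by page."""
    # The first request carries the query and the page size.
    page = post({"query": query, "fetch_size": fetch_size})
    while True:
        yield from page.get("datarows", [])
        cursor = page.get("cursor")
        if not cursor:
            break  # no cursor means this was the last page
        # Follow-up requests carry only the cursor.
        page = post({"cursor": cursor})
```

A script would then call something like `rows = list(iter_sql_rows(make_post("http://localhost:9200"), "SELECT * FROM test.data-1", fetch_size=10))` and hand `rows` to Pandas.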
Typical index names are `events-2022.11.20`. We manage our indices using ISM policies.
So to get data for a time period we use either `filter.range.timefield.lte/gte` or `WHERE timefield<"time"` inside the query. Usually the query looks like `SELECT fields FROM events-* WHERE timefield<"{date_time}" AND...`. It's the same script running every week, so we don't want to specify full index names like `SELECT fields FROM events-2022.11.10,events-2022.11.11,events-2022.12.... WHERE ...`, which seems weird.
But queries like `SELECT fields FROM events-* WHERE timefield<"{date_time}" AND...` don't work now.
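If the explicit index list has to be used as a stopgap, it can at least be generated instead of written by hand. This is only a sketch: `daily_indices` is a hypothetical helper, the `events-YYYY.MM.DD` naming is taken from the example above, and the thread does not confirm that a comma-separated list works together with `fetch_size` on 2.x.

```python
from datetime import date, timedelta


def daily_indices(prefix: str, start: date, end: date) -> str:
    """Return a comma-separated list of daily index names from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return ",".join(
        prefix + (start + timedelta(days=i)).strftime("%Y.%m.%d")
        for i in range(days + 1)
    )


# Example: build the FROM clause for a three-day window.
indices = daily_indices("events-", date(2022, 11, 10), date(2022, 11, 12))
query = f"SELECT fields FROM {indices} WHERE ..."
```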
We're considering migrate pagination support to our engine V2
@dai-chen Do you have any plans for this feature yet? Any dates? Although it seems more like a bug to me than a missing feature, because it worked well on version 1.x.x :)
`SELECT * FROM test.* LIMIT 100000`. The PR https://github.com/opensearch-project/sql/pull/716 internally uses scroll to pull all the matched docs. It is supported since OpenSearch 2.3.
`LIMIT 100000` only works for queries without aggregation. But it will help us for sure, thank you for the workaround!
I was able to get around this, at least for my purpose, by increasing the `opendistro.query.size_limit`
@deathjoin What is the limit for queries with aggregation? My queries return a maximum size of 1000 even when my `opendistro.query.size_limit` and `opendistro.sql.cursor.fetch_size` are set to 20000
Please track implementation progress in #1759
What is the bug? When querying data with `fetch_size`, the API responds with `Unknown index`.
How can one reproduce the bug? Steps to reproduce the behavior:
What is the expected behavior? Data returned
What is your host/environment?
Do you have any additional context?
Query was working on OpenSearch 1.3.0 and we just updated to 2.3.0
The only way I found to get the query working is to run it without `fetch_size` and other fields like `filter`, but that is not an option for us 😢
docker logs error:
Cluster settings almost untouched
Cluster health
And shards