Closed ansjcy closed 2 months ago
:x: Gradle check result for 346eb169dd2dd31293f6ff5c496cc4dac1f41b46: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 1f3092e9569d92a156e4b3fb9760496e2bd3496d: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Checks if related components are compatible with change fba0556
Incompatible components: [https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/performance-analyzer.git]
Compatible components: [https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/sql.git]
:x: Gradle check result for c9caa80c9bdf5e0f4b103ecf05b080a2a5fefa4c: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Thanks @ansjcy for working on it. The whole code should be part of a plugin and not the server module. Is there anything you depend on here which isn't pluggable?
Thanks @ansjcy for working on it. The whole code should be part of a plugin and not the server module. Is there anything you depend on here which isn't pluggable?
Hey @rishabhmaurya ! Yes, this whole code will eventually be moved to a QueryInsight
plugin . The blocker right now is https://github.com/opensearch-project/OpenSearch/issues/11520.
I'm working on resolving that blocker right now in https://github.com/opensearch-project/OpenSearch/pull/11526
:x: Gradle check result for 0c64c4225e219b7706f5ca277da79bbd1f04aaee: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for b082ae1b9ff25bec3c2afe25787785d8687370d7: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for df64213f8694174fcc2214d46d0984552c324ea4: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 03a5166bccf6ba0c63d295b03fa6ba8cd10cf5e4: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 4d1cdc22d56e6fdf05196707e8180cac6836443c: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for ae0fd5d90b5ea38383e9dac2c1e3ce073ab284bd: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for a19c7896725ab384e082f71d5037727f0072436c: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for b2d864733230831ece8d0a09e7489f1591e24b77: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for ab2453bff0c4c95ab84a4e8c2645b78b41c0469f: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 93f63b36bb5d639529cb9d15799cfaf1d2bbb4d8: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for d69ca78c652ab6c0f520e55e49bd1aef48977203: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Had to submit partial review since I got a notification that the changes I am reviewing are stale (looks like you just added another commit). Still reviewing... cc @ansjcy
:white_check_mark: Gradle check result for 1f7af1a3de372ebe14632612bb674a36d285a58c: SUCCESS
Attention: Patch coverage is 73.57513%
with 102 lines
in your changes are missing coverage. Please review.
Project coverage is 71.39%. Comparing base (
a8dd6a0
) to head (fba0556
). Report is 96 commits behind head on main.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Awesome work @ansjcy and thanks for adding the exhaustive unit and integration tests. This was a really long PR to review. Excited to have this merged and in use 👍
Looks like the code coverage is failing for the tests. We might need to add some more tests to fix this?
:x: Gradle check result for 9480e7d89b19b5a3c1cdad4351d9443d9ef24d57: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for d60f9b64b0f0b128120ff360a8d849329e2f8d53: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for f1751799f47b87b6837189be4e01b91e15befd47: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Lets have a max/min allowed time for the window (1 minute to 1 hour based on our discussion in the meeting)?
If no type is provided for GET Top queries api, should we aim to provide topN for all types (Latency, CPU, etc) rather than defaulting to latency? Or for now we can default to latency since its the only one we have and tweak this in the long term when we add more topN?
@ansjcy - Maybe I am jumping in too late, but this single PR is doing many things. May I suggest breaking this into more reviewable chunks? I could see 3 major components:
The settings for window size and number of queries can go into either 2 or 3 PR.
If no type is provided for GET Top queries api, should we aim to provide topN for all types (Latency, CPU, etc) rather than defaulting to latency? Or for now we can default to latency since its the only one we have and tweak this in the long term when we add more topN?
We already have the default type as Latency defined in the current code. For the first phase, let's not change the scope and aim at Latency
as the first metric.
@ansjcy - Maybe I am jumping in too late, but this single PR is doing many things. May I suggest breaking this into more reviewable chunks? I could see 3 major components:
- Introduction of Query Insights plugin, along with the service
- Functionality of Top N latency tracking, along with the service
- API for viewing the top queries
The settings for window size and number of queries can go into either 2 or 3 PR.
Thanks for the feedback! Logically the Top N Latency Tracking service should go together with the API definations since they are tightly correlated. Let me see if I can break this PR into 2 to make it more reviewable.
:x: Gradle check result for b0fd5d2268ac75f016beadb6cb601b19dd3ae617: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for e70009d430d2b04ddf5165c5042b3e860e467e1a: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 9ec5266013b99f40ce39fa0e16ab5dfcebf7ac09: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:white_check_mark: Gradle check result for 1d68aab49ffcb22db3166b36721ad40f9518dfd7: SUCCESS
:x: Gradle check result for 620560f4f7e70fe0295e8f96eb9b7ce841342cc8: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for bb90f8afb928e067e004f9060a1264e0efec601f: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for cbccbc2d50787c5e8884f0ab4ade3cc61d8830d9: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 2784b679c177750d1dc67cc39a7739761791804e: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for e2c15ba955572e24839ff8dc868a337ce511615c: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 33fbda76f686f6bdcb8ddd90bfa707ea7e984957: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for a628089a55a7ff41330c741f6b23e8e01bbcf21b: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for b33ef50a97dcb3e1e460000dced77d085c8ade26: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for ddec7a2e8f0a5b2496221e7a2430406cc0400bce: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:grey_exclamation: Gradle check result for 5c4695575ebb5a6f98ad786013d9706e3f3f7cba: UNSTABLE
1 org.opensearch.search.SearchWeightedRoutingIT.testShardRoutingWithNetworkDisruption_FailOpenEnabled
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
:white_check_mark: Gradle check result for 46ce6e8434b36f60338fbde7c30a72ea82b7b386: SUCCESS
:white_check_mark: Gradle check result for 2cf290c84844def16d190dafb0648201700eadf2: SUCCESS
:x: Gradle check result for a2bbedf1022ba9e04d1a0a6c2f1f7ad209c242a6: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for c590dd53057e9c03d467567c71ab6b51a94feaae: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 80c3c570725f9089dc288d2560b74d30be9872b8: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for aa3cb94e7ddecae09b878e96e9146b024deba957: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 2e13df6cc44ec339ed8f3c86fe061abc5b08a776: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Description
(parent RFC: https://github.com/opensearch-project/OpenSearch/issues/11186) This PR implements the basic query insight framework and the "top N queries by latency" feature using this generic framework. More specifically, this PR includes:
How to use the API:
First enable the top N queries insight feature
Insert documents for searching
Do some search operations
Get top N queries by latency in the last 1 minute
returns
Load Tests
~70 Load tests are performed using the
nyc_taxis
workload on different combinations of window sizes and top n values. No performance impact identified. Here are detailed benchmark results.Feature off (Baseline)
n=10, window size = 10 minutes
n=50, window size = 10 minutes
n=100, window size = 10 minutes
n=10, window size = 60 minutes
n=50, window size = 60 minutes
n=100, window size = 60 minutes
Related Issues
Resolves #11295 #11296
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.