apache / skywalking

APM, Application Performance Monitoring System
https://skywalking.apache.org/
Apache License 2.0

[Feature] Optimize Index Definition for BanyanDB Module in OAP #11497

Closed by hanahmily 1 month ago

hanahmily commented 1 year ago

Description

The current index definition for the BanyanDB module in OAP should be optimized for efficiency and performance. This issue covers two changes: moving several identifiers from indexRule to Entity in Measure, and simplifying the network_address_alias entity.

Details:

  1. Common metrics: move service_id from indexRule to Entity in Measure. The service identifier then lives in the Entity within Measure, which simplifies access to service information.

  2. Relation metrics: move service_id, instance_id, and endpoint_id to Entity. With these identifiers in Entity, retrieval of relation metrics should improve.

  3. network_address_alias: retain only last_update_time_bucket. This removes unnecessary complexity from network_address_alias.

The proposed changes are expected to optimize the index definitions, improve the system's performance, and simplify the overall structure of the BanyanDB module in OAP.
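
Items 1 and 2 above can be pictured with a minimal sketch. It is not the OAP or BanyanDB API: `MeasureSchema`, `promote_to_entity`, the measure names, and the `entity_id` tag are all hypothetical and only illustrate what moving a tag from indexRule to Entity means for a schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MeasureSchema:
    """Hypothetical, simplified stand-in for a measure definition."""
    name: str
    # Tags that form the identity of a series (the "Entity").
    entity: List[str] = field(default_factory=list)
    # Tags covered by a separate index rule.
    index_rules: List[str] = field(default_factory=list)


def promote_to_entity(schema: MeasureSchema, tags: List[str]) -> MeasureSchema:
    """Return a copy of `schema` with `tags` moved from index rules into the entity."""
    entity = list(schema.entity)
    for tag in tags:
        if tag not in entity:
            entity.append(tag)
    index_rules = [t for t in schema.index_rules if t not in tags]
    return MeasureSchema(schema.name, entity, index_rules)


# Item 1: a common metric (hypothetical name) gains service_id in its entity.
common = MeasureSchema(
    "example_common_metric", entity=["entity_id"], index_rules=["service_id"]
)
common_after = promote_to_entity(common, ["service_id"])

# Item 2: a relation metric (hypothetical name) gains all three identifiers.
relation = MeasureSchema(
    "example_relation_metric",
    entity=["entity_id"],
    index_rules=["service_id", "instance_id", "endpoint_id"],
)
relation_after = promote_to_entity(
    relation, ["service_id", "instance_id", "endpoint_id"]
)
```

In this sketch the identifiers end up only in `entity`, and `index_rules` no longer mentions them, which is the shape of the change the issue asks for.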

Use case

No response

Related issues

No response

wu-sheng commented 1 year ago

So this is an OAP-side code change only, and the protocol between OAP and the DB remains stable.

wu-sheng commented 11 months ago

@hanahmily I removed this from 9.7, as it does not seem to be required in that version.

wu-sheng commented 7 months ago

According to the latest discussion, we need the BanyanDB APIs and implementation to accept multiple series definitions rather than one. Each series definition could have multiple fields. This could provide better query support in the following typical scenarios:

  1. Query metrics by ts and entity ID.
  2. Store all query-related fields in one file. Currently, some are in series ID files, and others are in the tag families.

FYI @hanahmily
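
As a rough illustration of the "multiple series definitions" idea and of scenario 1 (query by ts and entity ID), here is a toy in-memory model. It is an assumption-laden sketch, not BanyanDB's API or storage layout: `ToyMeasureStore`, the definition names, and the tag names are hypothetical, and a series definition is assumed to be an ordered list of tag names.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


class ToyMeasureStore:
    """Toy model: one measure, several series definitions (hypothetical)."""

    def __init__(self, series_definitions: Dict[str, Tuple[str, ...]]):
        # Definition name -> ordered tag names that make up the series key.
        self.series_definitions = series_definitions
        # (definition name, tag values) -> list of (ts, fields) data points.
        self._series = defaultdict(list)

    def write(self, ts: int, tags: Dict[str, str], fields: Dict[str, int]) -> None:
        # Each data point is registered under every series definition.
        for name, tag_names in self.series_definitions.items():
            key = (name, tuple(tags[t] for t in tag_names))
            self._series[key].append((ts, dict(fields)))

    def query(
        self, definition: str, tag_values: Sequence[str], start: int, end: int
    ) -> List[Tuple[int, Dict[str, int]]]:
        # Look up one series directly by its key, then filter by [start, end).
        points = self._series.get((definition, tuple(tag_values)), [])
        ordered = sorted(points, key=lambda p: p[0])
        return [(ts, f) for ts, f in ordered if start <= ts < end]


store = ToyMeasureStore({"by_entity": ("entity_id",), "by_service": ("service_id",)})
store.write(100, {"entity_id": "e1", "service_id": "s1"}, {"value": 5})
store.write(160, {"entity_id": "e2", "service_id": "s1"}, {"value": 7})
store.write(220, {"entity_id": "e1", "service_id": "s1"}, {"value": 9})

by_entity = store.query("by_entity", ["e1"], 0, 200)
by_service = store.query("by_service", ["s1"], 0, 200)
```

The point of the sketch is only that a query by time range and entity ID, or by time range and service ID, resolves to a direct series lookup when both are declared as series definitions.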

hanahmily commented 1 month ago

Items 1 and 2 are fixed by removing the indexed data from the data storage. Item 3 has moved to #12638.