opensearch-project / opensearch-spark

Spark Accelerator framework; it enables secondary indices to remote data stores.
Apache License 2.0

[BUG] The same log explorer used for Security Lake data source can’t query MV/CI indices #510

Open A-Gray-Cat opened 4 months ago

A-Gray-Cat commented 4 months ago

**What is the bug?**
After an MV/CI index is created using the securitylake log explorer, querying the created index from the same explorer returns an "index not found" error for both SQL and PPL.

**How can one reproduce the bug?**
Steps to reproduce the behavior:

  1. Go to Data sources -> securitylake -> query data -> open log explorer
  2. Create a materialized view and refresh it:

        CREATE MATERIALIZED VIEW last_1day_ct_2024_07_31_mv AS
        SELECT time_dt,
            actor.user.uid AS requestor_arn,
            accountid AS account_id,
            region AS region,
            src_endpoint.ip AS source_ip,
            api.service.name AS service,
            api.operation AS api_operation,
            api.request.data AS request_parameters,
            api.response.data AS response_elements,
            api.response.error AS error,
            api.response.message AS response_message,
            http_request.user_agent AS user_agent
        FROM securitylake.amazon_security_lake_glue_db_us_east_1.amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0
        WHERE time_dt BETWEEN CURRENT_TIMESTAMP - INTERVAL '1' DAY AND CURRENT_TIMESTAMP
        WITH (auto_refresh = false)

        REFRESH MATERIALIZED VIEW last_1day_ct_2024_07_31_mv

  3. Run any query against the Flint index you just created, e.g.:

        SELECT * FROM flint_securitylake_default_last_1day_ct_2024_07_31_mv

  4. It will error out and say index not found.

**What is the expected behavior?**
Query results are returned.

**What is your host/environment?**
 - OS: [e.g. iOS]
 - Version 2.13
 - Plugins

dblock commented 3 months ago

Catch All Triage - 1, 2, 3

dai-chen commented 3 months ago

A CI (covering index) is applied automatically when you query the source table, as discussed in https://github.com/opensearch-project/opensearch-spark/issues/298.
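
To illustrate the covering-index case, here is a minimal sketch, assuming Flint's `CREATE INDEX ... ON ...` and `REFRESH INDEX ... ON ...` statements. The index name `ct_region_idx` and the choice of columns are hypothetical and not taken from the report; the table name and the time filter are the ones from the reproduction steps. The final query targets the source table rather than a `flint_*` index name.

```sql
-- Run each statement separately in the log explorer.

-- Hypothetical covering index over three top-level columns of the source table.
CREATE INDEX ct_region_idx
ON securitylake.amazon_security_lake_glue_db_us_east_1.amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0
(time_dt, accountid, region)
WITH (auto_refresh = false)

-- Manual refresh, since auto_refresh is disabled above.
REFRESH INDEX ct_region_idx
ON securitylake.amazon_security_lake_glue_db_us_east_1.amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0

-- Query the source table directly; the flint_* index name is not referenced.
SELECT time_dt, accountid, region
FROM securitylake.amazon_security_lake_glue_db_us_east_1.amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0
WHERE time_dt BETWEEN CURRENT_TIMESTAMP - INTERVAL '1' DAY AND CURRENT_TIMESTAMP
```

Whether the index is actually used likely depends on the optimizer settings and on the index containing every column the query references, so treat this as an illustration rather than a guaranteed rewrite. It does not address the materialized view case from the report.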