turbot / steampipe-plugin-oci

Use SQL to instantly query Oracle Cloud resources across regions and accounts. Open source CLI. No DB required.
https://hub.steampipe.io/plugins/turbot/oci
Apache License 2.0
18 stars 5 forks source link

When executing more complex or advanced SELECT queries, Steampipe breaks. #617

Closed charles-josiah closed 3 months ago

charles-josiah commented 5 months ago

Describe the bug When executing more complex or advanced SELECT queries, Steampipe breaks.

Errors:

2024-06-05 13:30:41.205 UTC [WARN]  steampipe-plugin-oci.plugin: [WARN]  1717594209820: QueryData StreamError Get "https://iaas.sa-santiago-1.oraclecloud.com/20160918/instances?compartmentId=ocid1.compartment.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq&limit=1000": dial tcp 140.204.86.139:443: i/o timeout (oci-1717594186947)

2024-06-05 13:30:39.580 UTC [WARN]  steampipe-plugin-oci.plugin: [WARN]  1717594209820: QueryData StreamError Get "https://iaas.eu-zurich-1.oraclecloud.com/20160918/instances?compartmentId=ocid1.compartment.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq&limit=1000": dial tcp 140.204.16.201:443: i/o timeout (oci-1717594197144)

2024-06-05 13:30:39.581 UTC [WARN]  steampipe-plugin-oci.plugin: [WARN]  1717594209820: QueryData StreamError Get "https://iaas.ap-singapore-1.oraclecloud.com/20160918/instances?compartmentId=ocid1.compartment.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq&limit=1000": dial tcp 140.204.54.145:443: i/o timeout (oci-1717594198702)

2024-06-05 13:30:50.683 UTC [WARN]  steampipe-plugin-oci.plugin: [WARN]  1717594209820: callHydrateWithRetries returned error Get "https://iaas.ap-mumbai-1.oraclecloud.com/20160918/instances?compartmentId=ocid1.compartment.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq&limit=1000": dial tcp 140.204.4.160:443: i/o timeout

2024-06-05 13:30:50.683 UTC [WARN]  steampipe-plugin-oci.plugin: [WARN]  1717594209820: QueryData StreamError Get "https://iaas.ap-mumbai-1.oraclecloud.com/20160918/instances?compartmentId=ocid1.compartment.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq&limit=1000": dial tcp 140.204.4.160:443: i/o timeout (oci-17175941932423427144)

Error: oci: Get "https://iaas.eu-zurich-1.oraclecloud.com/20160918/instances?compartmentId=ocid1.compartment.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq&limit=1000": dial tcp: lookup iaas.eu-zurich-1.oraclecloud.com on 127.0.0.53:53: read udp 127.0.0.1:40737->127.0.0.53:53: i/o timeout (SQLSTATE HV000)

Steampipe version (steampipe -v) Steampipe v0.23.2

Plugin version (steampipe plugin list) hub.steampipe.io/plugins/turbot/aws@latest hub.steampipe.io/plugins/turbot/exec@latest hub.steampipe.io/plugins/turbot/net@latest hub.steampipe.io/plugins/turbot/oci@latest hub.steampipe.io/plugins/turbot/steampipe@latest

To reproduce

follow some reference sql:

select
  count(*) as value,
  'Unattached' as label,
  case
    count(*)
    when 0 then 'ok'
    else 'alert'
  end as type
from
  oci_core_volume
where
  id not in (
    select
      volume_id
    from
      oci_core_volume_attachment
  )
  and lifecycle_state <> 'TERMINATED';
select
  v.display_name as "Name",
  a.lifecycle_state as "Attachment Status",
  i.display_name as "Instance Name",
  t.title as "Tenancy",
  coalesce(c.title, 'root') as "Compartment",
  v.region as "Region",
  v.id as "OCID"
from
  oci_core_volume as v
  left join oci_core_volume_attachment as a on a.volume_id = v.id
  left join oci_core_instance as i on a.instance_id = i.id
  left join oci_identity_compartment as c on v.compartment_id = c.id
  left join oci_identity_tenancy as t on v.tenant_id = t.id
where
  v.lifecycle_state <> 'TERMINATED'
order by
  v.display_name;

Expected behavior SQL should execute without issues...

Additional context I tried other operating systems, including Windows, WSL, and Linux, but encountered the same issue. I also attempted to add some rate limits, but that was unsuccessful as well.

cat .steampipe/config/oci.spc

connection "oci" {
   plugin           = "oci"
   tenancy_ocid     = "ocid1.tenancy.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq"
   user_ocid        = "ocid1.user.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq"
   fingerprint      = "56:04:73:ad:2f:49:fc:0d:1d:4c:e5:7b:41:25:63:0b"
   private_key_path = "/home/myuser/.ssh/oracle_priv.pem"
   max_error_retry_attempts  =  12
   min_error_retry_delay = 35
   regions          = ["eu-amsterdam-1","ap-mumbai-1","me-dubai-1","eu-frankfurt-1","sa-saopaulo-1","ap-hyderabad-1","us-ashburn-1","me-jeddah-1","af-johannesburg-1","uk-london-1","ap-melbourne-1","us-chicago-1","us-phoenix-1","sa-santiago-1","ap-singapore-1","ap-sydney-1","eu-zurich-1"]
}
ParthaI commented 5 months ago

Hello @charles-josiah, I was having a look at this issue. I could not reproduce the error by running the query mentioned above. All the time I am getting the result as expected.

> select
  count(*) as value,
  'Unattached' as label,
  case
    count(*)
    when 0 then 'ok'
    else 'alert'
  end as type
from
  oci_core_volume
where
  id not in (
    select
      volume_id
    from
      oci_core_volume_attachment
  )
  and lifecycle_state <> 'TERMINATED';
+-------+------------+------+
| value | label      | type |
+-------+------------+------+
| 0     | Unattached | ok   |
+-------+------------+------+

Time: 3.1s. Rows returned: 1. Rows fetched: 3. Hydrate calls: 1. Scans: 2.
> select
  v.display_name as "Name",
  a.lifecycle_state as "Attachment Status",
  i.display_name as "Instance Name",
  t.title as "Tenancy",
  coalesce(c.title, 'root') as "Compartment",
  v.region as "Region",
  v.id as "OCID"
from
  oci_core_volume as v
  left join oci_core_volume_attachment as a on a.volume_id = v.id
  left join oci_core_instance as i on a.instance_id = i.id
  left join oci_identity_compartment as c on v.compartment_id = c.id
  left join oci_identity_tenancy as t on v.tenant_id = t.id
where
  v.lifecycle_state <> 'TERMINATED'
order by
  v.display_name;
+--------------------------------------------------------------+-------------------+------------------------+---------+-------------------------+-------------+-------------------------------------------------------------------------------------------+
| Name                                                         | Attachment Status | Instance Name          | Tenancy | Compartment             | Region      | OCID                                                                                      |
+--------------------------------------------------------------+-------------------+------------------------+---------+-------------------------+-------------+-------------------------------------------------------------------------------------------+
| abrg6ljrrjvtsj27e2vv2hg4hdnibhef66q535ai6b37mja2q45yh27qf6iq | ATTACHED          | instance20240606131518 | turbot  | test-compartment-kp-sub | ap-mumbai-1 | ocid1.volume.oc1.ap-mumbai-1.abrg6ljrrjvtsj27e2vv2hg4hdnibhef66q535ai6b37mja2q45yh27qf6iq |
+--------------------------------------------------------------+-------------------+------------------------+---------+-------------------------+-------------+-------------------------------------------------------------------------------------------+

Time: 4.0s. Rows returned: 1. Rows fetched: 11 (1 cached). Hydrate calls: 2. Scans: 5.

BTW: I am getting the NotAuthenticated/Forbidden error for the region where I don't have access or region not enabled.

Error: oci: Error returned by Blockstorage Service. Http Status Code: 401. Error Code: NotAuthenticated. Opc request id: 687f63641b8e90559fcd1ff659563d46/609ABF8730905061C462B8524A6E1059/58CC992D4ABBFB339F5EFF7A91803771. Message: The required information to complete authentication was not provided or was incorrect.
Operation Name: ListVolumes
Timestamp: 2024-06-06 14:21:32 +0000 GMT
Client Version: Oracle-GoSDK/65.28.0
Request Endpoint: GET https://iaas.sa-santiago-1.oraclecloud.com/20160918/volumes?compartmentId=ocid1.compartment.oc1..********llx724ekobolte***********yg6eqh4kmelr7mqq******&limit=1000
Troubleshooting Tips: See https://docs.oracle.com/iaas/Content/API/References/apierrors.htm#apierrors_401__401_notauthenticated for more information about resolving this error.
Also see https://docs.oracle.com/iaas/api/#/en/iaas/20160918/Volume/ListVolumes for details on this operation's requirements.
To get more info on the failing request, you can set OCI_GO_SDK_DEBUG env var to info or higher level to log the request/response details.
If you are unable to resolve this Blockstorage issue, please contact Oracle support and provide them this full error message.

The error you are experiencing (Error: oci: Get "https://iaas.eu-zurich-1.oraclecloud.com/20160918/instances?compartmentId=ocid1.compartment.oc1..aaasdjasdkljaslkdjalksjdsdfsdfsdf5jiq&limit=1000": dial tcp: lookup iaas.eu-zurich-1.oraclecloud.com on 127.0.0.53:53: read udp 127.0.0.1:40737->127.0.0.53:53: i/o timeout (SQLSTATE HV000) ) might be due to the following issues:

Could you please confirm the CLI command(oci compute instance list --compartment-id <your_compartment_id> --region eu-zurich-1 ) works fine for the region eu-zurich-1 and with the same set of credential profiles?

Thanks!

charles-josiah commented 5 months ago

DNS and networking are working correctly because when I run a local OCI-CLI command I get the expected results. To be sure, I tested it on three different computers using Windows and Linux operating systems.

I don't have any instances in eu-zurich-1.

 export OCI_CLI_REGION=eu-zurich-1
oci search resource structured-search --query-text "QUERY instance resources"
{
  "data": {
    "items": []
  }
}

Another sample using steampipe x oci-cli:

~$ steampipe query
Welcome to Steampipe v0.23.2
For more information, type .help
> select
  v.display_name as "Name",
  a.lifecycle_state as "Attachment Status",
  i.display_name as "Instance Name",
  t.title as "Tenancy",
  coalesce(c.title, 'root') as "Compartment",
  v.region as "Region",
  v.id as "OCID"
from
  oci_core_volume as v
  left join oci_core_volume_attachment as a on a.volume_id = v.id
  left join oci_core_instance as i on a.instance_id = i.id
  left join oci_identity_compartment as c on v.compartment_id = c.id
  left join oci_identity_tenancy as t on v.tenant_id = t.id
where
  v.lifecycle_state <> 'TERMINATED'
order by
  v.display_name;

Error: oci: Get "https://iaas.us-phoenix-1.oraclecloud.com/20160918/instances?compartmentId=ocid1.compartment.oc1..aklajsdlkajsdlkjalskdj&limit=1000": dial tcp: lookup iaas.us-phoenix-1.oraclecloud.com on 127.0.0.53:53: read udp 127.0.0.1:50161->127.0.0.53:53: i/o timeout (SQLSTATE HV000)

+------+-------------------+---------------+---------+-------------+--------+------+
| Name | Attachment Status | Instance Name | Tenancy | Compartment | Region | OCID |
+------+-------------------+---------------+---------+-------------+--------+------+
+------+-------------------+---------------+---------+-------------+--------+------+
> quit
Error: syntax error at or near "quit" (SQLSTATE 42601)
>

> select
  count(*) as value,
  'Unattached' as label,
  case
    count(*)
    when 0 then 'ok'
    else 'alert'
  end as type
from
  oci_core_volume
where
  id not in (
    select
      volume_id
    from
      oci_core_volume_attachment
  )
  and lifecycle_state <> 'TERMINATED';

Error: oci: Get "https://iaas.sa-santiago-1.oraclecloud.com/20160918/volumeAttachments?compartmentId=ocid1.compartment.oc1..sdfsdfsdfsdfsdfsdfsdfsdf&limit=1000": dial tcp: lookup iaas.sa-santiago-1.oraclecloud.com on 127.0.0.53:53: read udp 127.0.0.1:40638->127.0.0.53:53: i/o timeout (SQLSTATE HV000)

+-------+-------+------+
| value | label | type |
+-------+-------+------+
+-------+-------+------+
>

$
$ echo "using OCI-CLI"
$ oci search resource structured-search --query-text "QUERY instance resources"  --query 'data.items[*].identifier' --raw-output | jq -r '.[]' | wc -l
500

P.S.: I manage a large environment with over 4K Linux and Windows servers. In this environment, I have a significant number of disks.

ParthaI commented 5 months ago

Sure! Here's the revised sentence:


Hi @charles-josiah,

Attaching the Slack conversation here for reference: Slack Link.

The issue might be related to the large amount of data. I found a document that mentions the API limits:

In our case, it might be due to Bulk APIs or Payload size. According to the Slack conversation, the Steampipe rate limiter configuration should work as expected, but it doesn't seem to be working for you, right?

plugin "oci" {
  limiter "oci_global_concurrency" {
    max_concurrency = 20
  }
}

I have a couple of questions:

Could you please try the following query and let me know if it helps?

  1. Count Unattaached volumes
with volumes as (
  select
    id,
    lifecycle_state
  from
    oci_core_volume
),
volume_attachments as (
  select
    volume_id
  from
    oci_core_volume_attachment
),
unattached_volumes as (
  select
    id
  from
    volumes
  where
    id not in (select volume_id from volume_attachments)
    and lifecycle_state <> 'TERMINATED'
),
volume_count as (
  select
    count(*) as value
  from
    unattached_volumes
)
select
  value,
  'Unattached' as label,
  case
    value
    when 0 then 'ok'
    else 'alert'
  end as type
from
  volume_count;
  1. Instance attachment status
with volumes as (
  select
    id,
    display_name,
    lifecycle_state,
    compartment_id,
    tenant_id,
    region
  from
    oci_core_volume
),
volume_attachments as (
  select
    volume_id,
    lifecycle_state,
    instance_id
  from
    oci_core_volume_attachment
),
instances as (
  select
    id,
    display_name
  from
    oci_core_instance
),
compartments as (
  select
    id,
    title
  from
    oci_identity_compartment
),
tenancies as (
  select
    id,
    title
  from
    oci_identity_tenancy
)
select
  v.display_name as "Name",
  a.lifecycle_state as "Attachment Status",
  i.display_name as "Instance Name",
  t.title as "Tenancy",
  coalesce(c.title, 'root') as "Compartment",
  v.region as "Region",
  v.id as "OCID"
from
  volumes v
  left join volume_attachments a on a.volume_id = v.id
  left join instances i on a.instance_id = i.id
  left join compartments c on v.compartment_id = c.id
  left join tenancies t on v.tenant_id = t.id
where
  v.lifecycle_state <> 'TERMINATED'
order by
  v.display_name;
charles-josiah commented 5 months ago

Hi, about:

In our case, it might be due to Bulk APIs or Payload size. According to the Slack conversation, the Steampipe rate limiter configuration should work as expected, but it doesn't seem to be working for you, right?

plugin "oci" {
  limiter "oci_global_concurrency" {
    max_concurrency = 20
  }
}

I have a couple of questions:

Yes, it works without issues, as the API handles one query at a time. However, when I use Python, I need to limit the result to 100 lines. If the result exceeds this limit, it breaks.

If I could implement some kind of rate limiting, I would. I tested using the limit and region tags, but the situation remained the same. I believe that if some form of rate limiting could be applied, it would be very beneficial.

Result of sqls:

1 - Count Unattaached volumes

> with volumes as (
  select
    id,
    lifecycle_state
  from
    oci_core_volume
),
volume_attachments as (
  select
    volume_id
  from
    oci_core_volume_attachment
),
unattached_volumes as (
  select
    id
  from
    volumes
  where
    id not in (select volume_id from volume_attachments)
    and lifecycle_state <> 'TERMINATED'
),
volume_count as (
  select
    count(*) as value
  from
    unattached_volumes
)
select
  value,
  'Unattached' as label,
  case
    value
    when 0 then 'ok'
    else 'alert'
  end as type
from
  volume_count;

Error: oci: Get "https://iaas.us-ashburn-1.oraclecloud.com/20160918/volumeAttachments?compartmentId=ocid1.compartment.oc1..sdçlkfsdflçsdkfçlsdkfsdfsdfnsdf&limit=1000": dial tcp: lookup iaas.us-ashburn-1.oraclecloud.com on 127.0.0.53:53: read udp 127.0.0.1:35939->127.0.0.53:53: i/o timeout (SQLSTATE HV000)

+-------+-------+------+
| value | label | type |
+-------+-------+------+
+-------+-------+------+
>

2 - Instance attachment status

> with volumes as (
  select
    id,
    display_name,
    lifecycle_state,
> with volumes as (
  select
    id,
    display_name,
    lifecycle_state,
    compartment_id,
    tenant_id,
> with volumes as (
  select
    id,
    display_name,
    lifecycle_state,
    compartment_id,
    tenant_id,
    region
  from
    oci_core_volume
),
volume_attachments as (
> with volumes as (
  select
    id,
    display_name,
    lifecycle_state,
    compartment_id,
    tenant_id,
    region
  from
    oci_core_volume
),
volume_attachments as (
  select
    volume_id,
    lifecycle_state,
    instance_id
  from
    oci_core_volume_attachment
),
instances as (
  select
    id,
    display_name
  from
    oci_core_instance
),
compartments as (
  select
    id,
    title
  from
    oci_identity_compartment
),
tenancies as (
  select
    id,
    title
  from
    oci_identity_tenancy
)
select
  v.display_name as "Name",
  a.lifecycle_state as "Attachment Status",
  i.display_name as "Instance Name",
  t.title as "Tenancy",
  coalesce(c.title, 'root') as "Compartment",
  v.region as "Region",
  v.id as "OCID"
from
  volumes v
  left join volume_attachments a on a.volume_id = v.id
  left join instances i on a.instance_id = i.id
  left join compartments c on v.compartment_id = c.id
  left join tenancies t on v.tenant_id = t.id
where
  v.lifecycle_state <> 'TERMINATED'
order by
  v.display_name;

Error: oci: Get "https://iaas.me-dubai-1.oraclecloud.com/20160918/volumeAttachments?compartmentId=ocid1.compartment.oc1..sdfsdfsdjfskdfjksdfsldfkjsdfm&limit=1000": dial tcp: lookup iaas.me-dubai-1.oraclecloud.com on 127.0.0.53:53: read udp 127.0.0.1:37771->127.0.0.53:53: i/o timeout (SQLSTATE HV000)

+------+-------------------+---------------+---------+-------------+--------+------+
| Name | Attachment Status | Instance Name | Tenancy | Compartment | Region | OCID |
+------+-------------------+---------------+---------+-------------+--------+------+
+------+-------------------+---------------+---------+-------------+--------+------+
>

I'm here to help. If you need more tests or logs, please let me know.

ParthaI commented 5 months ago

Yes, it works without issues, as the API handles one query at a time.

Do both CLI and Steampipe work fine, correct?

However, when I use Python, I need to limit the result to 100 lines. If the result exceeds this limit, it breaks.

Note: Steampipe makes API calls based on the parallel numbers of:

So, the query plan matters in Steampipe.

If I could implement some kind of rate limiting, I would. I tested using the limit and region tags, but the situation remained the same. I believe that if some form of rate limiting could be applied, it would be very beneficial.

Could you please expand more about However, when I use Python, I need to limit the result to 100 lines. If the result exceeds this limit, it breaks.? So that we can manage the same scenarios in Steampipe for a larger environment as you have?

Thanks!

charles-josiah commented 5 months ago

A lot of questions :D

> select * from steampipe_plugin_limiter
+------------------------------+---------------------------------------------+-----------------+-------------+--------+>
| name                         | plugin                                      | plugin_instance | source_type | status |>
+------------------------------+---------------------------------------------+-----------------+-------------+--------+>
| exec_max_concurrency_limiter | hub.steampipe.io/plugins/turbot/exec@latest | exec            | config      | active |>
| oci_global_concurrency       | hub.steampipe.io/plugins/turbot/oci@latest  | oci             | config      | active |>
+------------------------------+---------------------------------------------+-----------------+-------------+--------+>
>
ParthaI commented 5 months ago

Hello @charles-josiah, Thank you for the information, in your case, the rate-limiter config is not working as expected. BTW the configuration looks good to me.

Additionally, would you be so kind as to provide further explanation on the following points mentioned in our previous conversation?

Could you please expand more about However, when I use Python, I need to limit the result to 100 lines. If the result exceeds this limit, it breaks.? So that we can manage the same scenarios in Steampipe for a larger environment as you have?

How are you managing the API calls in your Python script? Is it making the same bulk API calls as Steampipe does? For a single API call, we have a maximum limit set per page based on what the API supports in Steampipe. Let's expand our discussion to focus on a particular query.

select
  v.display_name as "Name",
  a.lifecycle_state as "Attachment Status",
  i.display_name as "Instance Name",
  t.title as "Tenancy",
  coalesce(c.title, 'root') as "Compartment",
  v.region as "Region",
  v.id as "OCID"
from
  oci_core_volume as v
  left join oci_core_volume_attachment as a on a.volume_id = v.id
  left join oci_core_instance as i on a.instance_id = i.id
  left join oci_identity_compartment as c on v.compartment_id = c.id
  left join oci_identity_tenancy as t on v.tenant_id = t.id
where
  v.lifecycle_state <> 'TERMINATED'
order by
  v.display_name;

Thanks!

misraved commented 3 months ago

Hi @charles-josiah,

I hope all is well with you. I'm writing to follow up on this issue. Since we haven't received any updates or feedback from you in a while, we'll be closing this issue for now.

If you still need assistance or if you have any further questions, please feel free to reopen the issue or create a new one. We're always here to help!

Thank you for your understanding.