StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident response, troubleshooting, deployments, and more, for DevOps and SREs. It includes a rules engine, workflows, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org), and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0

v2.3 Running the "st2 key list" command returns 100 entries vs all #3708

Closed theuiz closed 6 years ago

theuiz commented 6 years ago

In v2.3 when running the "st2 key list" command, the result returns 100 entries vs all. The following was run in v2.1:

st2 key list -t xxxxxxx -j|grep "name"|wc -l
124

The following was run in v2.3:

st2 key list -t xxxxxxx -j|grep "name"|wc -l
100
Kami commented 6 years ago

Thanks for reporting this.

We made some changes to that command in v2.4.0 - https://github.com/StackStorm/st2/pull/3641 (print a note if there are more items and return 50 items by default).

You can change the limit (number of items returned at once) by passing -n argument to the command, but by default, the maximum value for limit is 100 items per page (this value can be increased in the st2.conf config).

@humblearner Because the st2 key list command is also used to export all the values to back them up or similar, we should modify it to also handle that scenario (return all the items, ignoring the limit) - https://docs.stackstorm.com/datastore.html?highlight=datastore.

We already have the st2 key load command, so I would propose adding an st2 key export command which does that, and leaving st2 key list as it is now (maybe just bump the default limit back to 100, but that's it).
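The limit rules described above (a default page size when no -n flag is given, and a configurable server-side maximum) can be sketched as follows. This is a hypothetical illustration, not StackStorm's actual code; the names effective_limit, DEFAULT_LIMIT and MAX_PAGE_SIZE are made up for the example:

```python
# Hypothetical sketch of the limit rules described above: a default of
# 50 items when no -n flag is passed, and a configurable ceiling
# (max_page_size in st2.conf) that any explicit -n value is clamped to.

DEFAULT_LIMIT = 50    # items returned when -n is not passed
MAX_PAGE_SIZE = 100   # configurable upper bound (st2.conf)

def effective_limit(requested=None,
                    default=DEFAULT_LIMIT,
                    max_page_size=MAX_PAGE_SIZE):
    """Number of items a single list call would return."""
    if requested is None:
        return default
    # An explicit limit above the ceiling is clamped down to it.
    return min(requested, max_page_size)

print(effective_limit())      # 50  (no -n flag)
print(effective_limit(75))    # 75  (-n 75)
print(effective_limit(500))   # 100 (-n 500, clamped to the ceiling)
```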

humblearner commented 6 years ago

Please try:

$ st2 key list -j -n 0 | grep "name" | wc -l
210

In your case (in v2.3), you see 100 for the default key list because of the limit defined in the API spec: https://github.com/StackStorm/st2/blob/389ec41cfc580a1c1d798358d0133517fbb3fbed/st2common/st2common/openapi.yaml#L1220

Summary of changes in v2.4:

  1. For all the list CLI commands (rule, execution, triggerinstance, trace and rule_enforcement), the default limit is 50 (without the -n flag) and the max limit is 100 (with the flag -n 0). To increase the max limit globally, you have to set max_page_size in st2.conf.

  2. After the PR with the key list note (https://github.com/StackStorm/st2/pull/3641), the default limit for st2 key list is 50.

    $ st2 key list -j| grep "name" | wc -l 
    50

    However with flag -n 0, it gets all the key value pairs (more than 100):

    $ st2 key list -n 0 | grep "st2kv.system" | wc -l
    210
  3. Also, the note is not printed with the json/yaml flags (-j or -y).

Hope this helps!
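The "-n 0 returns everything" behavior (and what a full key export would need) can be illustrated with a client-side pagination loop. This is a hypothetical sketch, not StackStorm's code; fetch_page is a made-up stand-in for the real /v1/keys API call:

```python
# Hypothetical sketch: page through a list endpoint until a short page
# signals the end, accumulating every item along the way.

def fetch_all(fetch_page, page_size=100):
    items, offset = [], 0
    while True:
        page = fetch_page(offset=offset, limit=page_size)
        items.extend(page)
        if len(page) < page_size:  # last (possibly empty) page
            break
        offset += page_size
    return items

# Usage against an in-memory stand-in for the datastore:
store = ["st2kv.system.key%d" % i for i in range(210)]
fake_page = lambda offset, limit: store[offset:offset + limit]
print(len(fetch_all(fake_page)))  # 210, i.e. every key
```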

humblearner commented 6 years ago

@Kami, for:

Because the st2 key list command is also used to export all the values to back them up or similar, we should modify it to also handle that scenario (return all the items, ignoring the limit)

I'll update the docs to include -n 0 to get all the keys, instead of adding a new CLI command st2 key export.

This is the current behavior (note the limit query param):

$ st2 --debug key list 

2017-09-01 13:34:08,085  DEBUG - Using cached token from file "/home/vagrant/.st2/token-blah"
# -------- begin 139723469129872 request ----------
curl -X GET -H  'Connection: keep-alive' -H  'Accept-Encoding: gzip, deflate' -H  'Accept: */*' -H  'User-Agent: python-requests/2.14.2' -H  'X-Auth-Token: <token>' 'http://127.0.0.1:9101/v1/keys/?scope=system&decrypt=false&limit=50'

$ st2 --debug key list -n 0

2017-09-01 13:35:11,953  DEBUG - Using cached token from file "/home/vagrant/.st2/token-blah"
# -------- begin 140266911598736 request ----------
curl -X GET -H  'Connection: keep-alive' -H  'Accept-Encoding: gzip, deflate' -H  'Accept: */*' -H  'User-Agent: python-requests/2.14.2' -H  'X-Auth-Token: <token>' 'http://127.0.0.1:9101/v1/keys/?scope=system&decrypt=false&limit=0'

whereas for the other list CLIs with -n 0:

st2 --debug trace list -n 0 | less
2017-09-01 13:39:58,039  DEBUG - Using cached token from file "/home/vagrant/.st2/blah"
# -------- begin 139740524907024 request ----------
curl -X GET -H  'Connection: keep-alive' -H  'Accept-Encoding: gzip, deflate' -H  'Accept: */*' -H  'User-Agent: python-requests/2.14.2' -H  'X-Auth-Token: <token>' 'http://127.0.0.1:9101/v1/traces/?sort_desc=True'
# -------- begin 139740524907024 response ----------

st2 --debug rule list -n 0 | less
2017-09-01 13:40:40,985  DEBUG - Using cached token from file "/home/vagrant/.st2/token-blah"
# -------- begin 140378064389648 request ----------
curl -X GET -H  'Connection: keep-alive' -H  'Accept-Encoding: gzip, deflate' -H  'Accept: */*' -H  'User-Agent: python-requests/2.14.2' -H  'X-Auth-Token: <token>' http://127.0.0.1:9101/v1/rules/
# -------- begin 140378064389648 response ----------

Even though the key list CLI is inconsistent with the other list CLIs in the query param it sends, it is the right behavior to get all the keys. wdyt?

Kami commented 6 years ago

@humblearner I'm fine with that, but we just need to confirm it works as intended, because with the other API endpoints, if you specify limit=0, it should fall back to the default limit (100).

Edit: If it doesn't do that, we have a bigger issue - the whole idea of the max limit is to prevent a user from retrieving all the resources with a single API call, to avoid resource exhaustion. If a user can simply pass in limit=0 and get all the results back, then we obviously have a bigger issue which needs to be fixed.
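The guard being asked for here, treating a missing or zero limit as the default rather than as "unlimited", could look like the following server-side. This is a hypothetical sketch with made-up names, not st2's actual validation code:

```python
# Hypothetical server-side sanitizer: limit=0 or an omitted limit falls
# back to the default instead of meaning "no limit", so a single API
# call can never pull every resource and exhaust the server.

def sanitize_limit(raw, default=100, max_page_size=100):
    if raw is None or raw <= 0:
        return default
    return min(raw, max_page_size)

print(sanitize_limit(None))  # 100 (omitted -> default)
print(sanitize_limit(0))     # 100 (0 -> default, not unlimited)
print(sanitize_limit(50))    # 50
print(sanitize_limit(9999))  # 100 (clamped to max_page_size)
```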

Kami commented 6 years ago

From Slack:

@humblearner https://github.com/StackStorm/st2/issues/3708#issuecomment-326933703
[13:00] wait
[13:00] this seems broken to me
[13:01] aka we need to fix / change that
[13:01] we have max limit there for a reason
[13:01] if user can just pass ?limit=0 to skip pagination we need to fix that :smile:
[13:01] should use default limit aka 100 if limit is not provided or limit is 0
[13:01] so to recap, i think we still need st2 key export command
[13:01] which is behind rbac admin wall
[13:01] and actually returns all the keys

arm4b commented 6 years ago

Talking about this whole limit thing, I would question it.

Why is querying the DB for 100+ records not allowed by default? (I'm aware of the st2.conf setting.) From the UX point of view, IMO, having that hard limit of 100 is not desired - it is nonsense to me and I don't see it in any other CLI (if we talk about good ops-related tools).

If we care about resources/slow responses, then it's a problem of the user's machine resources and possibly of st2 performance, but we shouldn't set these parental-control limits in any way.

My proposal would be to remove the 100 hard limit altogether; it's not what I expect in day-to-day ops work, and it's an annoying thing for others (just check the GitHub history for the issues we've had related to this).

@StackStorm/team what do you think about it?

arm4b commented 6 years ago

Also, I think there should be a value for the CLI to remove the limit.

--limit 0, as @humblearner noted in his findings, looks good; in the mistral CLI it's --limit -1:

optional arguments:
  -h, --help            show this help message and exit
  --filter FILTERS      Filters. Can be repeated.
  --limit [LIMIT]       Maximum number of tasks to return in a single result.
                        limit is set to 100 by default. Use --limit -1 to
                        fetch the full result set.
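The mistral-style convention, where --limit -1 means the full result set, is easy to express with argparse. A stand-alone hypothetical sketch, not mistral's or st2's actual code:

```python
import argparse

# Hypothetical CLI sketch: --limit -1 disables truncation, any other
# value caps the output, mirroring the mistral help text above.
parser = argparse.ArgumentParser()
parser.add_argument("--limit", type=int, default=100,
                    help="Max items to return; use --limit -1 for all.")

def apply_limit(items, limit):
    return list(items) if limit < 0 else list(items)[:limit]

args = parser.parse_args(["--limit=-1"])
print(len(apply_limit(range(250), args.limit)))  # 250, the full set
```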
m4dcoder commented 6 years ago

I think it's reasonable to set a limit by default. If we don't set a limit for something like st2 execution list, it's going to bring down the server, not to mention take forever to load. However, users should have other options to retrieve everything (it's their call) or to retrieve certain pages. Since we want consistent behavior in the API/CLI, all list/get_all functions should have the same options.

As for limit=0 vs limit=-1: I'm used to -1 meaning "return all" from a coding perspective; limit=0 doesn't make any logical sense.

arm4b commented 6 years ago

limit=-1 looks good to me, apart from additionally putting it behind RBAC or further limiting it via custom st2.conf settings.

arm4b commented 6 years ago

Commands like:

st2 execution list --status running --attr id | awk '{print $2}' | xargs st2 execution cancel

which was recently given in #community (https://stackstorm-community.slack.com/archives/C066APT88/p1504798377000217) as a solution for cancelling a bunch of spammed executions, are one example.

For a Linux CLI it's expected to have no limits by default, so the CLI output, just like logs, can be grepped/piped and so on. I can understand that st2 could be slow in some places in the context of outputting something like 3K actions (aws); in that case I think it's an opportunity to profile and improve the overall system performance instead of creating further hard limits.

Here is an example from the docker ps CLI, which by default returns all possible results (via -1):

    Name        Default   Description
    --last, -n  -1        Show n last created containers (includes all states)

https://docs.docker.com/engine/reference/commandline/ps/#usage

With that output, users can do anything they want with the data.

Mierdin commented 6 years ago

Yeah, to me, max_page_size is purely about pagination - how many items are in a given page. It should have nothing to do with the limit parameter, which is "how many do I want to see right now?". I agree that if there's no limit (i.e. limit=-1) then the CLI should continue to fetch pages (at whatever size is in the conf) until the dataset requested via the limit parameter is satisfied. In short, max_page_size should place no limitation on how high or low the limit value can be.
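This separation of concerns, page size for transport and limit for the user's request, can be sketched as a client loop that pages at max_page_size until the requested limit (or the full set, for -1) is satisfied. Hypothetical code, not the st2 client; fetch_page stands in for the real API call:

```python
# Hypothetical client loop: max_page_size only bounds each request;
# the user's limit (-1 = everything) bounds the total collected.

def fetch_limited(fetch_page, limit, max_page_size=100):
    items, offset = [], 0
    while limit < 0 or len(items) < limit:
        page = fetch_page(offset=offset, limit=max_page_size)
        items.extend(page)
        if len(page) < max_page_size:  # server ran out of items
            break
        offset += max_page_size
    return items if limit < 0 else items[:limit]

store = list(range(250))
fake_page = lambda offset, limit: store[offset:offset + limit]
print(len(fetch_limited(fake_page, limit=120)))  # 120 (two pages, trimmed)
print(len(fetch_limited(fake_page, limit=-1)))   # 250 (everything)
```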

arm4b commented 6 years ago

👍 as discussed in Slack the consensus is:

Discussions: