manheim / manheim-c7n-tools

Manheim's Cloud Custodian (c7n) wrapper package, policy generator, runner, and supporting tools.
https://manheim-c7n-tools.readthedocs.io/
Apache License 2.0

Error when running the custodian step #21

Closed by robertstettner 4 years ago

robertstettner commented 4 years ago

Hi there,

I have been trying to use your tool, but hit a problem.

[2020-04-02 12:00:19,474 INFO] Step 4 of 6 - custodian
[2020-04-02 12:00:19,483 INFO] Step custodian in REGION 1 of 4 (us-east-1)
[2020-04-02 12:00:20,740 INFO] Provisioning policy lambda ec2-auto-tag-user
[2020-04-02 12:00:22,201 ERROR] Error while executing policy
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/c7n/policy.py", line 575, in provision
    role=self.policy.options.assume_role)
  File "/usr/local/lib/python3.7/site-packages/c7n/mu.py", line 365, in publish
    func, role, s3_uri, qualifier=alias)
  File "/usr/local/lib/python3.7/site-packages/c7n/mu.py", line 484, in _create_or_update
    assert role, "Lambda function role must be specified"
AssertionError: Lambda function role must be specified
[2020-04-02 12:00:22,202 ERROR] Error while executing policy
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/c7n/policy.py", line 575, in provision
    role=self.policy.options.assume_role)
  File "/usr/local/lib/python3.7/site-packages/c7n/mu.py", line 365, in publish
    func, role, s3_uri, qualifier=alias)
  File "/usr/local/lib/python3.7/site-packages/c7n/mu.py", line 484, in _create_or_update
    assert role, "Lambda function role must be specified"
AssertionError: Lambda function role must be specified
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/c7n/commands.py", line 283, in run
    policy()
  File "/usr/local/lib/python3.7/site-packages/c7n/policy.py", line 1047, in __call__
    resources = mode.provision()
  File "/usr/local/lib/python3.7/site-packages/c7n/policy.py", line 575, in provision
    role=self.policy.options.assume_role)
  File "/usr/local/lib/python3.7/site-packages/c7n/mu.py", line 365, in publish
    func, role, s3_uri, qualifier=alias)
  File "/usr/local/lib/python3.7/site-packages/c7n/mu.py", line 484, in _create_or_update
    assert role, "Lambda function role must be specified"
AssertionError: Lambda function role must be specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/c7n/config.py", line 25, in __getattr__
    return self[k]
KeyError: 'debug'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/manheim-c7n-runner", line 11, in <module>
    load_entry_point('manheim-c7n-tools', 'console_scripts', 'manheim-c7n-runner')()
  File "/manheim_c7n_tools/manheim_c7n_tools/runner.py", line 622, in main
    args.ACTION, args.regions, step_names=args.steps, skip_steps=args.skip
  File "/manheim_c7n_tools/manheim_c7n_tools/runner.py", line 484, in run
    self._run_step_in_regions(action, step, regions)
  File "/manheim_c7n_tools/manheim_c7n_tools/runner.py", line 537, in _run_step_in_regions
    step(region_name, region_conf).run()
  File "/manheim_c7n_tools/manheim_c7n_tools/runner.py", line 242, in run
    run(conf)
  File "/usr/local/lib/python3.7/site-packages/c7n/commands.py", line 141, in _load_policies
    return f(options, list(policies))
  File "/usr/local/lib/python3.7/site-packages/c7n/commands.py", line 286, in run
    if options.debug:
  File "/usr/local/lib/python3.7/site-packages/c7n/config.py", line 27, in __getattr__
    raise AttributeError(k)
AttributeError: debug

I have both the assume_role and role_arn attributes set in the manheim-c7n-tools.yml config file.

The defaults.yml file has this set too:

mode:
  type: periodic
  ...
  role: '%%ROLE_ARN%%'

I am running this entrypoint using version 0.8.4: manheim-c7n-runner -S dryrun-diff -S docs dryrun my-account

jantman commented 4 years ago

Hmm... nothing is immediately jumping out at me here, @robertstettner. The two things that I can suggest are:

  1. The policygen step should write custodian_REGION.yml files to your current directory, containing the final interpolated policies that are fed into custodian. Perhaps that sheds some light on this problem?
  2. In order to do much debugging, I'd need to see an anonymized/scrubbed version of your manheim-c7n-tools.yml and also ideally the portion of custodian_us-east-1.yml for the ec2-auto-tag-user policy. If you don't feel comfortable posting them here I can provide alternate contact information, but in general (aside from any specifics that you have in your policies) they should be safe to post if you remove the account ID numbers (really, any 12-character string that's all digits).
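
For the scrubbing suggested in item 2, a minimal sketch might look like the following. The function name `scrub_account_ids` and the placeholder value are illustrative assumptions, not part of manheim-c7n-tools; the pattern simply replaces any standalone run of exactly 12 digits.

```python
import re

# Matches a run of exactly 12 digits that is not part of a longer number.
ACCOUNT_ID_RE = re.compile(r"(?<!\d)\d{12}(?!\d)")


def scrub_account_ids(text, placeholder="123456789012"):
    """Replace every standalone 12-digit string with a placeholder.

    Hypothetical helper for anonymizing config files before posting them.
    """
    return ACCOUNT_ID_RE.sub(placeholder, text)


sample = "role_arn: arn:aws:iam::987654321098:role/c7n"
print(scrub_account_ids(sample))
# prints "role_arn: arn:aws:iam::123456789012:role/c7n"
```

Running the file contents through a function like this before pasting should catch account IDs embedded in ARNs, bucket names, and queue URLs alike, but it is still worth rereading the result for other sensitive values.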

robertstettner commented 4 years ago

Hi @jantman,

Here is the custodian_us-east-1.yml file:

policies:
- actions:
  - principal_id_tag: CreatorId
    tag: Created_by
    type: auto-tag-user
  filters:
  - tag:Created_by: absent
  mode:
    events:
    - RunInstances
    type: cloudtrail
  name: ec2-auto-tag-user
  resource: aws.ec2
- actions:
  - tag: Created_by
    type: auto-tag-user
  - type: set-bucket-encryption
  - enabled: true
    type: toggle-versioning
  - rules:
    - Filter:
        Prefix: /
      ID: company-s3-lifecycle
      NoncurrentVersionExpiration:
        NoncurrentDays: 35
      Status: Enabled
      Transitions:
      - Days: 180
        StorageClass: STANDARD_IA
    type: configure-lifecycle
  description: 'This policy is triggered when a new S3 bucket is created and it applies

    the AWS AES256 Default Bucket Encryption, Tags the creators ID, enables

    object versioning, and configures the bucket lifecycle.

    '
  mode:
    events:
    - CreateBucket
    timeout: 200
    type: cloudtrail
  name: s3-configure-standards-real-time
  resource: aws.s3

And, here is the manheim-c7n-tools.yml file (anonymised):

- account_name: my-account
  account_id: '123456789012'
  regions:
    - us-east-1
    - eu-west-1
    - eu-central-1
    - us-west-2
  assume_role:
    role_arn: &dev_role_arn arn:aws:iam::123456789012:role/c7n
  output_s3_bucket_name: c7n-123456789012-%%AWS_REGION%%
  custodian_log_group: /c7n/%%AWS_REGION%%
  dead_letter_queue_arn: &dev_dlq_arn arn:aws:sqs:%%AWS_REGION%%:123456789012:c7n-deadletter-queue
  role_arn: *dev_role_arn
  mailer_regions:
    - eu-west-1
  mailer_config:
    queue_url: https://sqs.eu-west-1.amazonaws.com/123456789012/c7n-queue
    role: *dev_role_arn
    from_address: our-team@example.com
    region: '%%AWS_REGION%%'
    contact_tags:
      - OwnerEmail
      - ownerEmail
      - owneremail
  cleanup_notify: []

robertstettner commented 4 years ago

I have just figured out what the problem was. I didn't RTFM when it comes to defaults and the policygen:

Note there is some special handling for the “mode” key: If the mode has a “type” of anything other than “periodic”, it will not be changed at all except by having “tags” updated iff it already has a “tags” key (even if that key has an empty value). As such, modes other than “periodic” must have their full configuration (except tags, which must be present but can be empty) specified in every policy.

Both of my policies use cloudtrail mode rather than periodic, so the mode defaults (including the role) were never merged in, and that is why my lambdas didn't have a role. I copied and pasted the mode defaults into those two policies, and boom, it's working.
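
For anyone else who hits this, the rule quoted above can be sketched roughly as follows. This is only an illustration of the documented behaviour, not the actual policygen code: the function name `merge_mode`, the example tag value, the way tags are combined, and the treatment of a mode without a "type" are all assumptions.

```python
import copy


def merge_mode(default_mode, policy_mode):
    """Sketch of the documented "mode" handling (not the real policygen code)."""
    if not policy_mode:
        # Policy has no mode at all: assume it inherits the defaults.
        return copy.deepcopy(default_mode)
    # Assumption: a mode without an explicit "type" is treated as periodic.
    if policy_mode.get("type", "periodic") != "periodic":
        # Non-periodic modes are left untouched, apart from "tags",
        # and only when a "tags" key is already present.
        result = copy.deepcopy(policy_mode)
        if "tags" in result:
            # Assumption: default tags are filled in, policy values win.
            tags = dict(default_mode.get("tags") or {})
            tags.update(result["tags"] or {})
            result["tags"] = tags
        return result
    # Periodic modes: start from the defaults, let the policy override them.
    merged = copy.deepcopy(default_mode)
    merged.update(copy.deepcopy(policy_mode))
    return merged


# Illustrative defaults; only the role placeholder comes from my defaults.yml.
defaults = {
    "type": "periodic",
    "role": "%%ROLE_ARN%%",
    "tags": {"Component": "example"},
}
cloudtrail = {"type": "cloudtrail", "events": ["RunInstances"]}

print("role" in merge_mode(defaults, cloudtrail))
# prints "False"
```

Under these assumptions the cloudtrail mode comes out exactly as it went in, without a role, which matches what ended up in my custodian_us-east-1.yml.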

jantman commented 4 years ago

Ahhh, ok, yeah. Apologies for that not being more clear... when policygen was originally written, the vast majority of our policies were periodic. I've opened #22 to catch this situation and fail policygen with a helpful error message; I'll try to get that in the next release, whenever that comes.

Apologies for the confusion and poor user experience around this, and I hope you're finding this project helpful.
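
Until a check like that exists, something along these lines could be run against the generated policies to catch the problem before the custodian step. This is a hypothetical sketch and not the implementation planned for #22; the function name `policies_missing_role` is made up, and it assumes that every mode other than "pull" deploys a Lambda and therefore needs a "role". The sample data mirrors the relevant parts of the custodian_us-east-1.yml posted above, written as Python dicts so the example needs no YAML library.

```python
def policies_missing_role(policies):
    """Return the names of policies whose mode block has no "role"."""
    missing = []
    for policy in policies:
        mode = policy.get("mode")
        # Policies without a mode, or in pull mode, do not deploy a Lambda.
        if not mode or mode.get("type") == "pull":
            continue
        if not mode.get("role"):
            missing.append(policy["name"])
    return missing


generated = [
    {
        "name": "ec2-auto-tag-user",
        "resource": "aws.ec2",
        "mode": {"type": "cloudtrail", "events": ["RunInstances"]},
    },
    {
        "name": "s3-configure-standards-real-time",
        "resource": "aws.s3",
        "mode": {"type": "cloudtrail", "events": ["CreateBucket"], "timeout": 200},
    },
]

problems = policies_missing_role(generated)
if problems:
    print("mode has no role: " + ", ".join(problems))
# prints "mode has no role: ec2-auto-tag-user, s3-configure-standards-real-time"
```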