23andMe / Yamale

A schema and validator for YAML.
MIT License

Yamale not able to match multiline or multiple nodes with regex given #183

Closed zendesk-mpalkur closed 2 years ago

zendesk-mpalkur commented 2 years ago

I am trying to validate my filters node with a regex pattern that matches across multiple lines when the content contains a given keyword. I tested the regex against the text outside of Yamale and it works fine, but when I pass the same regex to Yamale it fails with "not a regex match".

Regex pattern: "((.|\n)*)custodian-skip((.|\n)*)"

Content used for validation: filters:

mildebrandt commented 2 years ago

Hi, thanks for using Yamale!

I'm a little confused by your example. Your sample yaml doesn't have any multi-line text. Also, your regex validator line has unmatched parentheses.

filter: 
        regex('((.|\n)*)custodian-skip((.|\n)*))')

When I fix the parentheses and run it again, there are several errors:

Error validating data:
        policies.0.max-resources: Unexpected element
        policies.0.resource: Unexpected element
        policies.0.name: Unexpected element
        policies.0.actions: Unexpected element
        policies.0.filters: '[{'type': 'value', 'value_type': 'age', 'key': 'CreationDateTime', 'value': 30, 'op': 'gt'}, {'tag:custodian-skip': 'absent'}, {'and': [{'tag:Owner': 'absent'}, {'tag:owner': 'absent'}, {'tag:team': 'absent'}]}]' is not a regex match.

As you can see with the last error, you're trying to validate the string representation of a list of dictionaries with your regex. That's probably not what you want to do.

If you can provide an example of what you'd like to accomplish, I may be able to point you in the right direction.

zendesk-mpalkur commented 2 years ago

I want to match the pattern custodian-skip across multiple lines under filter:, and validation should pass if custodian-skip exists under filter.

mildebrandt commented 2 years ago

Sorry, I still don't fully understand. Instead of explaining what you'd like to check with your regex, can you instead explain what's valid and what's not valid in your use case?

zendesk-mpalkur commented 2 years ago

I am looking for a regex pattern that matches all the content below Filter: when it contains the keyword custodian-skip. The regex shown above matches in regex101, but I get an error here.

Also, how can I match all the lines below filter: with a proper regex pattern, as in the attached screenshot?

I am attaching a snippet of what I am seeing in regex101.

[Attached screenshot: Screen Shot 2021-11-30 at 11 56 58 AM]

mildebrandt commented 2 years ago

So, you're still explaining about your regex. Forget about the regex, and tell me about your yaml and what makes a valid yaml in your use case.

mechie commented 2 years ago

Yamale does not have any validators that match against the raw yaml file/string--it loads the yaml into objects using pyyaml or ruamel.yaml and evaluates those against its validators. The regex validator is meant to validate a string value against the given expression, not a chunk of the yaml file.

What you're trying to do would be better accomplished by either writing a python script using re directly, or (as @mildebrandt is trying to show) writing a Yamale schema that defines what you're seeking, e.g...

policies: list(include('custodian_policy'))
---
custodian_policy:
  filters:
    'tag:custodian-skip': str()

The above non-strict schema is certainly incorrect/incomplete, given we don't know what your definition of validity is, but you should get the idea.
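For the first option (a Python script using re directly), a minimal sketch might look like the following. The helper name has_skip_tag and the sample text are made up for illustration, and the pattern only checks that some list item in the raw text has the key tag:custodian-skip; it does not confirm that the item sits under filters, which is one reason a schema or parsed-data check is usually more robust.

```python
import re

# Hypothetical pattern: a YAML list item ("- ") whose key is tag:custodian-skip,
# with or without quotes around the key. MULTILINE lets ^ match at every line start.
SKIP_TAG = re.compile(r"""^\s*-\s*["']?tag:custodian-skip["']?\s*:""", re.MULTILINE)


def has_skip_tag(raw_yaml: str) -> bool:
    """Return True if the raw YAML text contains a list item keyed tag:custodian-skip."""
    return SKIP_TAG.search(raw_yaml) is not None


# Sample text modelled on the policy structure discussed in this thread.
sample = """\
policies:
  - filters:
      - type: value
        key: CreationDateTime
      - "tag:custodian-skip": absent
"""

print(has_skip_tag(sample))
```

This works on the file contents as a string, which is exactly what Yamale's regex validator does not see.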

zendesk-mpalkur commented 2 years ago

This is the schema.yaml file that we are using to validate our yaml files, where we added Filter:


policies:
---
policy:

  filters: include('filter', required=True)

filter: 
       regex('^custodian-skip')

Yaml files:

This is the structure of our Yaml files, with Filter: and other values. We want to check this structure and validate that it contains the skip tag.

  filters:
      - type: value
        value_type: age
        key: CreationDateTime
        value: 30
        op: gt
      - "tag:custodian-skip": absent
      - and:
          - "tag:Owner": absent
          - "tag:owner": absent
          - "tag:team": absent
mildebrandt commented 2 years ago

What if the tag:custodian-skip is not present? Then it fails validation? You're saying that the tag:custodian-skip must always be in the filters section?

zendesk-mpalkur commented 2 years ago

What if the tag:custodian-skip is not present? Then it fails validation? You're saying that the tag:custodian-skip must always be in the filters section?

Yes, we want tag:custodian-skip to be present in a filter.

mildebrandt commented 2 years ago

Thanks for answering my above questions. I'm going to try to frame the question into how Yamale works. In your attempt, you're trying to apply a string validator to a complex type (a list of mappings). Yamale instead validates against base types.

In terms of how Yamale validates, I think you want to ensure that a certain element exists within a list. In your example, it's the mapping "tag:custodian-skip": absent. Currently Yamale doesn't support this use case. Here is how I could see it being implemented.

Given this yaml:

policies:
  - filters:
    - type: value
      value_type: age
      key: CreationDateTime
      value: 30
      op: gt
    - "tag:custodian-skip": absent
    - and:
      - "tag:Owner": absent
      - "tag:owner": absent
      - "tag:team": absent

Ensure that one of the filters has the mapping with the key "tag:custodian-skip":

policies: list(include('policy'))
---
policy:
  filters: list(include('filter', required=False), include('conditional', required=False), include('custodian-skip', required=True))

custodian-skip: 
  "tag:custodian-skip": str()

filter:
  type: str()
  value_type: str()
  key: str()
  value: int()
  op: str()

conditional:
  and: list()

What I'm describing above is that the required attribute of the validator controls if that item must exist in the list or is optional. This would change the definition of the current list() implementation and would require more discussion. This is just a suggestion at this point.

What you may want to do in the meantime is to create a custom validator. Then you can search your list of mappings for your tag in a way most appropriate for you. Here's the documentation for implementing custom validators: https://github.com/23andMe/Yamale#custom-validators
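As a rough sketch of the check such a custom validator could perform, the function below searches an already-parsed list of mappings for the required key. The names has_required_filter and REQUIRED_KEY are hypothetical, and the function assumes only top-level items of the filters list count (it does not look inside nested and: blocks).

```python
from typing import Any

# Hypothetical constant: the key that must appear in one of the filter mappings.
REQUIRED_KEY = "tag:custodian-skip"


def has_required_filter(filters: Any, key: str = REQUIRED_KEY) -> bool:
    """Return True if `filters` is a list with at least one mapping that has `key`."""
    if not isinstance(filters, list):
        return False
    return any(isinstance(item, dict) and key in item for item in filters)


# The filters list from the example yaml, as it looks after parsing.
filters = [
    {"type": "value", "value_type": "age", "key": "CreationDateTime", "value": 30, "op": "gt"},
    {"tag:custodian-skip": "absent"},
    {"and": [{"tag:Owner": "absent"}, {"tag:owner": "absent"}, {"tag:team": "absent"}]},
]

print(has_required_filter(filters))
```

Following the custom validator documentation linked above, a Validator subclass could return has_required_filter(value) from its _is_valid method, and the schema would then apply that validator's tag to the filters node.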

Reach out if you have any issues going down that route.