fwdcloudsec / known_aws_accounts

List of known AWS accounts
Apache License 2.0

Add Github workflow to validate YAML schema #16

Closed sdemjanenko closed 1 year ago

sdemjanenko commented 1 year ago

This will allow us to ensure the YAML is well formatted in PRs. Note: we would like to make source a required field, but the YAML file currently contains records that violate this requirement.

Examples of failures:

When the source field is missing (NOTE: this check is currently disabled because existing records violate it):

File `accounts.yaml` failed validation with >>>`'source' is a required property

Failed validating 'required' in schema['items']:
    {'additionalProperties': False,
     'properties': {'accounts': {'items': {'maxLength': 12,
                                           'minLength': 12,
                                           'type': 'string',
                                           'uniqueItems': True},
                                 'type': 'array'},
                    'name': {'type': 'string'},
                    'source': {'anyOf': [{'type': 'string'},
                                         {'items': {'type': 'string',
                                                    'uniqueItems': True},
                                          'type': 'array'}]},
                    'type': {'enum': ['aws']}},
     'required': ['name', 'source', 'accounts'],
     'type': 'object'}

When the accounts list is non-unique:

File `accounts.yaml` failed validation with >>>`['854209929931', '854209929931'] has non-unique elements

Failed validating 'uniqueItems' in schema['items']['properties']['accounts']:
    {'items': {'maxLength': 12, 'minLength': 12, 'type': 'string'},
     'type': 'array',
     'uniqueItems': True}

sdemjanenko commented 1 year ago

An example of a record that is missing source:

- name: 'Azure Sentinel'
  accounts: ['197857026523']

0xdabbad00 commented 1 year ago

This is awesome! Thank you for doing this. Some of the older data came from folks I trust and know in real life, and was sometimes provided to me privately. I want to keep those older references, so I might just cheat and do something like source: 'historical'.
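
A rough sketch of that backfill idea, assuming the records have already been parsed from the YAML file into a list of dicts (for example with PyYAML's `yaml.safe_load`). The function name `backfill_source` and the `placeholder` parameter are hypothetical and not part of this PR:

```python
def backfill_source(records, placeholder="historical"):
    """Return a new list; records lacking 'source' are copied with a placeholder added.

    Records that already have a 'source' key are passed through unchanged.
    """
    return [
        record if "source" in record else {**record, "source": placeholder}
        for record in records
    ]


# The record quoted earlier in this thread, which has no source.
records = [{"name": "Azure Sentinel", "accounts": ["197857026523"]}]
print(backfill_source(records))
```

With every record carrying a source, the required-field check could then be switched on.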

0xdabbad00 commented 1 year ago

Is there any reason you chose that repo (https://github.com/lyubick/action-YAML-schema-validator)? It looks new and not yet very popular. I usually tend to favor more widely used repos.

sdemjanenko commented 1 year ago

@0xdabbad00 I'm using that one because I didn't find another YAML validator that uses JSON schema. The source code of that action looks pretty reasonable and the author was very responsive to a bug-fix that I made in that action repo (he responded within 20 minutes on LinkedIn). The core logic of the action is in https://github.com/lyubick/action-YAML-schema-validator/blob/main/validator.py. It also has a pretty minimal set of requirements: https://github.com/lyubick/action-YAML-schema-validator/blob/main/requirements.txt

If you have a suggestion for a different action, I'm more than happy to switch to that. It's also possible to write a custom Python script, but that might defeat the goal of using a common community implementation.
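
For comparison, a custom script could look roughly like the sketch below. It is not what this PR uses: it hand-codes the rules visible in the schema output above (required fields, no extra properties, 12-character account strings, unique accounts, type limited to 'aws') with the standard library only, and it assumes the YAML has already been loaded into a list of dicts. The names `validate_records` and `require_source` are hypothetical.

```python
ALLOWED_KEYS = {"name", "source", "type", "accounts"}


def validate_records(records, require_source=False):
    """Return a list of human-readable error strings (empty if everything is valid)."""
    errors = []
    required = ["name", "accounts"] + (["source"] if require_source else [])
    for index, record in enumerate(records):
        label = f"record {index}"
        if not isinstance(record, dict):
            errors.append(f"{label}: expected a mapping")
            continue
        # Required properties ('source' only when the stricter mode is enabled).
        for key in required:
            if key not in record:
                errors.append(f"{label}: '{key}' is a required property")
        # Mirrors additionalProperties: False in the schema.
        for key in sorted(set(record) - ALLOWED_KEYS, key=str):
            errors.append(f"{label}: unexpected property {key!r}")
        # Mirrors the enum on 'type'.
        if "type" in record and record["type"] != "aws":
            errors.append(f"{label}: 'type' must be 'aws'")
        accounts = record.get("accounts", [])
        if not isinstance(accounts, list):
            errors.append(f"{label}: 'accounts' must be a list")
            continue
        # Each account must be a string of exactly 12 characters.
        for account in accounts:
            if not isinstance(account, str) or len(account) != 12:
                errors.append(f"{label}: {account!r} is not a 12-character string")
        # Mirrors uniqueItems on the accounts array.
        if len(set(map(str, accounts))) != len(accounts):
            errors.append(f"{label}: {accounts} has non-unique elements")
    return errors


records = [{"name": "Azure Sentinel", "accounts": ["197857026523"]}]
print(validate_records(records, require_source=True))
```

The trade-off is the one noted above: a script like this has to be maintained alongside the schema, whereas a shared action keeps the rules in one JSON schema file.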

christophetd commented 1 year ago

Bumping this PR. Being able to confidently merge/submit PRs without messing up the YAML would be great.

ramimac commented 1 year ago

Thanks again for proposing this and getting the ball rolling, @sdemjanenko!