23andMe / Yamale

A schema and validator for YAML.
MIT License

'True' is not a regex match. #155

Closed hazedav closed 3 years ago

hazedav commented 3 years ago

I am using Yamale as a YAML schema validator and want to ensure that the value of myfield is set to true. I am attempting to achieve this via regex() rather than bool() because bool() accepts both true and false values.

So I set up a yamale schema validation file true_schema.yaml:

myfield: regex('^[Tt]rue$', ignore_case=True, required=True)

And a true_test.yaml:

myfield: true

When running yamale:

❯ yamale --schema true_schema.yaml true_test.yaml
Validating true_test.yaml...
Validation failed!
Error validating data 'true_test.yaml' with schema 'true_schema.yaml'
        myfield: 'True' is not a regex match.

Two things seem strange:

  1. Yamale is interpreting true as True. This isn't a deal breaker as I can easily make the regex() case insensitive with [Tt] and/or ignore_case=True.
  2. Yamale is not able to validate the value True as a regex match. I have even tried a regex of ^.*$, which should match any string.

Please advise.

mechie commented 3 years ago

In YAML, true is a boolean, not a string (see the definition of bool in the YAML spec):

Regexp: y|Y|yes|Yes|YES|n|N|no|No|NO|true|True|TRUE|false|False|FALSE|on|On|ON|off|Off|OFF

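Values matching this pattern are loaded into Python as booleans and, thus, fail any sort of regex (string) test; exactly which spellings are recognized depends on the parser and the YAML version it implements. Yamale sits behind a YAML-to-Python parser (PyYAML or ruamel.yaml) and never sees how the truthy value was spelled in the document, which is also why the error message shows 'True' rather than true.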
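To make the failure mode concrete, here is a minimal standard-library sketch. It compiles the pattern quoted above and mimics a string-only regex check. The names is_yaml11_bool and regex_validate are hypothetical stand-ins for illustration, not Yamale's or PyYAML's API, and a real parser may recognize a narrower set of spellings.

```python
import re

# Boolean pattern quoted above (YAML 1.1 bool type definition).
YAML11_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)


def is_yaml11_bool(plain_scalar: str) -> bool:
    """Return True if an unquoted scalar matches the YAML 1.1 bool pattern."""
    return YAML11_BOOL.match(plain_scalar) is not None


def regex_validate(pattern: str, value) -> bool:
    """Hypothetical string-only regex check.

    Assumption: non-string values are rejected before the pattern is
    tried, which is why even '^.*$' cannot match a boolean.
    """
    if not isinstance(value, str):
        return False
    return re.match(pattern, value) is not None


# The unquoted scalar `true` matches the bool pattern, so the parser hands
# the validator the Python value True rather than the string "true".
print(is_yaml11_bool("true"))                 # True
print(regex_validate(r"^.*$", True))          # False: a bool is not a string
print(regex_validate(r"^[Tt]rue$", "true"))   # True: a real string matches
```

Under this assumption, no pattern can ever match, because the check fails on the value's type before the regex is consulted.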

Instead, you could use enum() with a single option:

myfield: enum(True)

That should allow only documents that have myfield set to a truth-y value.
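As a rough illustration of why enum(True) works where regex() does not, the sketch below assumes the enum check is a plain Python membership test against the listed options. That is an assumption about the implementation, and enum_validate is a hypothetical helper, so check the Yamale source before relying on the edge case shown at the end.

```python
def enum_validate(allowed: tuple, value) -> bool:
    """Hypothetical enum check: a plain membership test (assumed behavior)."""
    return value in allowed


print(enum_validate((True,), True))     # True: the parsed boolean matches
print(enum_validate((True,), False))    # False
print(enum_validate((True,), "true"))   # False: a quoted string is not a bool
print(enum_validate((True,), 1))        # True: in Python, 1 == True
```

If the membership-test assumption holds, a document with myfield: 1 would also pass, since Python treats 1 and True as equal.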

In case it's useful in the future, you can force a bool-like string to be interpreted as a string by using single or double quotes (see 7.3. Flow Scalar Styles in the YAML spec), like so:

myfield: 'true'

I hope this helps!

hazedav commented 3 years ago

Makes perfect sense.

mildebrandt commented 3 years ago

Thanks for helping @mechie!