open-policy-agent / opa

Open Policy Agent (OPA) is an open source, general-purpose policy engine.
https://www.openpolicyagent.org
Apache License 2.0
9.33k stars 1.29k forks source link

Builtin yaml.is_valid returns valid for invalid YAML #6854

Open kishorviswanathan opened 4 days ago

kishorviswanathan commented 4 days ago

Short description

I am trying to use the yaml.is_valid function to validate cloud-init scripts of GCP ComputeInstances created via Config Connector. The function returns true for YAML that is not valid.

Steps to reproduce

Code:

package test_yaml

invalidYAML := `
  test:
test1: hello
`

test_validate_yaml := true {
  yaml.is_valid(invalidYAML)
  print(yaml.unmarshal(invalidYAML))
}

Output:

❯ opa test . -v
test_main.rego:
data.test_yaml.test_validate_yaml: PASS (289.419µs)

  {"test": null}

--------------------------------------------------------------------------------
PASS: 1/1

Additional information

OPA Version:

Version: 0.66.0
Build Commit: 85c2a8747627790bc7917dbfbf98e4e4bcb4d75e
Build Timestamp: 2024-06-27T14:38:58Z
Build Hostname: 
Go Version: go1.22.4
Platform: linux/amd64
WebAssembly: unavailable
anderseknert commented 4 days ago

Thanks for reporting that! It's interesting that both yaml.is_valid and yaml.unmarshal works. I'm guessing it's the YAML library trying to be lenient?

kishorviswanathan commented 1 day ago

@anderseknert Yes, Initially I thought it could be because we are not passing any struct to Unmarshal and the library is doing it's best to decode it. But I did a test by passing the proper struct and the library still parses the Yaml as valid. So it is probably an issue with the YAML library. Should I create an issue there ?

Test:

package main

import (
    "fmt"

    "sigs.k8s.io/yaml"
)

const (
    invalidYAML = `
  test:
test1: hello
`
)

type yamlType struct {
    Test  string `yaml:"test"`
    Test1 string `yaml:"test1"`
}

func main() {
    var data interface{}
    err := yaml.Unmarshal([]byte(invalidYAML), &data)
    if err != nil {
        fmt.Printf("Error %e", err)
    }

    fmt.Printf("data: %+v\n", data)

    var data1 yamlType
    err = yaml.Unmarshal([]byte(invalidYAML), &data1)
    if err != nil {
        fmt.Printf("Error %e", err)
    }
    fmt.Printf("data1: %+v\n", data)
}

Result:

data: map[test:<nil>]
data1: map[test:<nil>]
ashutosh-narkar commented 1 day ago

Should I create an issue there ?

I think so. Closing this issue. Feel free to report your findings later. Thanks.

anderseknert commented 1 day ago

If OPA has a built-in function called yaml.is_valid that returns true for YAML that isn't valid, that's a bug in OPA regardless of the origin of the bug.

I too would love to hear what you find out @kishorviswanathan, but even if this is a "wontfix" in another repo, and for whatever reasons they may have, I don't think we should use that to close valid issues reported here without even knowing what reasons those may be.

ashutosh-narkar commented 1 day ago

If OPA has a built-in function called yaml.is_valid that returns true for YAML that isn't valid, that's a bug in OPA regardless of the origin of the bug.

That's fair. I'll re-open the issue.

anderseknert commented 1 day ago

Thanks, Ash 👍