open-policy-agent / opa

Open Policy Agent (OPA) is an open source, general-purpose policy engine.
https://www.openpolicyagent.org
Apache License 2.0

Feature Request: Support Line-delimited JSON #1644

Open jceresini opened 5 years ago

jceresini commented 5 years ago

It would be nice if OPA could read jsonl or ndjson (or any line-delimited json format) and present it as an array in the OPA data document.

Example use case: We're using OPA to evaluate resources in Google Cloud. Google's Cloud Asset Inventory API supports exporting a list of Google resources (including their metadata) as line-delimited JSON [1]. Right now we have an intermediate step where we reformat the output into one large JSON array. It would be great if I could load the export directly into OPA.

[1] https://cloud.google.com/resource-manager/docs/cloud-asset-inventory/reference/rest/v1beta1/folders/exportAssets
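For reference, the intermediate reformatting step described above can be sketched roughly as follows. This is a minimal Python sketch, not part of OPA; the function name `ndjson_to_array`, the sample records, and the `gcpassets` key are assumptions made for illustration.

```python
import json


def ndjson_to_array(text):
    """Parse line-delimited JSON into a list, skipping blank lines.

    Hypothetical helper: one JSON value is expected per non-empty line.
    """
    records = []
    # Split on "\n" only; str.splitlines() would also split on characters
    # that may legally appear unescaped inside JSON strings.
    for lineno, line in enumerate(text.split("\n"), start=1):
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno}: {exc}") from exc
    return records


# Made-up sample data standing in for an asset export.
sample = '{"name": "a", "asset_type": "x"}\n\n{"name": "b", "asset_type": "y"}\n'
assets = ndjson_to_array(sample)

# Wrap the array in an object so it can sit under a non-root key.
document = json.dumps({"gcpassets": assets})
```

Writing `document` to a regular JSON file gives OPA a single document it can already load, at the cost of holding the whole export in memory.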

tsandall commented 5 years ago

@jceresini from an integration point of view, this would make things easier. Allowing users to load multi-document JSON or YAML files under non-root paths makes sense (e.g., data.resources). Loading them at the root would be a bit trickier (data has to be an object today; input doesn't have that restriction).

Is this something you'd want to see on the command line, via the HTTP API, or via the Golang API (or some combination)?

jceresini commented 5 years ago

My current use case is command line via something like opa eval -d gcpassets:assets.ndjson -d policy/ --format values "data.compliance.checks" but that's more of a POC I'm working on.

Long term I'd probably want to move to using bundles, and throw the data into a tarball in a structure something like this:

$ tar -tf bundle.tgz
gcpassets/
gcpassets/data.ndjson

I believe both of those would make the data available under a non-root path ("data.gcpassets").

Note that I used the ndjson suffix, but I haven't actually looked into the difference between jsonl, ldjson, ndjson, and any other similar formats floating around out there.

hasit commented 5 years ago

Similarly, it would be nice to have support for multiple YAML documents inside one YAML file. To quote the YAML 1.2 spec (https://yaml.org/spec/1.2/spec.html#id2760395):

YAML uses three dashes (“---”) to separate directives from document content. This also serves to signal the start of a document if no directives are present. Three dots ( “...”) indicate the end of a document without starting a new one, for use in communication channels.

For my current use case, our pipeline creates k8s manifest files from custom manifest files, eventually producing Helm charts. Some of those files contain multiple documents separated by ---. Currently I handle this in my CLI app (which uses OPA as a library) using https://github.com/kubernetes/apimachinery/blob/master/pkg/util/yaml/decoder.go.

Any thoughts on supporting this on the OPA side? Reading all the documents in a file and making them available as separate objects would be a desirable feature.
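To illustrate what "separate objects" could mean here, below is a deliberately naive Python sketch that splits a multi-document stream on lines consisting only of the `---` or `...` markers. It does not parse YAML and ignores directives and other edge cases; it is not what OPA or the apimachinery decoder does. The function name `split_yaml_documents` and the sample manifest are assumptions.

```python
def split_yaml_documents(text):
    """Split a YAML stream into raw document strings.

    Naive and hypothetical: only lines that are exactly '---' or '...'
    (ignoring trailing whitespace) are treated as document boundaries.
    """
    docs = []
    current = []

    def flush():
        # Keep a chunk only if it has some non-blank content.
        if any(line.strip() for line in current):
            docs.append("\n".join(current).strip("\n"))
        current.clear()

    for line in text.split("\n"):
        if line.rstrip() in ("---", "..."):
            flush()
        else:
            current.append(line)
    flush()
    return docs


# Made-up manifest with two documents.
manifest = (
    "kind: Service\nmetadata:\n  name: a\n"
    "---\n"
    "kind: Deployment\nmetadata:\n  name: b\n"
)
documents = split_yaml_documents(manifest)
```

Each returned string could then be handed to a real YAML parser and evaluated as its own input or data entry.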

hjmallon commented 3 years ago

I would be interested in this too, but for an input JSON file containing multiple objects (streams like the ones jq can consume, either newline-delimited or json-seq).
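As a rough picture of that kind of stream handling, here is a small Python sketch that reads concatenated JSON values separated by whitespace, newlines, or the record separator character (0x1E) used by json-seq. It relies on `json.JSONDecoder.raw_decode` from the standard library; the function name `iter_json_stream` and the sample input are assumptions, and this is not how OPA or jq implements it.

```python
import json

# Characters allowed between values: JSON whitespace plus the
# record separator (0x1E) that json-seq places before each value.
SEPARATORS = " \t\r\n\x1e"


def iter_json_stream(text):
    """Yield each JSON value found in a concatenated stream (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # Skip separators before the next value.
        while pos < end and text[pos] in SEPARATORS:
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Made-up stream mixing newline, space, and RS separators.
stream = '{"a": 1}\n{"b": 2} [3]\x1e{"c": 4}\n'
values = list(iter_json_stream(stream))
```

The resulting list could be passed as a single array-valued input, which matches the earlier note that input does not have to be an object.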

pogao commented 3 years ago

Did this go anywhere? I have the exact same problem with GCP's Cloud Asset API.

stale[bot] commented 2 years ago

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

disfinder commented 1 year ago

still active

stale[bot] commented 1 year ago

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

disfinder commented 1 year ago

Still relevant

anderseknert commented 1 year ago

All issues not closed are considered potentially relevant, @disfinder. Describing your use case in detail is a better way to increase the chance of something getting worked on :)
