TomWright / dasel

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
https://daseldocs.tomwright.me
MIT License

YAML anchors, aliases and references #285

Open electriquo opened 1 year ago

electriquo commented 1 year ago

Is your feature request related to a problem? Please describe.
YAML supports anchors, aliases and references, but dasel is unable to read them.

$ echo '
foo: &foo
  one: 1
bar:
  <<: *foo
  two: 2
' | yq -y
foo:
  one: 1
bar:
  one: 1
  two: 2

$ echo '
foo: &foo
  one: 1
bar:
  <<: *foo
  two: 2
' | dasel -r yaml

Describe the solution you'd like
Dasel should be able to render the YAML correctly, resolving anchors and aliases the way yq does above.

TomWright commented 1 year ago

In my work on ordered maps I am writing custom logic around yaml encoding/decoding.

I expect I'll be able to resolve this issue when reading, but writing could produce strange results: probably removal of the alias/merge and duplicated data.

The reason is that dasel, outside of encoding/decoding, is unaware of these concepts.

I think I will purposely disallow this for now until I can come up with a proper solution to keep context throughout the lifecycle.
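
To make the concern concrete, here is a minimal Ruby sketch (not dasel's code; it uses Ruby's standard YAML library as a stand-in for a generic decoder) showing that once the document is decoded into plain maps, nothing remains of the anchor, the alias or the merge key, so any re-encoding writes the shared data twice. JSON is used as the output format here only because its encoding is easy to compare.

```ruby
require 'yaml'
require 'json'

source = <<~DOC
  foo: &foo
    one: 1
  bar:
    <<: *foo
    two: 2
DOC

# Decoding resolves the alias and the merge key into ordinary hashes.
data = YAML.safe_load(source, aliases: true)

# The decoded structure carries no trace of "&foo", "*foo" or "<<".
merged_keys   = data['bar'].keys          # keys now present under bar
has_merge_key = data['bar'].key?('<<')    # did a literal "<<" key survive?

# Re-encoding therefore writes the shared values out twice.
encoded = JSON.generate(data)
puts encoded
```

The value of `one` appears under both `foo` and `bar` in the output, which is the "duplicate data" effect described above.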

electriquo commented 1 year ago

@TomWright What do you think of solving the issue partially, i.e. implementing only the reading? This would let users enjoy most of dasel's features.

TomWright commented 1 year ago

Perhaps behind a flag. I don't want people to accidentally remove all aliases from their files

electriquo commented 1 year ago

I don't want people to accidentally remove all aliases from their files

I wish dasel would preserve the aliases, anchors, references, order and comments from the input in its output. Currently this is not the case, so it might be best to stick to the same convention: dasel outputs static/rendered YAML.

$ echo '
foo: foo
baz: baz
bar: &bar
  dasel: rules
cheese:
  <<: *bar
# comment
' | dasel -r yaml
bar:
  dasel: rules
baz: baz
cheese:
  dasel: rules
foo: foo

With that being said, I don't think dasel should care, since dasel is mostly used to transform/manipulate the input through pipes (same as in the snippet above which is inspired by the docs) — piping in and piping out.

TomWright commented 1 year ago

The trouble comes because dasel decodes all input data into a generic format, before processing the data and encoding back into the desired format.

This is how the cross-format transformations are possible.

The issue is that the generic format doesn't know about aliases, references etc right now. The ideal solution is to obviously make it aware and deal with it properly when re-encoding but that isn't an easy task. That is what I plan to do, but it will take time.

A requirement for this is that all data goes through some more in-depth decoding so I can actually read that information. The plus here is that the same is required for ordered maps and so some of the work will be done.
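
As an illustration of what "more in-depth decoding" can expose, the following Ruby sketch (an analogy using Ruby's Psych node API, not dasel's Go implementation) walks the parsed node tree instead of the decoded values. At that level the anchor and alias names are still visible, and re-emitting the tree keeps them. The variable names are chosen for this example only.

```ruby
require 'yaml'

source = <<~DOC
  foo: &foo
    one: 1
  bar:
    <<: *foo
    two: 2
DOC

anchors = []  # names defined with &name
aliases = []  # names referenced with *name

# Psych.parse_stream returns the node tree; each visits every node depth-first.
Psych.parse_stream(source).each do |node|
  case node
  when Psych::Nodes::Alias
    aliases << node.anchor
  when Psych::Nodes::Mapping, Psych::Nodes::Sequence, Psych::Nodes::Scalar
    anchors << node.anchor if node.anchor
  end
end

# Emitting the untouched node tree keeps the anchor and the alias.
round_tripped = Psych.parse_stream(source).to_yaml

puts "anchors: #{anchors.inspect}, aliases: #{aliases.inspect}"
puts round_tripped
```

A generic format that wants to preserve aliases on write would need to carry this kind of node-level information alongside the decoded values.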

electriquo commented 1 year ago

Perhaps behind a flag

So from what I see, there is no need to place it behind a flag, as it preserves the current behavior of dasel.

The ideal solution is to obviously make it aware and deal with it properly when re-encoding but that isn't an easy task.

By the way, I am unfamiliar with any tool that does this. If dasel ever supports it, this functionality will differentiate it from other tools. Yet, I am unsure how much it will be used.

electriquo commented 1 year ago

@TomWright docker-compose is written in Go and has support for anchors, aliases and references. It also does not suffer from things like #278, #294, #245, #293, etc.

Maybe it is worth taking a look at docker-compose and learning from it.

TomWright commented 1 year ago

I am in the middle of reworking the yaml processing dasel does. A good bunch of those issues are solved with the rework, and some of them just aren't yaml processing issues.

All of the number formatting issues are due to the default output formatting of number values in golang, but I am thinking of solutions for those.

The fact that dasel parses into generic types massively complicates things in comparison to something like docker-compose that will be unmarshaling into known structs with expected types.

electriquo commented 1 year ago

@TomWright I switched to a tool of my own which consumes structured files (json, yaml, csv, etc.) and lets you manipulate them easily. I did it by reading the structured files, converting them to json and then applying a given manipulation (selection/transformation). I did not have to come up with any specific query language for the manipulation, as I used the built-in syntax and commands of the programming language that I used.

I can elaborate more on that, but maybe dasel can benefit from the same approach. I have a feeling it will simplify things.

TomWright commented 1 year ago

@electriquo If you have a link to the code I'd be more than happy to take a look

electriquo commented 1 year ago

@TomWright the code is not polished enough to share with the public, but let me share a short snippet

$ cat grabber.rb
#!/usr/bin/env ruby

require 'yaml'

file = ARGV[0]
query = ARGV[1]

data = File.read(file)
parsed = YAML.safe_load(data, aliases: true)
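# Apply the query to the local variable `parsed`.
# eval runs arbitrary Ruby, so only pass trusted queries.
result = eval("parsed#{query}")
puts(result.inspect)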

$ grabber.rb test.yaml '.keys().map {|e| e.upcase}'

pmeier commented 2 months ago

Any update on this @TomWright? I second @electriquo in that being able to read yaml files that use anchors would already solve 90%+ of the use cases.

For me, I just use dasel for all the config files rather than having to memorize a bunch of tools. Most of the time, I just read data from the files and pass it along to some other program. But the fact that dasel cannot deal with anchors and the like in yaml files means that I still need to have something like yq available.

TomWright commented 2 months ago

Honestly I'd forgotten about this, but thank you for the reminder. I'll see what I can do

pmeier commented 2 months ago

@TomWright any way I can help with this? If you can point me to the right parts of the source, I could try to send a PR.

TomWright commented 2 months ago

I do have a WIP locally but found the decoder/encoder may need some restructuring. I'm more than happy to accept PRs on the subject.

The code that needs to change is around these areas:

If you've got any questions please let me know. Apologies I haven't got this done yet, I've had a lot going on

pmeier commented 2 months ago

Thanks for the update! Some questions:

So for the first version I would propose dasel -f foo.yaml -w yaml converting

foo: &foo
  bar: 1
  baz: "baz"

spam:
  ham: "eggs"
  <<: *foo

into

foo:
  bar: 1
  baz: "baz"

spam:
  ham: "eggs"
  foo:
    bar: 1
    baz: "baz"

Basically just resolving the anchors instead of keeping track of them.

That would enable the read use case as I can then do something like

$ dasel -f foo.yaml '.spam.foo' -w json
{
  "bar": 1,
  "baz": "baz"
}

TomWright commented 2 months ago

Good question. Decoding only is a good first step.

We'll need to handle the new node type within the decoder, but all of the values end up being written to a reflect value so nothing outside of the decoder needs to be aware
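
A rough Ruby sketch of that idea, written against Ruby's Psych node types as an analogy (dasel's actual decoder is Go and may look quite different): the hypothetical `decode` helper below handles the alias node inside the decoder and returns only plain values, so code outside the decoder never sees aliases. It assumes the standard YAML merge-key semantics, keeps scalars as strings for brevity, and does not handle a list of maps under `<<`.

```ruby
require 'yaml'

# Hypothetical mini-decoder: turns a Psych node tree into plain Ruby values,
# resolving aliases and "<<" merge keys along the way.
def decode(node, anchors = {})
  value =
    case node
    when Psych::Nodes::Alias
      # The alias simply yields the value recorded for its anchor.
      return anchors.fetch(node.anchor)
    when Psych::Nodes::Scalar
      node.value # scalars stay strings in this sketch
    when Psych::Nodes::Sequence
      node.children.map { |child| decode(child, anchors) }
    when Psych::Nodes::Mapping
      # Mapping children alternate key, value, key, value, ...
      node.children.each_slice(2).each_with_object({}) do |(k, v), out|
        key = decode(k, anchors)
        val = decode(v, anchors)
        if key == '<<' && val.is_a?(Hash)
          # Merged entries never override keys already set on the mapping.
          out.merge!(val) { |_key, existing, _merged| existing }
        else
          out[key] = val
        end
      end
    when Psych::Nodes::Document, Psych::Nodes::Stream
      decode(node.children.first, anchors)
    end
  # Remember anchored values so later aliases can look them up.
  anchors[node.anchor] = value if node.respond_to?(:anchor) && node.anchor
  value
end

source = <<~DOC
  foo: &foo
    one: 1
  bar:
    <<: *foo
    two: 2
DOC

decoded = decode(Psych.parse(source))
puts decoded.inspect
```

Everything downstream of `decode` works with ordinary hashes, arrays and strings, which matches the point that only the decoder has to know about the new node type.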

TomWright commented 2 months ago

As of v2.8.0 dasel supports this feature when reading. Note that as of now, writes will de-reference the aliases.

I'm releasing as-is to unblock read use-cases - these never worked before anyway because of unhandled yaml tags so I'm not worried about breaking changes.

Issue will remain open as writes are not yet handled.

pmeier commented 2 months ago

Just for completeness: my original patch was buggy. The fix, and thus the proper reading behavior, is only available from v2.8.1 onward.