pantoniou / libfyaml

Fully feature complete YAML parser and emitter, supporting the latest YAML spec and passing the full YAML testsuite.
MIT License

Prettify mode limits, questions, and plans #104

Open vaab opened 5 months ago

vaab commented 5 months ago

Hi, I've been using libyaml, yq, and libfyaml, and I noticed that libfyaml has a pretty mode. It does not behave quite as I expected, so I am wondering what this mode is intended to do and what it is responsible for. Let me also lay out my understanding so far. I know there are many inconsistencies between tools: some are intentional, and others are bugs or functionality yet to be implemented.

Let's start by creating this YAML file:

cat <<EOF > /tmp/test.yaml
%YAML 1.1
%TAG !e! tag:example.com,2000:app/
---
!!map {
  ? !!str "sequence"
  : !!seq [ !e!bar "one", !!str "two", 1, 3.0, !!str "2010-01-01" ],
  ? !!str "mapping"
  : !!map {
    ? !!str "sky" : !myobj "blue",
    ? !!str "sea" : !!foo "green",
  },
}
EOF

Notice several points:

Here is the prettified version of this code using yq -P /tmp/test.yaml:

%YAML 1.1
sequence:
  - !<tag:example.com,2000:app/bar> one
  - two
  - 1
  - 3.0
  - "2010-01-01"
mapping:
  sky: !myobj blue
  sea: !!foo green

Here is the prettified version of this code using cat /tmp/test.yaml | shyaml get-value (shyaml is a Python wrapper around libyaml):

sequence:
- !<tag:example.com,2000:app/bar> one
- two
- 1
- 3.0
- '2010-01-01'
mapping:
  sky: !myobj blue
  sea: !!foo green

We notice that the two outputs are very similar, which is encouraging. Both went through:

Using fy-tool --dump --mode pretty /tmp/test.yaml:

%YAML 1.1
%TAG !e! tag:example.com,2000:app/
--- !!map
!!str "sequence": !!seq
- !e!bar "one"
- !!str "two"
- 1
- 3.0
- !!str "2010-01-01"
!!str "mapping": !!map
  !!str "sky": !myobj "blue"
  !!str "sea": !!foo "green"

We notice it is properly:

But it doesn't:

Keeping the tag directive does not sound problematic in itself, since the YAML document is still complete. It does mean, however, that developers should not assume the output of fy_emit_node_to_string(..) is complete by itself: they should not forget to call fy_emit_document_start(..) first.

Would you welcome PRs addressing some (or all) of these points? Or would you like to share your stance on these topics?

I already have some work done on these.

I'm sure I have also missed important information here, so feel free to correct me.

vaab commented 5 months ago

And of course, many thanks for this new lib, which is more than welcome!

vaab commented 5 months ago

Also, as a consequence of libfyaml not distinguishing between different types of scalar (AFAIK), no distinction is made between an empty node and an empty string value. As food for thought:

Libyaml:

$ echo "a: " | shyaml get-value -y a
null
$ echo "a: ''" | shyaml get-value -y a
''

yq:

$ echo "a: " | yq -r=false '.a'   ## hmm, surprising answer

$ echo "a: " | yq '.a | type'
!!null
$ echo "" | yq '. | type'  ## but consistent
!!null
$ echo "a: ''" | yq -r=false '.a'
''
$ echo "a: ''" | yq '.a | type'
!!str

pantoniou commented 2 weeks ago

Yes, all these are a consequence of libfyaml currently only supporting the builtin schema (where all scalars are strings). There is work underway to support the default object schemas (and more) but it's not on master yet.
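
To make the difference concrete, here is a minimal Python sketch contrasting an "all scalars are strings" resolution with a simplified version of the YAML 1.2 core schema rules. This is not libfyaml code: the function names resolve_all_strings and resolve_core are hypothetical, and the patterns cover only a subset of the core schema (no hex or octal integers, no .inf or .nan).

```python
import re

# Simplified patterns modeled on the YAML 1.2 core schema (assumption: subset only).
NULL_RE = re.compile(r"null|Null|NULL|~|")
BOOL_RE = re.compile(r"true|True|TRUE|false|False|FALSE")
INT_RE = re.compile(r"[-+]?[0-9]+")
FLOAT_RE = re.compile(r"[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?")


def resolve_all_strings(text, quoted=False):
    """Every scalar is a string, so an empty node and '' look the same."""
    return "!!str"


def resolve_core(text, quoted=False):
    """Quoted scalars are strings; plain scalars are matched against patterns."""
    if quoted:
        return "!!str"
    if NULL_RE.fullmatch(text):
        return "!!null"
    if BOOL_RE.fullmatch(text):
        return "!!bool"
    if INT_RE.fullmatch(text):
        return "!!int"
    if FLOAT_RE.fullmatch(text):
        return "!!float"
    return "!!str"


# (scalar text, was it quoted in the source?)
samples = [("", False), ("", True), ("1", False), ("3.0", False), ("2010-01-01", True)]
for text, quoted in samples:
    print(repr(text), quoted, resolve_all_strings(text, quoted), resolve_core(text, quoted))
```

With these rules, the plain empty scalar from "a: " resolves to !!null while the quoted '' stays !!str, which matches the libyaml and yq results quoted earlier in the thread. Under the all-strings resolution both come out as !!str, so the two cases cannot be told apart.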

Leaving this open for now; I will revisit it later.