perlpunk / YAML-PP-p5

A YAML 1.2 processor in perl
https://metacpan.org/pod/YAML::PP

Booleans not as references #32

Closed bpj closed 4 years ago

bpj commented 4 years ago

Currently booleans in the data are dumped as YAML anchors and aliases, so that for example the first time false occurs the output says in_use: &1 false and every subsequent false becomes *1. It may be strictly speaking correct, and you save 3 chars each time, but it's not very human-readable. Is there a way to avoid this?

Note that I don't want to turn references off completely, just for booleans.

perlpunk commented 4 years ago

Yes, this is actually something I want to fix. JSON::PP and boolean.pm both reuse the same reference when returning true or false, so I have to make an exception for those modules. Thinking of implementing my own boolean module ;-)
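The shared-reference behaviour described above can be observed with core modules alone. This is a minimal sketch, not code from YAML::PP; same_object is a hypothetical helper name, and the hash keys are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP ();
use Scalar::Util qw(refaddr);

# Hypothetical helper: true when both values are the very same reference.
sub same_object {
    my ($x, $y) = @_;
    return ( ref($x) && ref($y) && refaddr($x) == refaddr($y) ) ? 1 : 0;
}

# Two unrelated keys, both set to "false".
my $data = {
    in_use   => JSON::PP::false(),
    archived => JSON::PP::false(),
};

# JSON::PP hands out one shared object per boolean value, so a dumper that
# tracks repeated references sees the second value as a repeat of the first.
printf "shared: %s\n",
    same_object( $data->{in_use}, $data->{archived} ) ? 'yes' : 'no';
```

Because both values are one and the same object, a dumper has no way to tell them apart from an intentionally repeated reference unless it special-cases the boolean classes.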

bpj commented 4 years ago

I guessed that was the reason. I have been known to use Type::Tiny types which accept either a JSON::PP::Boolean object, a boolean.pm object or the digits 0 and 1. As long as my data contained no *<number> substrings in its strings, a couple of Vim commands would do the trick, but I/we can no longer be sure of that. Unfortunately all I can contribute ATM is a +1. I've looked at the code but haven't got the time to digest it, so if I tried to fix it I would probably break something else! :)

bpj commented 4 years ago

I can at least offer a reasonably safe regex for a workaround. It is safe as long as no block scalar contains a line which looks like a matching data line.

Customary warning: always back up/commit before doing this!

Warning: None of the versions below are safe for lines in block scalars which look like KEY: *NUMBER or - *NUMBER. We believe our data doesn't contain any such lines, but do consider this for your own data!

And now the fun part:

We realized this morning that for most cases this Vim substitution does the right thing:

:%s#\v^\s*%(\w+\:|\-\s+)\zs\s*[*]2\s*$# false#

That is: any mapping or list value which consists only of *2 is replaced with false.

Naturally the 2 in the regex and the false in the replacement need to be adjusted for other reference numbers and/or the true case. Note that this substitution does some whitespace normalization on the side! :) Note also that, as written, this Vim regex matches only keys consisting of ASCII alphanumerics and underscores. In recent versions of Vim the following matches a substring of any Unicode letters (with or without combining marks), underscores or ASCII digits:

\v%(\Z[[:lower:][:upper:]0-9_]+)

Note that Vim's [:digit:] char class unfortunately isn't Unicode aware the same way [:upper:] and [:lower:] are.

I whipped together a Perl text filter for doing the same thing:

https://git.io/JfFId

Note: Don't delete your &2 false or whatever anchors, since there may be leftovers which this fix doesn't catch, however unlikely.
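For readers without Vim, the substitution above can be approximated in Perl. This is a rough sketch and not the filter linked above; unalias_text and its parameters are hypothetical names. It shares the limitations already mentioned: it only handles lines of the form KEY: *NUMBER or - *NUMBER, and it cannot tell block scalar content from data lines.

```perl
use strict;
use warnings;

# Hypothetical helper: replace a bare alias such as *2 with a literal
# (e.g. "false") wherever it is the entire value of a "key:" line or a
# "- " list item. Everything else is left untouched.
sub unalias_text {
    my ( $text, $number, $literal ) = @_;
    $text =~ s{
        ^( [ \t]* (?: \w+: | -(?=[ \t]) ) )   # indentation plus "key:" or "-"
        [ \t]* \* \Q$number\E [ \t]* $        # the alias alone, up to end of line
    }{$1 $literal}xmg;
    return $text;
}

my $yaml = "in_use: *2\nnote: see *2 here\n- *2\n";
print unalias_text( $yaml, 2, 'false' );
```

In this sample only the first and third lines change, becoming in_use: false and - false; the middle line is kept because the alias is not the whole value.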

perlpunk commented 4 years ago

A fix 392e703b is in master and I hope to release 0.023 soon...

bpj commented 4 years ago

Meanwhile I came up with a less error-prone fix which actually clones the booleans in the Perl data so that they aren't all references to the same object anymore.

package CloneBool;

use 5.014;
use utf8;
use strict;
use warnings;

our $VERSION = 0.001;

use Data::Rmap qw[rmap_to :types];
use Scalar::Util qw[blessed];

use Exporter::Shiny qw[ clone_bools ];

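# Return a fresh copy of a JSON::PP::Boolean; pass anything else through.
sub _clone_bool {
  my($v) = @_;
  my $class = blessed($v) || return $v;
  $class->isa('JSON::PP::Boolean') || return $v;
  # Copy the underlying 0/1 and bless a new reference to it,
  # so the copy no longer shares an address with the original.
  my $copy = $$v;
  return bless \$copy => $class;
}

# Walk the data and replace every boolean held directly
# by a hash or an array with its own clone.
sub clone_bools {
  my($data) = @_;
  rmap_to {
    my $c = $_;
    my $ref = ref $c;
    if ( 'HASH' eq $ref ) {
      $c = +{ map {; $_ => _clone_bool $c->{$_} } keys %$c };
    }
    elsif ( 'ARRAY' eq $ref ) {
      $c = [ map {; _clone_bool $_ } @$c ];
    }
    # Assigning to $_ replaces the current node with the rebuilt container.
    $_ = $c;
  } HASH|ARRAY, $data;
  return $data;
}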

1;

Usage:

#!/usr/bin/env perl

use 5.014;
use utf8;
use strict;
use warnings;

use YAML::PP;
use CloneBool qw[clone_bools];

my $ypp = YAML::PP->new(
  schema => ['JSON'],
  boolean => 'JSON::PP',
);

my $data = $ypp->load_file('data.yaml');

$data = clone_bools $data;

$ypp->dump_file('data1.yaml', $data);
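The effect of the cloning can be checked without YAML::PP or Data::Rmap. The following is a core-only sketch of the same idea, not the CloneBool module above; clone_bools_plain is a hypothetical name, and the sketch assumes the data is a tree of plain hashes and arrays without cycles.

```perl
use strict;
use warnings;
use JSON::PP ();
use Scalar::Util qw(blessed refaddr);

# Hypothetical core-only variant: walks plain hashes and arrays in place
# and swaps every JSON::PP::Boolean for a freshly blessed copy.
# Assumes no cyclic references.
sub clone_bools_plain {
    my ($node) = @_;
    if ( blessed($node) && $node->isa('JSON::PP::Boolean') ) {
        my $copy = $$node;
        return bless \$copy, ref $node;
    }
    if ( ref $node eq 'HASH' ) {
        # values() yields aliases, so assigning to $_ updates the hash.
        $_ = clone_bools_plain($_) for values %$node;
    }
    elsif ( ref $node eq 'ARRAY' ) {
        $_ = clone_bools_plain($_) for @$node;
    }
    return $node;
}

my $data = clone_bools_plain(
    { in_use => JSON::PP::false(), flags => [ JSON::PP::false(), JSON::PP::true() ] }
);

# After cloning, the two "false" values live at different addresses.
print refaddr( $data->{in_use} ) == refaddr( $data->{flags}[0] )
    ? "still shared\n"
    : "distinct objects\n";
```

Since each copy keeps its class and its 0/1 value, code that tests the booleans should behave as before; only the object identity changes.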