ingydotnet / yaml-libyaml-pm

Perl Binding to libyaml
http://search.cpan.org/dist/YAML-LibYAML/

Loaded self-referential structures have only one reference per referent #77

Open Kodiologist opened 6 years ago

Kodiologist commented 6 years ago

This issue is a copy of https://rt.cpan.org/Public/Bug/Display.html?id=53278, originally reported 1 Jan 2010. The bug still exists as of YAML-LibYAML 0.69.

The problem is best explained with an example. This works as you would expect:

$ perl -e 'use YAML::XS; my $a = [["a"], ["b"]]; push @$a, $a->[0]; $a->[0] = "x"; print Dump($a), "\n";'
---
- x
- - b
- - a

But if you dump and reload $a between the push and the second assignment, you get this:

$ perl -e 'use YAML::XS; my $a = [["a"], ["b"]]; push @$a, $a->[0]; $a = Load Dump $a; $a->[0] = "x"; print Dump($a), "\n";'
---
- x
- - b
- x

Assigning to $a->[0] mysteriously changed $a->[2] as well. Sorta-kinda-workaround: YAML.pm doesn't have this bug. Also, recreating $a after loading it prevents the weirdness:

$ perl -e 'use YAML::XS; my $a = [["a"], ["b"]]; push @$a, $a->[0]; $a = Load Dump $a; $a = [@$a]; $a->[0] = "x"; print Dump($a), "\n";'
---
- x
- - b
- - a
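The symptom can be imitated in plain Perl without YAML::XS, which may help when reasoning about what the loader does. The following is a minimal sketch, assuming (this is a guess at the cause, not something confirmed at this point in the thread) that the loaded array ends up with slots 0 and 2 being the very same scalar rather than two scalars holding references to the same inner array. The helper names `aliased_array` and `shared_ref_array` are made up for illustration, and the `sub { \@_ }` trick is just a convenient core-Perl way to build an array of aliases.

```perl
use strict;
use warnings;

# Build an array whose slots 0 and 2 are the *same* scalar (true aliases).
# Elements of @_ alias the caller's arguments, so \@_ preserves the aliasing.
sub aliased_array {
    my $inner = ["a"];
    return sub { \@_ }->($inner, ["b"], $inner);
}

# Ordinary structure: slots 0 and 2 are distinct scalars that each hold
# their own reference to the same inner array.
sub shared_ref_array {
    my $inner = ["a"];
    return [$inner, ["b"], $inner];
}

my $aliased = aliased_array();
$aliased->[0] = "x";    # also overwrites slot 2, because it is the same scalar
print ref($aliased->[2]) ? "aliased: slot 2 untouched\n"
                         : "aliased: slot 2 is now '$aliased->[2]'\n";

my $shared = shared_ref_array();
$shared->[0] = "x";     # only replaces the reference stored in slot 0
print ref($shared->[2]) ? "shared: slot 2 untouched\n"
                        : "shared: slot 2 is now '$shared->[2]'\n";
```

The first structure behaves like the reloaded `$a` above, the second like the original `$a`. Copying with `[@$a]` gives every slot a fresh scalar, which would explain why the workaround helps.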

(This is perl 5, version 26, subversion 0 (v5.26.0) built for x86_64-linux-gnu-thread-multi. Output of uname -a: Linux Xorn 4.13.0-21-generic #24-Ubuntu SMP Mon Dec 18 17:29:16 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux.)

perlpunk commented 6 years ago

thanks, confirmed. also, YAML::Syck and YAML::PP don't have this bug. when using Devel::Peek::Dump, I see a difference in the reference count, but no idea yet how to solve this.

perlpunk commented 6 years ago

This can also be reproduced by loading the following YAML:

---
- &1
  - a
- - b
- *1

I can also see that the IV containing the RV has the same address for the alias and the anchor node, but only the RV should have the same address. Currently looking into load_alias() and learning about dereferencing nodes.
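A rough way to check the same thing from Perl, without reading Devel::Peek output, is to compare addresses at two levels: the slot itself and the data it points to. This is only a sketch; `slot_report` is a hypothetical helper, not part of YAML::XS, and the example runs it on a hand-built structure rather than on loader output.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper: compares two array slots at two levels.
#   same_slot     - both indexes are the very same scalar (a true alias)
#   same_referent - both slots hold references to the same data
sub slot_report {
    my ($array, $i, $j) = @_;
    my ($x, $y) = ($array->[$i], $array->[$j]);
    return {
        same_slot     => (\$array->[$i] == \$array->[$j]) ? 1 : 0,
        same_referent => (ref $x && ref $y && refaddr($x) == refaddr($y)) ? 1 : 0,
    };
}

# Hand-built equivalent of the YAML above: two slots, one inner array.
my $inner = ["a"];
my $data  = [$inner, ["b"], $inner];

my $r = slot_report($data, 0, 2);
print "same_slot=$r->{same_slot} same_referent=$r->{same_referent}\n";
# prints: same_slot=0 same_referent=1
```

Running `slot_report` on the result of `Load` for the YAML above should show whether the anchor and alias slots share one scalar; going by the addresses described here, `same_slot` would presumably come out as 1.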

perlpunk commented 6 years ago

The reason for this behaviour is probably that the nodes are treated like real aliases, so this also works for scalars:

use feature 'say';
use YAML::XS;
my $data = Load("{ a: &alias X, b: *alias }");
say $data->{b};
$data->{a} = "Y";
say $data->{b};
__END__
X
Y

I fixed it by only modifying the behaviour for ARRAY and HASH references. For scalars like strings, but also regexes, it stays the same.
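The idea can be modelled in plain Perl. The real change lives in the XS code, so the following is only a sketch of the rule as described here, not the actual patch. `resolve_alias` and `load_pair` are hypothetical names, and the `sub { \@_ }` trick stands in for storing the same scalar in two places.

```perl
use strict;
use warnings;

# Hypothetical model of alias resolution. Takes a reference to the anchored
# scalar and returns a reference to the scalar that the alias node should use.
sub resolve_alias {
    my ($anchor_ref) = @_;
    my $type = ref $$anchor_ref;
    if ($type eq 'ARRAY' || $type eq 'HASH') {
        my $fresh = $$anchor_ref;   # new scalar, new reference, same referent
        return \$fresh;
    }
    return $anchor_ref;             # strings, regexes: keep the real alias
}

# Builds a two-slot array: slot 0 is the anchor node, slot 1 the alias node.
sub load_pair {
    my ($anchor_ref) = @_;
    my $alias_ref = resolve_alias($anchor_ref);
    return sub { \@_ }->($$anchor_ref, $$alias_ref);
}

my $list = ["a"];
my $pair = load_pair(\$list);
$pair->[0] = "x";
print ref($pair->[1]) ? "array alias kept its data\n" : "array alias changed too\n";

my $str   = "X";
my $spair = load_pair(\$str);
$spair->[0] = "Y";
print "scalar alias is now $spair->[1]\n";
```

In this model the array case no longer changes when the anchor slot is overwritten, while the string case still follows the anchor, matching the X/Y output shown above.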

perlpunk commented 6 years ago

The question is whether the current behaviour is actually desired. The pure Perl modules don't do real aliases at all, but sharing still works for arrays and hashes because they are references. Having real aliases can also be seen as a feature.

Cc @ingydotnet

Kodiologist commented 4 years ago

This bug is officially a decade old. Happy tenth, bug.