ingydotnet / yaml-pm

YAML Perl Module
http://search.cpan.org/dist/YAML/
20 stars 27 forks source link

LoadFile blocks at 100% CPU for malformed YAML file in certain Perl versions #187

Closed crazytan closed 6 years ago

crazytan commented 6 years ago

I noticed this strange behavior when my colleague fed an invalid YAML file to my script:

a: "aaaaaaaaaaaaaaaaaaaaaaabbbb
c: d

The perl process took 100% CPU and it never exited. The relevant code looks like:

my $conf = [ LoadFile($file) ];

After some testing I found that it only happens on Perl versions before 5.18.4. For all the newer versions the function works as expected. This is a strange bug and it may lead to potential Denial-of-Service attack.

crazytan commented 6 years ago

Yeah I'm using YAML 1.24. Seems that perl gets stuck on https://github.com/ingydotnet/yaml-pm/blob/master/lib/YAML.pm#L91

perlpunk commented 6 years ago

The problem is the regex in this line: https://github.com/ingydotnet/yaml-pm/blob/master/lib/YAML/Loader.pm#L531 This has been problematic already in the past....

perlpunk commented 6 years ago

I believe this https://github.com/ingydotnet/yaml-pm/issues/186 is also related

crazytan commented 6 years ago

I'm not an experienced perl programmer.. but I doubt that is the cause. Because if I use Load instead of LoadFile, it seems to work regardless of what perl version I use:

my $conf = Load(<<'...');
a: "aaaaaaaaaaaaaaaaaaaaaaabbbb
c: d
...

which is very strange to me.

perlpunk commented 6 years ago

Yeah, I don't know why, but the line is exactly where it hangs when I try it with LoadFile and perl 5.16.3

crazytan commented 6 years ago

Ignore my comment then, sorry... Anyway I'm able to circumvent this issue by using Load everywhere.

perlpunk commented 6 years ago

This patch works for me. Could you try it out, too?

diff --git a/lib/YAML/Loader.pm b/lib/YAML/Loader.pm
index 1d45fa0..d0f44cd 100644
--- a/lib/YAML/Loader.pm
+++ b/lib/YAML/Loader.pm
@@ -527,13 +527,22 @@ sub _parse_inline_seq {
 sub _parse_inline_double_quoted {
     my $self = shift;
     my $node;
-    # https://rt.cpan.org/Public/Bug/Display.html?id=90593
-    if ($self->inline =~ /^"((?:(?:\\"|[^"]){0,32766}){0,32766})"\s*(.*)$/) {
-        $node = $1;
-        $self->inline($2);
-        $node =~ s/\\"/"/g;
-    }
-    else {
+    my $inline = $self->inline;
+    {
+        if ($inline =~ s/^"//) {
+            my $node = '';
+
+            while ($inline =~ s/^(\\.)// or $inline =~ s/^([^"\\]+)//) {
+                my $capture = $1;
+                $capture =~ s/\\"/"/g;
+                $node .= $capture;
+                last unless length $inline;
+            }
+            if ($inline =~ s/^"//) {
+                $self->inline($inline);
+                return $node;
+            }
+        }
         $self->die('YAML_PARSE_ERR_BAD_DOUBLE');
     }
     return $node;
crazytan commented 6 years ago

Works for me too. Thanks!

perlpunk commented 6 years ago

Cool, thanks!

perlpunk commented 6 years ago

Btw, you can reproduce it with Load if you decode the string before:

use Encode;
my $conf = decode_utf8 Load(<<'...');
a: "aaaaaaaaaaaaaaaaaaaaaaabbbb
c: d
...
crazytan commented 6 years ago

huh, it's encoding issue then. thanks for letting me know

perlpunk commented 6 years ago

Released YAML 1.24_001