shlomif / perl-XML-LibXML

The XML-LibXML CPAN Distribution for Processing XML using the libxml2 library
https://metacpan.org/release/XML-LibXML

validation succeeds even though the DTD could not be loaded #71

Open vinc17fr opened 2 years ago

vinc17fr commented 2 years ago

The change #39 to the default value of load_ext_dtd introduced an unexpected issue, with possible security implications: when validation is set to 1 without load_ext_dtd also being set to 1, the document is always regarded as valid.

Existing scripts that set validation to 1 probably do not set load_ext_dtd to 1 explicitly, both because 1 used to be the default and because a user who asks for validation obviously also wants the DTD to be loaded, since the DTD is needed for validation. So this change silently breaks validation. This may have security implications, as validation is commonly used to check that input from an untrusted source does not contain unexpected content (e.g. content that could enable data injection).

See for instance: https://cwe.mitre.org/data/definitions/112.html

Example:

#!/usr/bin/env perl

# Reproduce: validation succeeds although the DTD cannot be loaded.

use strict;
use warnings;
use XML::LibXML;

my $s = <<EOF;
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE root SYSTEM "does-not-exist.dtd">
<root/>
EOF

my $parser = XML::LibXML->new();
$parser->validation(1);
my $doc = $parser->parse_string($s);

Before the change of the load_ext_dtd default value, the fact that the DTD could not be loaded was properly reported, with a fatal error:

:2: I/O error : failed to load external entity "does-not-exist.dtd"
<!DOCTYPE root SYSTEM "does-not-exist.dtd">
                                           ^
:3: validity error : Validation failed: no DTD found !
<root/>
     ^
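
For scripts affected by this, a possible workaround is to request DTD loading explicitly instead of relying on the default. The following is only a sketch based on the description above: it assumes that passing load_ext_dtd => 1 together with validation => 1 restores the earlier behaviour, where a missing DTD makes parsing fail. The variable names and the eval wrapper are illustrative and not taken from the original report.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use XML::LibXML;

# Same document as in the report: the DTD it refers to does not exist.
my $xml = <<'EOF';
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE root SYSTEM "does-not-exist.dtd">
<root/>
EOF

# Ask for both validation and external DTD loading explicitly,
# so the result does not depend on the default of load_ext_dtd.
my $parser = XML::LibXML->new(
    validation   => 1,
    load_ext_dtd => 1,
);

# parse_string dies on a fatal parse or validity error; trap it with eval.
my $doc   = eval { $parser->parse_string($xml) };
my $error = $@;

if (defined $doc) {
    print "document accepted\n";
}
else {
    print "document rejected: $error";
}
```

Setting both options in one place keeps the intent visible to later readers of the script and makes it independent of which default a given XML::LibXML release uses.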