shlomif / perl-XML-LibXML

The XML-LibXML CPAN Distribution for Processing XML using the libxml2 library
https://metacpan.org/release/XML-LibXML
Other
17 stars 35 forks source link

When we set option complete_attributes, it will also enable load_ext_dtd #47

Closed thibaultduponchelle closed 3 years ago

thibaultduponchelle commented 4 years ago

Hello,

First, THANK YOU a lot for maintaining this great module :heart:

I noticed that XML::LibXML is not setting load_ext_dtd implicitly when we set complete_attributes from both setter and option. It is not the same behavior than expand_entities nor xmllint --dtdattr (xmllint | grep dtdattr says --dtdattr : loaddtd + populate the tree with inherited attributes)

It means that XML::LibXML->load_xml(string => $xml, expand_entities => 1, complete_attributes => 1); will not do the same than XML::LibXML->load_xml(string => $xml, complete_attributes => 1);

As any other dtd related option, I think we want to set implicitly the load_ext_dtd option.

This pull request is doing this change and I provide a set of tests that will fail without the fix. In t/06elements.t, it is inhibited by the fact that another DTD related option is given.

In addition, I have the feeling that the documentation is not totally correct. For instance https://metacpan.org/pod/distribution/XML-LibXML/lib/XML/LibXML/Parser.pod#load_ext_dtd says Unless specified, XML::LibXML sets this option to 1 but I feel it's wrong.

Same with https://metacpan.org/pod/distribution/XML-LibXML/lib/XML/LibXML/Parser.pod#expand_entities says that default is 1 and I feel the contrary.

Concerning this last point about documentation, I just report my feelings, asking for your confirmation/thoughts.

For the rest, I hope this PR will fit your merge quality criteria.

Best regards and once again thank you for this amazing module.

Thibault

shlomif commented 4 years ago

Hi @thibaultduponchelle ! Thanks for the pull-req, and sorry it took me so long to reply. That put aside, I do not wish to implicitly enable load_ext_dtd on such cases due to the previously-patched security concerns. What do you think?

thibaultduponchelle commented 4 years ago

Hello,

I understand the security concerns about xxe injections and the disabling of expand_entities and load_ext_dtd but here I'm not going against security I think.

Concerning security, I see 4 differents threats (tell me if I miss something) :

I feel like load_dtd alone is not a threat

If we look at expand_entities (dangerous option), if we set expand_entities to 1, it sets to 1 load_ext_dtd because... it is a prerequisite to make expand_entities works fine : https://github.com/shlomif/perl-XML-LibXML/blob/master/LibXML.pm#L369 This line was added by "security fix PR" #39

Then if we look back to complete_attributes...

Could it enable something bad implicitly ?

To add one more argument xmllint is setting both options with --dtdattr exactly the behaviour that I'm trying to merge :)

If you are still not convinced, I think XML::LibXML should then be consistent and change the behaviour for expand_entities to also do not set load_ext_dtd :
https://github.com/shlomif/perl-XML-LibXML/blob/master/LibXML.pm#L369 https://github.com/shlomif/perl-XML-LibXML/blob/master/LibXML.pm#L632

But honestly I still think that it is a good idea to merge this PR :)

For me, it is perfectly fine to have the dangerous options off by default I'm totally ok with that, but when an user knows what he's doing and want to set dangerous options, XML::LibXML should do the job asked and set required set of options (not only half of them) to make it work.

Best regards.

Thank you a lot for this amazing module.

Thibault

thibaultduponchelle commented 3 years ago

Hello,

I realize that my previous comment is not that clear :hushed:

To make it short, I align on libxml and fix inconsistent behaviour.

What do you think @pali @timretout ? (as you were involved in security related changes recently)

What do you think @shlomif ?

Thank you all in advance.