ruby / rexml

REXML is an XML toolkit for Ruby
BSD 2-Clause "Simplified" License

Add local entity expansion limit to `REXML::Parsers::StreamParser` #192

Closed: otegami closed this issue 2 weeks ago

otegami commented 1 month ago

Currently, REXML only allows changing the entity expansion text limit globally, via `REXML::Security.entity_expansion_text_limit`. Because the setting is global, it affects every parsing operation in the application, so raising it for one task might unintentionally weaken parts of the system where a lower limit is preferable for security.
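For illustration, here is a minimal sketch of the workaround the global setting forces on callers today. The helper name `with_global_entity_expansion_text_limit` is hypothetical and not part of REXML; only the `REXML::Security.entity_expansion_text_limit` getter and setter are existing API. Even with the restore step, every other parse running in the process during the block sees the raised limit, which is the side effect described above.

```ruby
require "rexml/document"

# Hypothetical helper (not part of REXML): raise the global limit for the
# duration of the block, then restore the previous value.
def with_global_entity_expansion_text_limit(limit)
  previous = REXML::Security.entity_expansion_text_limit
  begin
    # This changes the limit for ALL parsers in the process, not just ours.
    REXML::Security.entity_expansion_text_limit = limit
    yield
  ensure
    # Restore the previous global value even if parsing raised.
    REXML::Security.entity_expansion_text_limit = previous
  end
end
```

A caller would wrap only the parse that needs the higher limit, for example `with_global_entity_expansion_text_limit(163_840) { parser.parse }`, but the change is still visible to other threads while the block runs.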

Real-world Use Case

While processing a large XML dataset of Wikipedia articles, we needed to temporarily raise the entity expansion text limit for specific parsing operations involving large data elements. Because the current setting is global, we had to change the limit for the whole process, which was not ideal.

ref: https://github.com/red-data-tools/red-datasets/pull/198

Proposed

I propose adding an instance-level setter for the entity expansion text limit to `REXML::Parsers::StreamParser`. Developers could then adjust the limit for individual parser instances without affecting the global configuration.

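```ruby
# `entry` (an IO-like source) and `listener` come from the calling code.
parser = REXML::Parsers::StreamParser.new(entry.read, listener)
# Proposed: raise the limit for this parser instance only.
parser.entity_expansion_text_limit = 163_840
parser.parse
```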
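To make the proposal concrete, here is a self-contained sketch of how a caller might use it. `TextCollector` and `parse_with_local_limit` are hypothetical names introduced for this example. The `entity_expansion_text_limit=` setter on the parser is the API proposed in this issue, so the sketch guards it with `respond_to?` in case the installed rexml does not provide it.

```ruby
require "rexml/document"
require "rexml/streamlistener"

# Hypothetical listener that collects the text nodes it sees.
class TextCollector
  include REXML::StreamListener

  attr_reader :texts

  def initialize
    @texts = []
  end

  # Called by the stream parser for each text node.
  def text(text)
    @texts << text
  end
end

# Hypothetical helper: parse `xml` with a per-parser limit when available.
def parse_with_local_limit(xml, limit)
  listener = TextCollector.new
  parser = REXML::Parsers::StreamParser.new(xml, listener)
  # Proposed instance-level setter; skipped if this rexml version lacks it,
  # in which case the global REXML::Security limit still applies.
  if parser.respond_to?(:entity_expansion_text_limit=)
    parser.entity_expansion_text_limit = limit
  end
  parser.parse
  listener.texts
end
```

The point of the design is that `REXML::Security.entity_expansion_text_limit` is never written, so other parsers in the same process keep the stricter global default.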

Adding this feature would provide the following benefits.

naitoh commented 3 weeks ago

@otegami Thanks for this proposal.

ref: https://github.com/red-data-tools/red-datasets/pull/198

BTW, I think the above case is caused by https://github.com/ruby/rexml/pull/195.

I think it was resolved in rexml 3.3.5.