Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
7.8k stars, 626 forks

Set `resolve_entities=False` by default in `lxml` parser for `partition_xml` #3078

Closed. MthwRobinson closed this issue 2 months ago.

MthwRobinson commented 2 months ago

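The goal of this issue is to update `partition_xml` so that the `lxml` parser is created with `resolve_entities=False` by default. This prevents entity content, in particular content pulled in from external files, from being injected into the parsed XML.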
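
To illustrate the setting being discussed, here is a minimal sketch of how `resolve_entities` changes what `lxml` returns. It is not the `partition_xml` implementation: the helper `root_text` and the sample document are hypothetical, and the sample uses a harmless internal entity in place of an external `SYSTEM` entity so that it can run without touching the filesystem.

```python
from lxml import etree

# Hypothetical sample input. A real attack would typically declare an
# external entity (e.g. SYSTEM "file:///...") rather than an internal one.
XML_WITH_ENTITY = b"""<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY injected "INJECTED TEXT">
]>
<note>&injected;</note>
"""


def root_text(xml_bytes, resolve_entities):
    """Parse xml_bytes and return the root element's leading text (hypothetical helper)."""
    # resolve_entities is a documented keyword argument of lxml's XMLParser.
    parser = etree.XMLParser(resolve_entities=resolve_entities)
    root = etree.fromstring(xml_bytes, parser)
    return root.text


# Entity expansion enabled: the entity's replacement text becomes element text.
resolved = root_text(XML_WITH_ENTITY, resolve_entities=True)

# Entity expansion disabled: the reference is kept as an unexpanded entity
# node, so the replacement text never reaches the element's text.
unresolved = root_text(XML_WITH_ENTITY, resolve_entities=False)
```

With `resolve_entities=False`, the `&injected;` reference stays in the tree as an entity node instead of being replaced by its content, which is the behavior this issue asks `partition_xml` to adopt by default.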

References