UAlbertaALTLab / itwewina

Replaced by https://github.com/UAlbertaALTLab/cree-intelligent-dictionary
GNU General Public License v3.0

Update lxml to 4.3.3 #138

Closed pyup-bot closed 5 years ago

pyup-bot commented 5 years ago

This PR updates lxml from 3.4.2 to 4.3.3.

Changelog

### 4.3.3
```
==================

Bugs fixed
----------

* Fix leak of output buffer and unclosed files in ``_XSLTResultTree.write_output()``.
```

### 4.3.2
```
==================

Bugs fixed
----------

* Crash in 4.3.1 when appending a child subtree with certain text nodes.

Other changes
-------------

* Built with Cython 0.29.6.
```

### 4.3.1
```
==================

Bugs fixed
----------

* LP1814522: Crash when appending a child subtree that contains unsubstituted entity references.

Other changes
-------------

* Built with Cython 0.29.5.
```

### 4.3.0
```
==================

Features added
--------------

* The module ``lxml.sax`` is compiled using Cython in order to speed it up.
* GH267: ``lxml.sax.ElementTreeProducer`` now preserves the namespace prefixes. If two prefixes point to the same URI, the first prefix in alphabetical order is used. Patch by Lennart Regebro.
* Updated ISO-Schematron implementation to 2013 version (now MIT licensed) and the corresponding schema to the 2016 version (with optional "properties").

Other changes
-------------

* GH270, GH271: Support for Python 2.6 and 3.3 was removed. Patch by hugovk.
* The minimum dependency versions were raised to libxml2 2.9.2 and libxslt 1.1.27, which were released in 2014 and 2012 respectively.
* Built with Cython 0.29.2.
```

### 4.2.6
```
==================

Bugs fixed
----------

* LP1799755: Fix a DeprecationWarning in Py3.7+.
* Import warnings in Python 3.6+ were resolved.
```

### 4.2.5
```
==================

Bugs fixed
----------

* Javascript URLs that used URL escaping were not removed by the HTML cleaner. Security problem found by Omar Eissa. (CVE-2018-19787)
```

### 4.2.4
```
==================

Features added
--------------

* GH259: Allow using ``pkg-config`` for build configuration. Patch by Patrick Griffis.

Bugs fixed
----------

* LP1773749, GH268: Crash when moving an element to another document with ``Element.insert()``. Patch by Alexander Weggerle.
```

### 4.2.3
```
==================

Bugs fixed
----------

* Reverted GH265: lxml links against zlib as a shared library again.
```

### 4.2.2
```
==================

Bugs fixed
----------

* GH266: Fix sporadic crash during GC when parse-time schema validation is used and the parser participates in a reference cycle. Original patch by Julien Greard.
* GH265: lxml no longer links against zlib as a shared library, only on static builds. Patch by Nehal J Wani.
```

### 4.2.1
```
==================

Bugs fixed
----------

* LP1755825: ``iterwalk()`` failed to return the 'start' event for the initial element if a tag selector is used.
* LP1756314: Failure to import 4.2.0 into PyPy due to a missing library symbol.
* LP1727864, GH258: Add "-isysroot" linker option on MacOS as needed by XCode 9.
```

### 4.2.0
```
==================

Features added
--------------

* GH255: ``SelectElement.value`` returns more standard-compliant and browser-like defaults for non-multi-selects. If no option is selected, the value of the first option is returned (instead of None). If multiple options are selected, the value of the last one is returned (instead of that of the first one). If no options are present (not standard-compliant) ``SelectElement.value`` still returns ``None``.
* GH261: The ``HTMLParser()`` now supports the ``huge_tree`` option. Patch by stranac.

Bugs fixed
----------

* LP1551797: Some XSLT messages were not captured by the transform error log.
* LP1737825: Crash at shutdown after an interrupted iterparse run with XMLSchema validation.

Other changes
-------------
```

### 4.1.1
```
==================

* Rebuild with Cython 0.27.3 to improve support for Py3.7.
```

### 4.1.0
```
==================

Features added
--------------

* ElementPath supports text predicates for current node, like "[.='text']".
* ElementPath allows spaces in predicates.
* Custom Element classes and XPath functions can now be registered with a decorator rather than explicit dict assignments.
* Static Linux wheels are now built with link time optimisation (LTO) enabled. This should have a beneficial impact on the overall performance by providing a tighter compiler integration between lxml and libxml2/libxslt.

Bugs fixed
----------

* LP1722776: Requesting non-Element objects like comments from a document with ``PythonElementClassLookup`` could fail with a TypeError.
```

### 4.0.0
```
==================

Features added
--------------

* The ElementPath implementation is now compiled using Cython, which speeds up the ``.find*()`` methods quite significantly.
* The modules ``lxml.builder``, ``lxml.html.diff`` and ``lxml.html.clean`` are also compiled using Cython in order to speed them up.
* ``xmlfile()`` supports async coroutines using ``async with`` and ``await``.
* ``iterwalk()`` has a new method ``skip_subtree()`` that prevents walking into the descendants of the current element.
* ``RelaxNG.from_rnc_string()`` accepts a ``base_url`` argument to allow relative resource lookups.
* The XSLT result object has a new method ``.write_output(file)`` that serialises output data into a file according to the ``<xsl:output>`` configuration.

Bugs fixed
----------

* GH251: HTML comments were handled incorrectly by the soupparser. Patch by mozbugbox.
* LP1654544: The html5parser no longer passes the ``useChardet`` option if the input is a Unicode string, unless explicitly requested. When parsing files, the default is to enable it when a URL or file path is passed (because the file is then opened in binary mode), and to disable it when reading from a file(-like) object.

  Note: This is a backwards incompatible change of the default configuration. If your code parses byte strings/streams and depends on character detection, please pass the option ``guess_charset=True`` explicitly, which already worked in older lxml versions.
* LP1703810: ``etree.fromstring()`` failed to parse UTF-32 data with BOM.
* LP1526522: Some RelaxNG errors were not reported in the error log.
* LP1567526: Empty and plain text input raised a TypeError in soupparser.
* LP1710429: Uninitialised variable usage in HTML diff.
* LP1415643: The closing tags context manager in ``xmlfile()`` could continue to output end tags even after writing failed with an exception.
* LP1465357: ``xmlfile.write()`` now accepts and ignores None as input argument.
* Compilation under Py3.7-pre failed due to a modified function signature.

Other changes
-------------

* The main module source files were renamed from ``lxml.*.pyx`` to plain ``*.pyx`` (e.g. ``etree.pyx``) to simplify their handling in the build process. Care was taken to keep the old header files as fallbacks for code that compiles against the public C-API of lxml, but it might still be worth validating that third-party code does not notice this change.
```

### 3.8.0
```
==================

Features added
--------------

* ``ElementTree.write()`` has a new option ``doctype`` that writes out a doctype string before the serialisation, in the same way as ``tostring()``.
* GH220: ``xmlfile`` allows switching output methods at an element level. Patch by Burak Arslan.
* LP1595781, GH240: added a PyCapsule Python API and C-level API for passing externally generated libxml2 documents into lxml.
* GH244: error log entries have a new property ``path`` with an XPath expression (if known, None otherwise) that points to the tree element responsible for the error. Patch by Bob Kline.
* The namespace prefix mapping that can be used in ElementPath now injects a default namespace when passing a None prefix.

Bugs fixed
----------

* GH238: Character escapes were not hex-encoded in the ``xmlfile`` serialiser. Patch by matejcik.
* GH229: fix for externally created XML documents. Patch by Theodore Dubois.
* LP1665241, GH228: Form data handling in lxml.html no longer strips the option values specified in form attributes but only the text values. Patch by Ashish Kulkarni.
* LP1551797: revert previous fix for XSLT error logging as it breaks multi-threaded XSLT processing.
* LP1673355, GH233: ``fromstring()`` html5parser failed to parse byte strings.

Other changes
-------------

* The previously undocumented ``docstring`` option in ``ElementTree.write()`` produces a deprecation warning and will eventually be removed.
```

### 3.7.4
```
==================

Bugs fixed
----------

* LP1551797: revert previous fix for XSLT error logging as it breaks multi-threaded XSLT processing.
* LP1673355, GH233: ``fromstring()`` html5parser failed to parse byte strings.
```

### 3.7.3
```
==================

Bugs fixed
----------

* GH218 was ineffective in Python 3.
* GH222: ``lxml.html.submit_form()`` failed in Python 3. Patch by Jakub Wilk.
```

### 3.7.2
```
==================

* GH220: ``xmlfile`` allows switching output methods at an element level. Patch by Burak Arslan.

Bugs fixed
----------

* Work around installation problems in recent Python 2.7 versions due to FTP download failures.
* GH219: ``xmlfile.element()`` was not properly quoting attribute values. Patch by Burak Arslan.
* GH218: ``xmlfile.element()`` was not properly escaping text content of script/style tags. Patch by Burak Arslan.
```

### 3.7.1
```
==================

* No source changes, issued only to solve problems with the binary packages released for 3.7.0.
```

### 3.7.0
```
==================

Features added
--------------

* GH217: ``XMLSyntaxError`` now behaves more like its ``SyntaxError`` baseclass. Patch by Philipp A.
* GH216: ``HTMLParser()`` now supports the same ``collect_ids`` parameter as ``XMLParser()``. Patch by Burak Arslan.
* GH210: Allow specifying a serialisation method in ``xmlfile.write()``. Patch by Burak Arslan.
* GH203: New option ``default_doctype`` in ``HTMLParser`` that allows disabling the automatic doctype creation. Patch by Shadab Zafar.
* GH201: Calling the method ``.set('attrname')`` without value argument (or ``None``) on HTML elements creates an attribute without value that serialises like ``<div attrname></div>``. Patch by Daniel Holth.
* GH197: Ignore form input fields in ``form_values()`` when they are marked as ``disabled`` in HTML. Patch by Kristian Klemon.

Bugs fixed
----------

* GH206: File name and line number were missing from XSLT error messages. Patch by Marcus Brinkmann.

Other changes
-------------

* Log entries no longer allow anything but plain string objects as message text and file name.
* ``zlib`` is included in the list of statically built libraries.
```

### 3.6.4
```
==================

* GH204, LP1614693: build fix for MacOS-X.
```

### 3.6.3
```
==================

* LP1614603: change linker flags to build multi-linux wheels
```

### 3.6.2
```
==================

* LP1614603: release without source changes to provide cleanly built Linux wheels
```

### 3.6.1
```
==================

Features added
--------------

* GH180: Separate option ``inline_style`` for Cleaner that only removes ``style`` attributes instead of all styles. Patch by Christian Pedersen.
* GH196: Windows build support for Python 3.5. Contribution by Maximilian Hils.

Bugs fixed
----------

* GH199: Exclude ``file`` fields from ``FormElement.form_values`` (as browsers do). Patch by Tomas Divis.
* GH198, LP1568167: Try to provide base URL from ``Resolver.resolve_string()``. Patch by Michael van Tellingen.
* GH191: More accurate float serialisation in ``objectify.FloatElement``. Patch by Holger Joukl.
* LP1551797: Repair XSLT error logging. Patch by Marcus Brinkmann.
```

### 3.6.0
```
==================

Features added
--------------

* GH187: Now supports (only) version 5.x and later of PyPy. Patch by Armin Rigo.
* GH181: Direct support for ``.rnc`` files in `RelaxNG()` if ``rnc2rng`` is installed. Patch by Dirkjan Ochtman.

Bugs fixed
----------

* GH189: Static builds honour FTP proxy configurations when downloading the external libs. Patch by Youhei Sakurai.
* GH186: Soupparser failed to process entities in Python 3.x. Patch by Duncan Morris.
* GH185: Rare encoding related ``TypeError`` on import was fixed. Patch by Petr Demin.
```

### 3.5.0
```
==================

Bugs fixed
----------

* Unicode string results failed XPath queries in PyPy.
* LP1497051: HTML target parser failed to terminate on exceptions and continued parsing instead.
* Deprecated API usage in doctestcompare.
```

### 3.5.0b1
```
====================

Features added
--------------

* ``cleanup_namespaces()`` accepts a new argument ``keep_ns_prefixes`` that does not remove definitions of the provided prefix-namespace mapping from the tree.
* ``cleanup_namespaces()`` accepts a new argument ``top_nsmap`` that moves definitions of the provided prefix-namespace mapping to the top of the tree.
* LP1490451: ``Element`` objects gained a ``cssselect()`` method as known from ``lxml.html``. Patch by Simon Sapin.
* API functions and methods behave and look more like Python functions, which allows introspection on them etc. One side effect to be aware of is that the functions now bind as methods when assigned to a class variable. A quick fix is to wrap them in ``staticmethod()`` (as for normal Python functions).
* ISO-Schematron support gained an option ``error_finder`` that allows passing a filter function for picking validation errors from reports.
* LP1243600: Elements in ``lxml.html`` gained a ``classes`` property that provides a set-like interface to the ``class`` attribute. Original patch by masklinn.
* LP1341964: The soupparser now handles DOCTYPE declarations, comments and processing instructions outside of the root element. Patch by Olli Pottonen.
* LP1421512: The ``docinfo`` of a tree was made editable to allow setting and removing the public ID and system ID of the DOCTYPE. Patch by Olli Pottonen.
* LP1442427: More work-arounds for quirks and bugs in pypy and pypy3.
* ``lxml.html.soupparser`` now uses BeautifulSoup version 4 instead of version 3 if available.

Bugs fixed
----------

* Memory errors that occur during tree adaptations (e.g. moving subtrees to foreign documents) could leave the tree in a crash prone state.
* Calling ``process_children()`` in an XSLT extension element without an ``output_parent`` argument failed with a ``TypeError``. Fix by Jens Tröger.
* GH162: Image data in HTML ``data`` URLs is considered safe and no longer removed by ``lxml.html.clean`` JavaScript cleaner.
* GH166: Static build could link libraries in wrong order.
* GH172: Rely a bit more on libxml2 for encoding detection rather than rolling our own in some cases. Patch by Olli Pottonen.
* GH159: Validity checks for names and string content were tightened to detect the use of illegal characters early. Patch by Olli Pottonen.
* LP1421921: Comments/PIs before the DOCTYPE declaration were not serialised. Patch by Olli Pottonen.
* LP659367: Some HTML DOCTYPE declarations were not serialised. Patch by Olli Pottonen.
* LP1238503: lxml.doctestcompare is now consistent with stdlib's doctest in how it uses ``+`` and ``-`` to refer to unexpected and missing output.
* Empty prefixes are explicitly rejected when a namespace mapping is used with ElementPath to avoid hiding bugs in user code.
* Several problems with PyPy were fixed by switching to Cython 0.23.
```

### 3.4.4
```
==================

Bugs fixed
----------

* An ElementTree compatibility test added in lxml 3.4.3 that failed in Python 3.4+ was removed again.
```

### 3.4.3
```
==================

Bugs fixed
----------

* Expression cache in ElementPath was ignored. Fix by Changaco.
* LP1426868: Passing a default namespace and a prefixed namespace mapping as nsmap into ``xmlfile.element()`` raised a ``TypeError``.
* LP1421927: DOCTYPE system URLs were incorrectly quoted when containing double quotes. Patch by Olli Pottonen.
* LP1419354: meta-redirect URLs were incorrectly processed by ``iterlinks()`` if preceded by whitespace.
```
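
Two of the features listed above are easy to try after upgrading. The following is a minimal sketch, assuming lxml 4.1 or later is installed. It exercises the ElementPath text predicate from 4.1.0 and `iterwalk().skip_subtree()` from 4.0.0. The sample XML and the variable names are hypothetical and serve only as illustration.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
root = etree.fromstring(
    "<root><a>text</a><b><c>nested</c></b><a>other</a></root>"
)

# 4.1.0: ElementPath text predicate on the current node.
# Selects only those <a> children whose text is exactly "text".
exact = root.findall("a[.='text']")

# 4.0.0: skip_subtree() stops iterwalk() from descending into the
# children of the element that was just reported with a 'start' event.
walker = etree.iterwalk(root, events=("start", "end"))
visited = []
for event, elem in walker:
    if event != "start":
        continue
    visited.append(elem.tag)
    if elem.tag == "b":
        walker.skip_subtree()  # do not walk into <c>

print([elem.text for elem in exact])
print(visited)
```

Neither call exists in the previously pinned 3.4.2: the predicate syntax was added in 4.1.0 and `skip_subtree()` in 4.0.0, so code like this only makes sense once the upgrade is merged.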
Links
- PyPI: https://pypi.org/project/lxml
- Changelog: https://pyup.io/changelogs/lxml/
- Homepage: http://lxml.de/