source-nerd / twitter-scraper

Python-based Twitter scraper
MIT License

Update soupsieve to 2.3 #177

Closed: pyup-bot closed this 2 years ago

pyup-bot commented 3 years ago

This PR updates soupsieve from 1.9 to 2.3.

Changelog

### 2.3
```
- **NEW**: Officially support Python 3.10.
- **NEW**: Add static typing.
- **NEW**: `:has()`, `:is()`, and `:where()` now use a forgiving selector list. While not as forgiving as CSS might be, it will forgive such things as empty sets and empty slots due to multiple consecutive commas, leading commas, or trailing commas. Essentially, these pseudo-classes will match all non-empty selectors and ignore empty ones. As the scraping environment is different than a browser environment, it was chosen not to aggressively forgive bad syntax and invalid features to ensure the user is alerted that their program may not perform as expected.
- **NEW**: Add support to output a pretty print format of a compiled `SelectorList` for debug purposes.
- **FIX**: Some small corner cases discovered with static typing.
```

### 2.2.1
```
- **FIX**: Fix an issue with namespaces when one of the keys is `self`.
```

### 2.2
```
- **NEW**: `:link` and `:any-link` no longer include `<link>` due to a change in the level 4 selector specification. This actually yields more sane results.
- **FIX**: BeautifulSoup, when using `find`, is quite forgiving of odd types that a user may place in an element's attribute value. Soup Sieve will also now be more forgiving and attempt to match these unexpected values in a sane manner by normalizing them before compare. (212)
```

### 2.1.0
```
- **NEW**: Officially support Python 3.9.
- **NEW**: Drop official support for Python 3.5.
- **NEW**: In order to avoid conflicts with future CSS specification changes, non-standard pseudo classes will now start with the `:-soup-` prefix. As a consequence, `:contains()` will now be known as `:-soup-contains()`, though for a time the deprecated form of `:contains()` will still be allowed with a warning that users should migrate over to `:-soup-contains()`.
- **NEW**: Added new non-standard pseudo class `:-soup-contains-own()` which operates similar to `:-soup-contains()` except that it only looks at text nodes directly associated with the currently scoped element and not its descendants.
- **FIX**: Import `bs4` globally instead of in local functions as it appears there are no adverse effects due to circular imports as `bs4` does not immediately reference `soupsieve` functions and `soupsieve` does not immediately reference `bs4` functions. This should give a performance boost to functions that had previously included `bs4` locally.
```

### 2.0.1
```
- **FIX**: Remove unused code.
```

### 2.0.0
```
- **NEW**: `SelectorSyntaxError` is derived from `Exception` not `SyntaxError`.
- **NEW**: Remove deprecated `comments` and `icomments` from the API.
- **NEW**: Drop support for EOL Python versions (Python 2 and Python < 3.5).
- **FIX**: Corner case with splitting namespace and tag name that have an escaped `|`.
```

### 1.9.6
```
**Note**: Last version for Python 2.7

- **FIX**: Prune dead code.
- **FIX**: Corner case with splitting namespace and tag name that have an escaped `|`.
```

### 1.9.5
```
- **FIX**: `:placeholder-shown` should not match if the element has content that overrides the placeholder.
```

### 1.9.4
```
- **FIX**: `:checked` rule was too strict with `option` elements. The specification for `:checked` does not require an `option` element to be under a `select` element.
- **FIX**: Fix level 4 `:lang()` wildcard match handling with singletons. Implicit wildcard matching should not match any singleton. Explicit wildcard matching (`*` in the language range: `*-US`) is allowed to match singletons.
```

### 1.9.3
```
- **FIX**: `[attr!=value]` pattern was mistakenly using `:not([attr|=value])` logic instead of `:not([attr=value])`.
- **FIX**: Remove undocumented `_QUIRKS` mode flag. Beautiful Soup was meant to use it to help with transition to Soup Sieve, but never released with it. Help with transition at this point is no longer needed.
```

### 1.9.2
```
- **FIX**: Shortcut last descendant calculation if possible for performance.
- **FIX**: Fix issue where `Doctype` strings can be mistaken for a normal text node in some cases.
- **FIX**: A top level tag is not a `:root` tag if it has sibling text nodes or tag nodes. This is an issue that mostly manifests when using `html.parser` as the parser will allow multiple root nodes.
```

### 1.9.1
```
- **FIX**: `:root`, `:contains()`, `:default`, `:indeterminate`, `:lang()`, and `:dir()` will properly account for HTML `iframe` elements in their logic when selecting or matching an element. Their logic will be restricted to the document for which the element under consideration applies.
- **FIX**: HTML pseudo-classes will check that all key elements checked are in the XHTML namespace (HTML parsers that do not provide namespaces will assume the XHTML namespace).
- **FIX**: Ensure that all pseudo-class names are case insensitive and allow CSS escapes.
```
Links
- PyPI: https://pypi.org/project/soupsieve
- Changelog: https://pyup.io/changelogs/soupsieve/
- Repo: https://github.com/facelessuser/soupsieve
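
The 2.1.0 entry above deprecates `:contains()` in favor of `:-soup-contains()`, so selectors in a scraper may need updating after this upgrade. A minimal sketch of one way to rewrite existing selector strings follows. `migrate_contains` and the `div.tweet` selector are hypothetical names used for illustration; neither comes from soupsieve or from this repository. The helper uses a plain regular expression and does not parse CSS, so it would also rewrite a literal `:contains(` inside a quoted attribute value.

```python
import re

# Naive pattern: the deprecated pseudo-class name followed by "(".
# Pseudo-class names are case-insensitive, so the match is as well.
_DEPRECATED_CONTAINS = re.compile(r":contains\(", re.IGNORECASE)


def migrate_contains(selector: str) -> str:
    """Rewrite deprecated `:contains()` to `:-soup-contains()`.

    Hypothetical helper, not part of soupsieve. Selectors that already
    use the `:-soup-` prefix are left unchanged, because the text
    ":contains(" never appears inside ":-soup-contains(".
    """
    return _DEPRECATED_CONTAINS.sub(":-soup-contains(", selector)


# Assumed example selector; the real scraper's selectors may differ.
old_selector = 'div.tweet p:contains("python")'
new_selector = migrate_contains(old_selector)
print(new_selector)
```

The rewritten string can then be passed to Beautiful Soup's `select()` as before. Because the deprecated form still works with a warning for a time, a rewrite like this is optional in the short term.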
pyup-bot commented 2 years ago

Closing this in favor of #179