mediacloud / metadata-lib

How Media Cloud approaches extracting metadata from online news stories
Apache License 2.0

Update trafilatura requirement from <1.7,>=1.4 to >=1.4,<1.8 #79

Closed: dependabot[bot] closed this 7 months ago

dependabot[bot] commented 9 months ago

Updates the requirements on trafilatura to permit the latest version.
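
To make the effect of the new bound concrete, here is a minimal sketch of which trafilatura versions each range admits. It is an illustration only: the helpers `parse` and `satisfies` are hypothetical names, not part of trafilatura or Dependabot, and the comparison is a simplified stand-in for full PEP 440 matching (it assumes plain dotted numeric versions with no pre-release or local suffixes).

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '1.6.4' into a comparable tuple.

    Hypothetical helper: pads to three components so '1.7' and '1.7.0' compare equal.
    """
    parts = [int(part) for part in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def satisfies(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper (inclusive lower, exclusive upper)."""
    return parse(lower) <= parse(version) < parse(upper)


# The two ranges from the PR title, written as (inclusive lower, exclusive upper).
OLD_RANGE = ("1.4", "1.7")  # >=1.4,<1.7
NEW_RANGE = ("1.4", "1.8")  # >=1.4,<1.8

if __name__ == "__main__":
    for candidate in ("1.3.0", "1.4.0", "1.6.4", "1.7.0", "1.8.0"):
        old_ok = satisfies(candidate, *OLD_RANGE)
        new_ok = satisfies(candidate, *NEW_RANGE)
        print(f"{candidate}: old range={old_ok}, new range={new_ok}")
```

Under these assumptions, the only version in the sample whose status changes is 1.7.0, which the old range rejects and the new range accepts.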

Release notes

Sourced from trafilatura's releases.

trafilatura-1.7.0

Extraction:

  • improved html2txt() function (#483)

Downloads:

  • add advanced fetch_response() function → pending deprecation for fetch_url(decode=False)

Maintenance:

Changelog

Sourced from trafilatura's changelog.

1.7.0

Extraction:

  • improved html2txt() function

Downloads:

  • add advanced fetch_response() function → pending deprecation for fetch_url(decode=False)

Maintenance:

1.6.4

Maintenance:

  • MacOS: fix setup, update htmldate and add tests (#460)
  • drop invalid XML element attributes with @vbarbaresi in #462
  • remove cyclic imports (#458)

Navigation:

  • introduce MAX_REDIRECTS config setting and fix urllib3 redirect handling by @vbarbaresi in #461
  • improve feed detection (#457)

Documentation:

1.6.3

Extraction:

Metadata:

  • more precise date extraction (see htmldate)
  • new htmldate extensive search parameter in config (#434)
  • changes in URLs: normalization, trackers removed (see courlan)

Navigation:

  • reviewed code for feeds (#443)
  • new config option: external URLs for feeds/sitemaps (#441)

Documentation:

1.6.2

... (truncated)

Commits


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

dependabot[bot] commented 7 months ago

Superseded by #84.