apache / incubator-stormcrawler

A scalable, mature and versatile web crawler based on Apache Storm
https://stormcrawler.apache.org/
Apache License 2.0
891 stars, 260 forks

Bump org.netpreserve:jwarc from 0.30.0 to 0.31.0 #1416

Open dependabot[bot] opened 1 day ago

dependabot[bot] commented 1 day ago

Bumps org.netpreserve:jwarc from 0.30.0 to 0.31.0.

Release notes

Sourced from org.netpreserve:jwarc's releases.

v0.31.0

New features

  • Added optional support for brotli content encoding #88
  • Added HttpMessage.bodyDecoded() #88
  • WarcTool: Added dedupe subcommand
  • DedupeTool: Added --verbose option and silenced default logging

Bug fixes

  • GunzipChannel: Fixed incorrect record length calculation when gzip footer aligns with the end of the buffer
  • ValidateTool: Fixed digest validation #87
  • DedupeTool: Used matchType=exact to properly handle CDX queries for URLs ending with *
  • DedupeTool: Fixed record copying when transferTo copies fewer bytes than requested
  • DedupeTool: Prevented appending of an empty gzip member when no records were deduplicated
  • DedupeTool: Fixed exception when input files are in the current working directory
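
The `transferTo` fix listed above concerns a general property of Java NIO: `FileChannel.transferTo` returns the number of bytes it actually transferred, which may be fewer than requested. The following is a minimal sketch of the usual remedy, a loop that repeats the call until the requested range has been copied. It is not jwarc's actual code; the class name `TransferFully` and the method `transferFully` are hypothetical, and the sketch assumes a blocking target channel, so a zero-byte transfer is treated as reaching the end of the source.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper for illustration; not part of jwarc.
public class TransferFully {

    /**
     * Copies exactly {@code count} bytes from {@code src}, starting at
     * {@code position}, into {@code dst}, calling transferTo repeatedly
     * because a single call may copy fewer bytes than requested.
     */
    public static void transferFully(FileChannel src, long position, long count,
                                     WritableByteChannel dst) throws IOException {
        long remaining = count;
        while (remaining > 0) {
            long n = src.transferTo(position, remaining, dst);
            if (n <= 0) {
                // Assumption: dst is blocking, so no progress means the
                // position is at or past the end of the source file.
                throw new EOFException("source ended with " + remaining + " bytes left to copy");
            }
            position += n;   // advance past the bytes already copied
            remaining -= n;  // request only what is still missing
        }
    }
}
```

A caller copying one record out of a larger file would pass the record's offset and length, for example `TransferFully.transferFully(in, offset, length, out)`. Raising an exception on a short source, rather than returning silently, keeps a truncated copy from going unnoticed.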

Commits

  • 14b80be Release 0.31.0
  • 9652eaa Merge branch 'gzip-position-fix'
  • e6d9e6c Merge pull request #88 from sebastian-nagel/utility-payload-decoder
  • f26a4d6 Merge pull request #87 from sebastian-nagel/validate-tool-does-not-validate-d...
  • 5c6171f Add optional support for brotli content encoding
  • b216255 Utility method to decode payload using the HTTP Content-Encoding header
  • 6444c81 ValidateTool: fix digest validation
  • f3abd7a DedupeTool: Create WarcWriter on demand
  • af5fb49 DedupeTool: Handle transferTo transferring fewer bytes than requested
  • 73d1408 DedupeTool: Handle paths in current directory
  • Additional commits viewable in compare view


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)