Move OverrideDetector's functionality to the CompositeDetector (TIKA-3904).
The FileCommandDetector has been refactored to have the same
behavior as the Siegfried detector; see setUseMime in the javadoc (TIKA-3902).
Fix bug in OpenSearch emitter that prevented upserts on
documents with embedded files (TIKA-3882).
Extract PDF actions and triggers into the file's metadata (TIKA-3887).
Add a tika-async-cli module (TIKA-3885).
Fetch keys sent via headers to tika server are now URL decoded (TIKA-3864).
Release 2.5.0 - 09/30/2022
Improved extraction of PDF subset info for PDF/UA, PDF/VT, and PDF/X.
NOTE: we no longer append PDF/A information, e.g. 'version="A-1b"'
to the 'dc:format'. Users must now get that information from the
'pdfa:PDFVersion' key or from 'pdfaid:conformance'
and 'pdfaid:part' (TIKA-3844).
Avoid infinite loop in bookmark extraction from PDFs (TIKA-3832).
Upgraded to slf4j 2.0.1 (TIKA-3842).
Added upsert option for the OpenSearch emitter (TIKA-3855).
Extract PDF signature information at the document level
into the metadata (TIKA-3852).
Enable configuration of digests via AutoDetectParserConfig (TIKA-3853).
Use commons-io byte array streams via PJ Fanning (TIKA-3843).
Upgrade to PDFBox 2.0.27 (TIKA-3866).
Upgrade to JempBox 1.8.17 (TIKA-3856).
Add extraction of ODF version from ODF files (TIKA-3840).
tika-parser-html-commons (BoilerPipeHandler) is no longer
a dependency of tika-parser-html-module. tika-app and tika-server-standard
have added a dependency on tika-parser-html-commons. However,
users who are managing custom dependencies and who want the BoilerPipeHandler
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps tika-core from 2.4.1 to 2.6.0.
Changelog
Sourced from tika-core's changelog.
... (truncated)
Commits
- 41319f3 [maven-release-plugin] prepare release 2.6.0-rc1
- aec8029 Binary incompatibility with updated maven release plugin, try to update scm a...
- d9040f4 Merge remote-tracking branch 'origin/main'
- 89f0821 add release date, fix rat problems, update 2.5.1 -> 2.6.0 for next release cycle
- 9911dd9 Merge pull request #784 from apache/dependabot/maven/org.apache.maven.plugins...
- 71d6aca Merge pull request #783 from apache/dependabot/maven/aws.version-1.12.334
- f6d80df Bump maven-release-plugin from 3.0.0-M6 to 3.0.0-M7
- 02bd6f7 Bump aws.version from 1.12.333 to 1.12.334
- dfc99d6 Merge pull request #782 from apache/dependabot/maven/aws.version-1.12.333
- 1a0d6ed Bump aws.version from 1.12.332 to 1.12.333