With user configuration, the PDFParser can now throw an EncryptedDocumentException
for Microsoft IRM PDF containers with encrypted payloads. Separately,
the PDFParser now throws an EncryptedDocumentException instead of an IOException
if the security handler cannot be found (TIKA-4082).
Changed default decompressConcatenated to true in CompressorParser.
Users may revert to legacy behavior via tika-config.xml (TIKA-4048).
Allow users to modify the attachment limit size in the /unpack resource (TIKA-4039).
Fixed write limit bug in RecursiveParserWrapper (TIKA-4055).
Add mime detection for many files with thanks to Gregory Lepore (TIKA-3992).
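
To make the decompressConcatenated note above more concrete, here is a minimal sketch of what a "concatenated" compressed input looks like: two complete gzip members appended back to back. It deliberately uses only the JDK's `java.util.zip` classes, not Tika's CompressorParser or its configuration, so it illustrates the data shape rather than Tika's implementation. The class and method names (`ConcatenatedGzipDemo`, `gzip`, `concat`, `gunzip`) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of Tika.
public class ConcatenatedGzipDemo {

    // Compress one string into a single, complete gzip member.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Append two gzip members back to back, as `cat a.gz b.gz` would.
    static byte[] concat(byte[] first, byte[] second) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        joined.write(first, 0, first.length);
        joined.write(second, 0, second.length);
        return joined.toByteArray();
    }

    // Decompress everything the stream yields. The JDK's GZIPInputStream
    // continues into the next member when more bytes follow a trailer.
    static String gunzip(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] joined = concat(gzip("first member\n"), gzip("second member\n"));
        System.out.print(gunzip(joined));
    }
}
```

A reader that stops at the first member would return only the first line for the same input, which is the kind of difference the new default is meant to address.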
Release 2.8.0 - 5/11/2023
Enable counting and/or parsing of incremental updates in PDFs. This
is an experimental feature and may change in later releases (TIKA-4017).
Fixed bug that prevented the loading of CompositeExternalParser in tika-app and
tika-server-standard. This parser will call exiftool and ffmpeg if those are installed, as was
the behavior in Tika 1.x. Exclude org.apache.tika.parser.external.CompositeExternalParser
if you do not want this behavior (TIKA-4022).
Removed the shading of tika-parsers-standard-module (TIKA-4038).
Enable optional extraction of file system metadata in FileSystemFetcher (TIKA-4035).
Allow pretty printing in FileSystemEmitter (TIKA-4034).
Add detection and a new mime type, "application/illustrator+ps", for older
PostScript-based Adobe Illustrator files (TIKA-3971).
Add magic detection for Canon raw file types: crw, cr2 and cr3 (TIKA-3991).
Add detection for ONIX message files (TIKA-4011).
Add detection and a parser for ActiveMime files (TIKA-3987).
Add extraction of rendition layout value and version from Epub (TIKA-4013).
Improve embedded file extraction from PDFs (TIKA-4012).
Improve metadata extraction from WARCs (TIKA-4018).
Update to PDFBox 2.0.28 (TIKA-4016).
Users may now avoid the ZeroByteFileException via a
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps org.apache.tika:tika-core from 2.5.0 to 2.8.0.
Changelog
Sourced from org.apache.tika:tika-core's changelog.
... (truncated)
Commits
- `656971f` [maven-release-plugin] prepare release 2.8.0-rc2
- `fd27103` Update CHANGES.txt and rollback dev version for 2.8.0-rc2
- `ef8c8ff` Remove shading of tika-parsers-standard-package (#1130)
- `6a93b54` Merge pull request #1127 from apache/dependabot/maven/test.containers.version...
- `93d824a` Merge pull request #1128 from apache/dependabot/maven/com.google.cloud-google...
- `49e5970` Bump google-cloud-storage from 2.22.1 to 2.22.2
- `4b6d797` Merge pull request #1129 from apache/dependabot/maven/aws.version-1.12.467
- `fab540d` Bump aws.version from 1.12.466 to 1.12.467
- `c12e825` Bump test.containers.version from 1.18.0 to 1.18.1
- `5323f9e` TIKA-4037 -- add detection for os2 bitmap arrays.