tika Search Results - Githubissues

lucasdengcn/image-mirror #48

tika

apache/tika:latest apache/tika:latest-full

lucasdengcn updated 1 month ago

yobix-ai/extractous #33

Failed Extraction - cmap font missing

Hello! While trying to extract content from a PDF, I got the following error with very little information: - ParseError("Parse error occurred: Unable to extract PDF content"). After modifying th…

s4zuk3 updated 1 day ago

truenas/apps #837

Paperless-ngx - Add Tika and Gotenberg

Hi, I would like to see the Paperless-ngx app run two additional containers, Tika and Gotenberg ([docs](https://docs.paperless-ngx.com/configuration/#optional-services)). I'm willing…

tina-junold updated 21 hours ago

mhughes2k/moodle-search_solrrag #2

Standalone Tika server config

If I select a standalone Tika server I don't see where to enter the config details on the settings. Should that be an option?

brianlmerritt updated 1 month ago

spring-projects/spring-ai #1476

Tika Document Reader issue (OpenAI and Mistral)

**Bug description** Tika Document Reader dependency causing response type exception. **Environment** Spring Boot version: 3.3.4 Spring AI Version: 1.0.0-M2 Java Version: OpenJDK 22 **Steps t…

kursatufukcoskun updated 4 days ago

yobix-ai/extractous #35

Support for Extracting PDF Content as XML

Hi, I’d like to use Extractous for my document processing tasks. I often need to extract PDF content as XML to retain structural information, such as page boundaries. This is a feature supported by Ap…

coroluca updated 1 day ago

Norconex/importer #121

Update dependency on Tika 1.27 to Tika 2.x

I mistakenly posted an issue on Collector about this problem; turns out that Collector is pulling in Importer as a transitive dependency which in turn pulls Tika 1.27; My application relies on Tik…

Dhanvanthri updated 1 month ago

ICIJ/datashare #1591

not indexing any PDF files (PDFStreamEngine stream of error)

**Describe the bug** added a trove of PDF file ... launch indexing ... get only Error writing: org.apache.tika.sax.TaggedSAXException: Error writing: org.xml.sax.SAXException: Error writing: ja…

jafooool updated 2 weeks ago

neuml/txtai #814

Investigate integration with Docling

[Docling](https://github.com/DS4SD/docling) looks like a promising text extraction library that could possibly augment or replace Apache Tika. **Update**: Docling added 3.9 support, this is a go! …

davidmezzetti updated 5 days ago

BenoitAnastay/paperless-home-assistant-addon #114

Tika Gotenberg

Hello, and thank you for your work. How can I use Tika and Gotenberg? I activate them in the config. Or is that not active? I have at least activated this in the config file, but I am not sure. b…

Sineiko updated 6 months ago

1000+ results for tika
