tika Search Results - Githubissues

1000+ results for tika

ukwa/webarchive-discovery #94

Teach Tika to spot Scots Gaelic

The library used by Tika already spots Welsh, but needs to be [taught](https://github.com/optimaize/language-detector#how-you-can-help) to spot [Scots Gaelic (gd)](https://en.wikipedia.org/wiki/Scotti…

anjackson updated 1 year ago
5

shelfio/tika-text-extract #68

Update got to latest please

# npm audit report got

tomcon updated 5 months ago
2

yomurb/yomu #18

Speed up bulk processing with Tika

Yomu is great. I'm currently using it to process thousands of documents. Unfortunately, this is very slow, because, right now, Yomu starts the JVM for each document. This takes about 2 seconds per doc…

jeremybmerrill updated 8 years ago
7

dewarim/cinnamon #178

Versioning does not update tika metaset

# Versioning does not update tika metaset ## Symptoms * When content is modified and the object is checked in as the same version, the tika metaset gets updated correctly. * When a new version is c…

boris-horner updated 6 years ago
1

nlmatics/nlm-ingestor #82

Cannot connect to the Docker daemon at unix:///run/user/8636…

Hi, I am getting the following error when trying to run the docker file ``` bash (dong) [conda] [lh599@corfu:nlm-ingestor]$ docker pull ghcr.io/nlmatics/nlm-ingestor:latest Cannot connect to the Do…

Tizzzzy updated 3 months ago
1

lando/solr #3

How do i install Apache Tika?

Might be good to create a simple guide that installs apache tika in a solr service with build steps

pirog updated 2 years ago
2

dust-tt/dust #6899

Allow upload of doc/pptx

File upload currently only allows text of pdf, we could use our tika parser to enable other upload types. As conversion would be done on the server, this would require adding a simple entrypoint to c…

tdraier updated 2 months ago
1

openstate/open-raadsinformatie #178

Replace or fix Tika server

In `ocd_backend.utils.file_parser` we use the python version of Apache Tika as a fallback when the mimetype is not 'application/pdf'. We use `pdfparser.poppler` as first choice since it has a native b…

jurrian updated 4 years ago
5

KevM/tikaondotnet #62

Configuring Tesseract OCR for TikaOnDotNet

The hope here is to get TikaOnDotNet fully configured to access Tesseract OCR for text extraction from images. With Tika .93 support for Tesseract was added, and we are now in the midst of validating…

LeeBear35 updated 5 years ago
8

digital-preservation/droid #717

DoS on quines

On Tika we've gathered two quines with their creators' permissions. One is a zip file that when unzipped is exactly the same file; the other is a gz file with the same behavior. I can't imagine DRO…

tballison updated 2 years ago
1
