-
What a shame: this "wrapper" was developed for single-account use (e.g., one laptop, one user), not for multiple accounts!
When used with multi-accounts, there's a permission issue (concurrency acce…
-
Related to a possible resolution to #143, we want to investigate whether the Solr 9 configuration in our sandbox environment needs additional optimization to work with a Hyrax application.
-
### Problem Description
The new AWS connector connects to S3, where people place standard data file types, e.g., log.json, table.csv, and old.xml files.
Our current support types are for programming l…
-
### What to do
When converting PDF documents to text with either Apache Tika or pdf2text, we have some functionality to split the documents into passages afterwards. It would be beneficial to have per pa…
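
As a rough illustration of the passage-splitting step mentioned above (not the project's actual implementation), here is a minimal sketch. It assumes passages in the converted text are separated by one or more blank lines; the function name `split_into_passages` is hypothetical.

```python
import re
from typing import List


def split_into_passages(text: str) -> List[str]:
    """Split converted plain text into passages.

    Assumption: a passage boundary is one or more blank lines.
    """
    # Normalize Windows line endings so the pattern below sees plain "\n".
    normalized = text.replace("\r\n", "\n")
    # A newline, optional whitespace, then another newline marks a boundary.
    parts = re.split(r"\n\s*\n", normalized)
    # Collapse whitespace inside each passage and drop empty pieces.
    return [" ".join(part.split()) for part in parts if part.strip()]


passages = split_into_passages(
    "First line\nstill first.\n\nSecond passage.\n\n\n  \nThird."
)
```

A real converter may need a different boundary rule (for example sentence counts or a sliding window); blank-line splitting is only the simplest case.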
-
### Background
Since the launch of OpenSearch, there has been some great progress in documenting the project. So far, the Open Distro documentation has been updated and moved to OpenSearch, and there…
-
### Description
Version 1.7.1 works fine. When I update to 1.8.0, everything works fine for 5 minutes, and then htop shows me that "python3 manage.py qcluster" is using more than 100% of CPU,…
-
### Problem Description
Right now we only extract a limited list of files. Let's use the ingest pipeline. Until Tika is available on edge (https://github.com/elastic/connectors-python/issues/167)
…
-
Hi guys, I think what you are doing is very interesting. I am currently struggling with data preprocessing (Tutorial 8).
When I open my own PDF file in function PDFToTextConverter, I get the following…
-
Perhaps this isn't a bug, per se, but there seem to be various dependencies that are pinned to outdated versions.
For example, throughout the codebase you seem to be using Tika `1.24.1`, which was…
-
**Describe the bug**
There seems to be an issue with multiprocessing in Python and Haystack.
If I import the multiprocessing library and **don't** import any haystack modules, I can run the fo…