-
Parent: #2
----
- [x] Examine whether the data we have contains all words as text or only as scanned images.
- [x] Create an environment to process the data on LRZ, Google Colab, l…
-
Hi Mangle Team,
My team and I are trying to deploy Mangle for a client engagement and ran a vulnerability check against the dependencies and executable packages. We have found numerous c…
-
Hi, great repository. I'm facing issues while installing from requirements.txt on my device. Please help me fix this.
![image](https://user-images.githubusercontent.com/48956758/93747193-fb5d5f00-fc13…
-
Automated tests with Python unittest
-
**What Renovate type are you using?**
Renovate CLI via the renovate/renovate Docker image in GitLab CI
**Describe the bug**
We updated the Docker image of our Renovate bot from 14.x to 16.x yeste…
-
**Is your feature request related to a problem? Please describe.**
File format support is limited: only txt, docx, and pdf are supported at the moment.
**Describe the solution you'd like**
Support extracting text…
-
Running
```
git clone --recurse-submodules --remote-submodules https://github.com/opensemanticsearch/open-semantic-search.git
cd open-semantic-search
docker-compose build
```
on latest Debian fai…
-
**Question**
I have a few questions after checking out the Colab notebook:
i) If I have a set of documents (pdf, ppt, docx, xlsx), is there any way I can use Haystack to build a query system? If so, can you please…
-
Hi,
Does python-tika support extracting a PDF at the page level? I want to split a big PDF into separate pages and extract them separately. Could this be done using python-tika? In the n…
-
**Describe the bug**
I am following the tutorial here: https://paperless-ng.readthedocs.io/en/latest/setup.html#setup-bare-metal
(I have experience with Linux, Apache, and Python, but no prior to dja…