-
**Describe the bug**
Certain HTML files scraped from GCP docs, such as the following URLs, return empty elements or elements containing only newline characters when processed with `partition_html`.
**To Reproduce…
-
The wealth of import formats supported by unstructured.io would be a perfect addition to private-gpt.
Would that be feasible, and does it fit your plans?
Any hint welcome.
thx
r
-
Retention - Disposition is not working for unstructured data.
Applied Retention via Job Management (Table level) for Records containing **Blobs**. Generate purge job was successful. While running d…
-
![image](https://github.com/user-attachments/assets/4e2beedc-7fe4-4b86-a83b-c3406f8328a5)
Both code snippets were copied from README.md.
```python
from megaparse import MegaParse
from langchain_openai…
-
### Is your feature request related to a problem?
Hello all. First, thanks for this amazing library!
I'm working with unstructured data (points that are not in a grid), and I want to interpola…
-
The provided datasets have four variants, each serving a specific purpose, and contain a `text_description` as described below. E.g. gov:
1. **syntheticDocQA_government_reports_test** – **No text_des…
-
**Describe the bug**
User gets a `TesseractError` when processing a particular document.
**To Reproduce**
Code was an API call with a certain image-based document.
**Expected behavior**
Docum…
-
### Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to [Discussions](https://github.com/langgenius/dify/discussions/categories/general).
- [X] I have s…
-
I have logs that are mostly JSON, but some logs come from system calls that can't be structured. It would be nice if I could set a rule that allowed me to capture the whole log and put it for example …
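
A fallback rule of the kind requested here could look roughly like the Python sketch below. It is only an illustration of the idea, not tied to any particular log shipper: the function name `parse_log_line` and the `raw` key are hypothetical, and where the whole line should actually be stored is left open by the request.

```python
import json


def parse_log_line(line: str, fallback_key: str = "raw") -> dict:
    """Parse a log line as JSON; if that fails, keep the whole line.

    `fallback_key` is a hypothetical field name chosen for this sketch.
    """
    text = line.rstrip("\n")
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Unstructured line (e.g. output of a system call): capture it whole.
        return {fallback_key: text}
    if not isinstance(parsed, dict):
        # Valid JSON that is not an object (a bare number or string)
        # is treated as unstructured as well.
        return {fallback_key: text}
    return parsed
```

Structured lines pass through unchanged, while anything that is not a JSON object ends up as a single field, so downstream consumers always receive a dictionary.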
-
Hello,
Is there a way to install `unstructured[pdf]` in a lightweight form, just to use the "fast" strategy without all the other dependencies?
Thank you in advance for your support.