-
Is it not possible for `desync` to avoid modifying files that have no difference?
There are no blocks/chunks to update, yet the `mtime` is modified each time I run the `untar` command? (_`desync unt…
-
I am using the hi_res model locally and tried it both with and without chunking.
I also tried the chipper model via the API and faced similar issues.
**Major issues faced by us while …
-
```
Running 0.3.7 on a system that's extracted 90 discs correctly. Some SACDs will
read the table of content information, show the title information, start
extraction, but then lock up before showi…
-
[ ] I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug.
**Describe the bug**
ValueError: a cannot be empty unless no samples are taken
…
-
Hello!
I'd like to be able to predict tags when bulk uploading files.
A simple version could extract all text using PDFBox `PDFTextStripper` and match for only words without numbers (`[^\d\W]+`)…
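
A rough sketch of the word-matching step, written in Python for illustration only. It assumes the text has already been extracted (for example by `PDFTextStripper`); the function name `candidate_tags` and the parameters `top_n` and `min_len` are hypothetical and not part of any existing API. Note that `[^\d\W]+` matches runs of non-digit word characters, so a token such as `abc123def` yields `abc` and `def` rather than being skipped entirely.

```python
import re
from collections import Counter

# Runs of word characters that are not digits (letters and underscore).
WORD_RE = re.compile(r"[^\d\W]+")


def candidate_tags(text, top_n=5, min_len=3):
    """Return the most frequent digit-free words as tag candidates.

    Hypothetical helper: a real implementation would probably also
    filter stop words and match against the existing tag list.
    """
    words = [w.lower() for w in WORD_RE.findall(text) if len(w) >= min_len]
    return [word for word, _ in Counter(words).most_common(top_n)]


if __name__ == "__main__":
    sample = "Invoice 2023 invoice total 42 total invoice"
    print(candidate_tags(sample, top_n=2))
```

Ranking by frequency is only one possible heuristic; the candidates could just as well be intersected with tags that already exist in the system.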
-
Sentry Issue: [COURTLISTENER-6GQ](https://freelawproject.sentry.io/issues/4925662637/?referrer=github_integration)
```
****Error: Unable to extract content due to unknown extension, extracting text f…
-
_@mubaldino mentioned this in #18 but I thought I'd open a separate issue to have a more focused conversation on this particular feature_
Other tools, such as Tika, also extract metadata that is embe…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a…
-
How could a summary be generated from a chunk, for example with a prompt named `summary_gen.yaml`?
**_EDIT: While I had trouble getting my head around the code in `./original`, I started from …
-
Hello,
Thank you so much for continuing the development of camelot! I'm glad to see that camelot continues to be maintained.
I happen to also manage a pdf extraction library, [gmft](https://git…