-
I tried both the local environment and Docker, and both show the following issue:
Response Content: {"total": 16, "offset": 0, "next": 10, "data": [{"paperId": "575506ce02dff846a9ec66a3c0cf69085228e435", "title": "Text D…
-
Thank you for sharing your work. I have a question regarding the use of the WebNLG (v3.0) dataset in your work. In your paper, it is mentioned that the test data contains **1,165** text-triplet pairs.…
-
### Your current environment
```text
I'm running some tests to see if my use case would benefit from tokenizing my context before sending it in as I want to re-use my same context possibly hundred…
-
`Docling version: 2.3.1
Docling Core version: 2.3.1
Docling IBM Models version: 2.0.3
Docling Parse version: 2.0.2
`
Fine software: I have compared it with many alternatives. (BTW, for some new…
-
## Description
I followed the README installation instructions and encountered a ValueError when processing repositories:
```bash
(.venv) ➜ Chat-with-Github-Repo (main) python3 src/main.py process …
-
We now have generated data for several data stages for which we can't compute manifests yet.
Hence, this issue aims at listing all the data stages for which new aggregating functions should be implem…
-
In two places — the cache update observer (#566) and in sync — we retrieve values from a committed transaction.
Those values include _references_ into the fulltext values table. These values have `…
-
### What happened + What you expected to happen
I want to use Ray for large-scale text data cleaning tasks, and extracted 5 million records from the open-source RedPajama GitHub dataset for testing, w…
-
Running async in a web worker is a huge win for perceived performance, but there may be cases where running synchronously in the main thread would be preferable. Since [worker modules](https://github.…
-
This code, "lyndon-factors", is the first I know of that tries to manipulate alphabets to change the number of factors.
I know this is aimed at biological sequences, but my application is a text corpus and I …