-
This should be discussed with the use of `make`. Pointed out by @alba-kth.
-
Hello,
Sometimes, documents that contain images are not recognized as text documents. For this issue, paperqa recommends disabling document checking.
`Could not read Auchy les mines (62) - Hai…
-
The use of `monospace` formatting, e.g. for mentions of `movement` in the documentation, is inconsistent. When/where is monospace formatting required?
See also #266
-
This is inconsistent with other formats, like docx, where docwire includes such text. I have attached an example .docx and .rtf file, which contain the same content, but where docwire returns empty te…
-
When implementing #977, also rename the content files to `.txt`. The current extension, `.nwd`, was mostly there to obscure the format, but it doesn't really do that (see #1960) and `.txt` is technical…
-
### Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to [Discussions](https://github.com/langgenius/dify/discussions/categories/general).
- [X] I hav…
-
Sometimes related documents will have the same text within them. For instance, two documents in a cluster may have the same text describing a header field or a code snippet. It would be helpful if the…
-
We currently save the binary blob to a sqlite database, which is a bit unwieldy. In most cases, we want to eventually save to PDF files anyhow. Essentially, we'll need to modify [`save_fulltext_from_d…
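
A rough sketch of what the export step could look like, assuming the blobs live in a single sqlite table. The table name `fulltext_blobs`, the columns `doc_id` and `content`, and the function `export_blobs_to_pdf` are all made up for illustration and are not taken from the existing code:

```python
import sqlite3
from pathlib import Path


def export_blobs_to_pdf(conn: sqlite3.Connection, out_dir: Path) -> list[Path]:
    """Write each stored blob to its own .pdf file and return the written paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Hypothetical schema: fulltext_blobs(doc_id INTEGER, content BLOB).
    rows = conn.execute("SELECT doc_id, content FROM fulltext_blobs ORDER BY doc_id")
    for doc_id, content in rows:
        if content is None:
            continue  # no blob was stored for this document
        path = out_dir / f"{doc_id}.pdf"
        path.write_bytes(content)
        written.append(path)
    return written
```

Writing one file per row keeps the database out of the read path entirely, at the cost of needing a naming scheme for the files (here simply the row id).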
-
I get the following error when running NER: `TypeError: 'NoneType' object is not subscriptable`
After debugging the error, I found out that it is trying to access the document's `text` attribute, b…
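
For reference, this is the kind of guard that would avoid the error, assuming the cause is a document whose `text` is `None` (the exact cause is not confirmed here). `Document` and `documents_with_text` are hypothetical stand-ins, not the library's own types:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Document:
    # Hypothetical stand-in for the library's document type.
    doc_id: str
    text: Optional[str] = None


def documents_with_text(docs: Iterable[Document]) -> list[Document]:
    """Keep only documents whose text is present, so later indexing is safe."""
    return [doc for doc in docs if doc.text is not None]


docs = [Document("a", "Marie Curie was born in Warsaw."), Document("b", None)]
safe_docs = documents_with_text(docs)
# Slicing or indexing doc.text is now safe for every document in safe_docs;
# doing the same on Document "b" would raise the TypeError quoted above.
```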
-
The database schema generated through the migrations does not support FTS. The indexes must be created manually:
```sql
ALTER TABLE judge_problem ADD FULLTEXT(code, name, description);
```
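
If the statement needs to be generated rather than typed by hand (for example to run it from a migration script), a small helper along these lines might do. `fulltext_index_sql` is a hypothetical name, and the identifier check is a conservative assumption, not something the project defines:

```python
import re

# Accept only plain identifiers, since the names are interpolated into SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def fulltext_index_sql(table: str, columns: list[str]) -> str:
    """Build an ALTER TABLE ... ADD FULLTEXT statement for the given columns."""
    if not columns:
        raise ValueError("at least one column is required")
    for name in [table, *columns]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"ALTER TABLE {table} ADD FULLTEXT({', '.join(columns)});"


statement = fulltext_index_sql("judge_problem", ["code", "name", "description"])
```

In a Django project, a string like this could be passed to `migrations.RunSQL` so the index is created together with the rest of the schema; whether that fits this project's migration setup is an assumption.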