-
I have downloaded `fa_tokenizer.pt` manually from the URL `https://www.dropbox.com/s/bajpn68bp11o78s/fa_ewt_tokenizer.pt?dl=1`. It's 636k in size. Its md5 is:
```
2097a125c5f85b36d569857bd60d51b7 f…
-
To optimize pasting text into a `README.md` file in a GitHub environment using a "textual quantum recalculation" algorithm, we can use natural language processing tools…
-
Imagine if it allowed part and material names (`wheel`, `head`, `metal`, `wood`, etc.) as standalone labels, without requiring an object:
- if you can browse free labelling you will find some example…
-
I am using the following code:
```
doc = nlp(text)
for token in doc:
    if token.pos_ == 'PUNCT':
        # Remove every occurrence of this punctuation token's text from the string
        text = text.replace(token.text, '')
```
with the following raw text, read from a PDF…
-
I am trying to run
```
from deepsparse import TextGeneration
pipeline = TextGeneration(model="/mnt/d/mpt-7b-dolly_mpt_pretrain-pruned50_quantized/deployment")
prompt="""
Below is an instruct…
-
Machine Learning Roadmap for roadmap.sh:
1. Productivity and Learning Techniques:
- Read "Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones" by James Clear.
- Read "…
-
EDIT: It's clear from the feedback that I was very premature and wrong to say that Haystack isn't/can't be performant. Moreover, I appreciate the feedback that spaCy probably wouldn't move the needle …
-
Hi, where can I find documentation on using the pipelines for specific tasks? E.g., I'm interested in text preparation (cleaning, filtering, etc.) and alignment for parallel corpus cr…
-
**Is your feature request related to a problem? Please describe.**
When we do NER extraction, we use a databus in pub-sub mode. The motivation at the beginning of the project was to be able to run se…
-
### Is your feature request related to a problem? Please describe
# Problem Statements
Today, users can use the `bulk` API to ingest multiple documents in a single request. All documents from this…