-
I want to try to train GROBID to extract metadata from Arabic articles. I tried to generate training data as described in the documentation. However, I assume that, since the model had not previously seen a…
-
This proposes a new type of selector*, tentatively called a "reference", that serves as a way to uniquely identify a declaration block.
References are intended to enable a number of features and u…
-
The [Tootsie Pop model](http://smallcultfollowing.com/babysteps/blog/2016/05/27/the-tootsie-pop-model-for-unsafe-code/) leverages unsafe declarations to simultaneously permit aggressive optimization i…
-
**Is your feature request related to a problem? Please describe.**
Many new SOTA models for retrieval are trained with prefixes on the queries and documents, thus they expect these at inference as we…
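A minimal sketch of what such prefixing looks like in practice (the helper name and the exact prefix strings are assumptions; models like E5 use `"query: "` and `"passage: "`, but the required prefixes are model-specific):

```python
# Assumed, model-specific prefix strings (E5-style convention).
QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "

def with_prefix(texts, prefix):
    """Prepend the task-specific prefix to each text before embedding."""
    return [prefix + t for t in texts]

queries = with_prefix(["how to train a model"], QUERY_PREFIX)
passages = with_prefix(["Training requires labeled data."], PASSAGE_PREFIX)
print(queries[0])   # "query: how to train a model"
print(passages[0])  # "passage: Training requires labeled data."
```

The point of the feature request is presumably to have the serving layer apply these prefixes automatically at inference, so clients don't each have to reimplement this.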
-
```
I got this mail via the corpora list and it sounds interesting
---
From: Olga Uryupina
Subject: [Corpora-List] BART coreference resolver: v2.0 released
To: "corpora@uib.no"
Dear CorporaList me…
-
Is there a way to do so-called legacy completions?
- https://platform.openai.com/docs/guides/text-generation/completions-api
- https://platform.openai.com/docs/api-reference/completions/create
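Per the linked API reference, the legacy endpoint is `POST /v1/completions` and takes a plain `prompt` string instead of a `messages` list. A sketch of the request body (the model name is an assumption; in the official Python client the equivalent call is `client.completions.create(...)` rather than `client.chat.completions.create(...)`):

```python
import json

# Legacy Completions request body, per the API reference linked above.
# The model name is an assumption; only certain models support this endpoint.
payload = {
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "Say hello:",
    "max_tokens": 5,
}
body = json.dumps(payload)
# POST `body` to https://api.openai.com/v1/completions
print(body)
```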
``…
-
Explore capabilities of NLTK
-
Hi, I tried comparing the values from here with those from SentenceTransformers and they are not the same:
```
from sentence_transformers import SentenceTransformer
model2 = SentenceTransformer("…
-
https://aclanthology.org/2020.coling-main.498/
-
Hi,
I am using one of the "Sentence similarity" models e.g. 'distilbert-base-nli-stsb-quora-ranking'
Since my domain is specialized, I am sure it contains quite a few words unique to my use case.
How can I handle OOV words…
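One relevant point: DistilBERT-based models use a subword (WordPiece) vocabulary, so unseen words are usually split into known pieces rather than mapped to a single unknown token. A toy illustration of greedy longest-match subword splitting (a simplified WordPiece; the vocabulary here is an invented example, not the model's real one):

```python
# Toy vocabulary; "##" marks a continuation piece, as in WordPiece.
VOCAB = {"bio", "##med", "##ical"}

def wordpiece(word, vocab=VOCAB):
    """Greedy longest-match split of a word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            cand = word[start:end]
            if start > 0:
                cand = "##" + cand  # continuation pieces carry the ## prefix
            if cand in vocab:
                piece = cand
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # nothing matched: a genuine OOV
        pieces.append(piece)
        start = end
    return pieces

print(wordpiece("biomedical"))  # ['bio', '##med', '##ical']
```

So a domain-specific word is typically encoded as several generic subwords; the usual remedy when that representation is too coarse is fine-tuning on in-domain text rather than handling OOV tokens directly.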