-
```
emojis = {"😍": 1, "😊": 2, "💪": 3, "🙇": 4, "🙏": 5, "🙆": 6, "😉": 7, "🎉": 8, "❗": 9, "💦": 10}
```
sample_nums: [('2', 53592), ('10', 38038), ('9', 26034), ('1', 22109), ('4', 17432), ('5', 16174), ('7', 13489), (…
-
In some rare situations, specific sentences translated from Italian to English come back with "(Translated with Google Translate)" appended to the end of the output sentence. For example, the following It…
-
## 🐛 Bug
I would expect BERTScore to have the following properties:
1) given a single list of sentences, and comparing all pairs as preds and targets, BERTScore should be maximum when the same se…
-
Hi,
I am trying to mine some parallel sentences from two large monolingual corpora (over 40M sentences each). In the first step I encoded the two sides and then called `mine_bitexts.py` to do the mag…
-
# Pronunciation scoring for non-native English
This task is to score the pronunciation of non-native speakers of English. Pronunciation scoring is important in computer-assisted language le…
-
Dear Author, I have a question about something that is confusing me. In the original PURE work, the performance of Rel and Rel+ on the SciERC dataset is 50.1 and 36.8, respectively. What is the metric you are using…
-
The schema requires entries (gene pair units) with the following (minimum) information:
genePairId | gene1 | gene2 | docid | scope
- options for scope: document > paragraph > sentence > event | …
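To make the minimum schema concrete, here is a rough Python sketch of one entry. It is only an illustration under several assumptions: all fields are treated as strings, the `>` in the scope options is read as "broader than" and encoded with arbitrary integer ranks, and the pipe-delimited row parser (`from_row`) and the sample values are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Scope(IntEnum):
    """Scope options from the schema: document > paragraph > sentence > event.

    The integer ranks are an assumption used only to make the ordering comparable.
    """
    EVENT = 1
    SENTENCE = 2
    PARAGRAPH = 3
    DOCUMENT = 4


@dataclass(frozen=True)
class GenePairEntry:
    """One gene pair unit with the minimum required fields (string types assumed)."""
    gene_pair_id: str
    gene1: str
    gene2: str
    docid: str
    scope: Scope

    @classmethod
    def from_row(cls, row: str) -> "GenePairEntry":
        """Parse a hypothetical 'genePairId | gene1 | gene2 | docid | scope' row."""
        fields = [field.strip() for field in row.split("|")]
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {row!r}")
        gene_pair_id, gene1, gene2, docid, scope = fields
        # Scope lookup is by name; an unknown scope raises KeyError.
        return cls(gene_pair_id, gene1, gene2, docid, Scope[scope.upper()])


# Hypothetical values, for illustration only.
entry = GenePairEntry.from_row("gp1 | BRCA1 | TP53 | doc42 | sentence")
print(entry)
```

Encoding the scope as an ordered enum makes it easy to filter entries by granularity (for example, keeping everything at sentence level or narrower), although the real schema may represent scope differently.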
-
I am trying to fine-tune the bge embedding model on my custom dataset. I have used both the MNR and CachedMNR loss functions, but I am not getting any training or validation loss values during training; it pri…
-
While cross encoders have shown better performance than using cosine similarity scores on sentence embeddings, there are no multilingual cross encoders, making this solution only viable for English. E…
-
How does the MultipleNegativesRankingLoss function when used with gradient accumulation steps?
According to the [docs](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mult…