-
## Description
I was experimenting with the `sentence-transformers/msmarco-roberta-base-ance-firstp` model and observed some discrepancies between the outputs of the tokenizer depending on how the …
-
Suppose we have a template sentence like this:
- "The ____ house is our meeting place."
and we have a list of adjectives to fill in the blank, e.g.:
- "yellow"
- "large"
- ""
Note that on…
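Generating the candidate sentences from the template can be sketched in plain Python. The `____` placeholder and the adjective list (including the empty string) come from the example above; the whitespace cleanup is only needed for the empty-string case:

```python
template = "The ____ house is our meeting place."
adjectives = ["yellow", "large", ""]

# Substitute each adjective into the blank; split/join collapses the
# doubled space left behind when the adjective is the empty string.
sentences = [" ".join(template.replace("____", adj).split()) for adj in adjectives]

for s in sentences:
    print(s)
```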
-
In your custom data loader:
```python
class CustomDataset(Dataset):
    def __init__(self, tokenizer, sentences, labels, max_len):
        self.len = len(sentences)
        self.sentences = sen…
-
There is an issue with '\n' not being handled properly in Llama 3. Passing '\n' through tokenizer.encode yields the token ID 198, but generation does not terminate on it appropriately and…
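When generation does not stop on a given token, one workaround is to check the generated id stream manually. A dependency-free sketch of that loop follows; the id 198 comes from the report above, and `step_fn` is a hypothetical stand-in for a single decoding step:

```python
NEWLINE_ID = 198  # id reported above for '\n' in the Llama 3 tokenizer

def generate_until(step_fn, max_new_tokens, stop_ids):
    """Collect tokens from step_fn, stopping once a stop id is emitted."""
    out = []
    for _ in range(max_new_tokens):
        tok = step_fn(out)
        out.append(tok)
        if tok in stop_ids:
            break
    return out

# Toy step function emitting a fixed stream, just to exercise the loop.
stream = iter([10, 11, NEWLINE_ID, 12])
ids = generate_until(lambda prev: next(stream), 8, {NEWLINE_ID})
print(ids)  # stops at the newline id
```

With recent `transformers` versions a similar effect can usually be achieved without a custom loop by passing a list of stop ids, e.g. `model.generate(..., eos_token_id=[tokenizer.eos_token_id, 198])`.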
-
```java
public static void utf8ToGbk() throws Exception {
    String fileName = "c:/tokenizer.json";
    List<String> lines = Files.readAllLines(Paths.get(fileName), StandardCharsets.UTF_8);
    String sentenc…
```
-
It is possible to mark verbs in German that have a prefix in the tokenizer Python script.
If a word is marked and has the same lemma as another word in the same sentence, I think they 99% of the time belong t…
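The lemma-matching heuristic described above can be sketched in plain Python. The lemma lookup table here is a toy assumption (a real script would take lemmas from its tagger, and would have to disambiguate e.g. the prefix "an" from the preposition "an"):

```python
from collections import defaultdict

# Toy lemma lookup for German separable-prefix verbs (assumption).
LEMMAS = {
    "fängt": "anfangen", "an": "anfangen",
    "hört": "aufhören", "auf": "aufhören",
}

def mark_same_lemma(tokens):
    """Group token positions within one sentence that share a lemma."""
    groups = defaultdict(list)
    for i, tok in enumerate(tokens):
        lemma = LEMMAS.get(tok.lower())
        if lemma:
            groups[lemma].append(i)
    # Keep only lemmas occurring more than once: those are the pairs
    # the heuristic says almost certainly belong together.
    return {lemma: pos for lemma, pos in groups.items() if len(pos) > 1}

sentence = "Er fängt morgen mit der Arbeit an".split()
print(mark_same_lemma(sentence))
```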
-
Hi,
I want to check whether combining tf-idf weights with token embeddings gives a better representation for my use case/data (I would love to know what you think about it).
Searching for implementation…
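One common form of that combination is a tf-idf-weighted average of token embeddings. A self-contained sketch with toy 2-d embeddings follows; the smoothed idf formula mirrors scikit-learn's default, and all names and data here are illustrative:

```python
import math
from collections import Counter

def tfidf(docs):
    """Per-document tf-idf weights; docs is a list of token lists."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # smoothed idf, as in scikit-learn: log((1+n)/(1+df)) + 1
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return weights

def weighted_sentence_vector(doc, w, emb, dim):
    """tf-idf-weighted average of the tokens' embedding vectors."""
    vec, total = [0.0] * dim, 0.0
    # Each distinct token contributes once; its count is already in tf.
    for t in set(doc):
        if t in emb:
            for i, x in enumerate(emb[t]):
                vec[i] += w[t] * x
            total += w[t]
    return [x / total for x in vec] if total else vec

# Toy embeddings and a two-document corpus.
emb = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
docs = [["cat", "dog"], ["cat"]]
w = tfidf(docs)
vec = weighted_sentence_vector(docs[0], w[0], emb, dim=2)
print(vec)
```

Because "dog" is rarer in this corpus, its idf is higher, so the sentence vector leans toward the "dog" embedding.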
-
Cannot load the model.
Code:
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("../../models/consbert/unsup-consert-base-atec_ccks")  # the model path
```
Error messag…
-
Hi,
I use `python pytorch_pretrained_BERT/examples/run_lm_finetuning.py` to fine-tune the model on a monolingual set of sentences. I use the BERT multilingual cased model.
Once the model is fine-tuned, I g…