sentence-tokenizer Search Results

1000+ results for sentence-tokenizer


pkulcwmzx/knowledge-boundary #2

Issues about reproducing the baseline and search experiments

There are two issues when reproducing the experiments. ## baseline (P-zero) The P-zero acc of LLaMA2 on KAssess is only 35.22%, much lower than the paper's 50.00%. (`sampling_params = SamplingPara…

BUGLI27 updated 1 week ago
1
OpenNMT/CTranslate2 #1781

Different translation results after converting to ctranslate

Hi. I have fine-tuned the _Helsinki-NLP/opus-mt-zh-vi_ model to translate Chinese to Vietnamese. When I convert the model to ctranslate2, the performance decreases (from 32 sacrebleu with transform…

hieunguyenquoc updated 1 month ago
2
estnltk/estnltk #122

Incompatibility with NLTK 3.8.2 (punkt_tab)

NLTK version 3.8.2 changed the data format of the tokenizers from pickle to text files in order to patch a vulnerability (CVE-2024-39705). Here's the PR in the nltk repo: https://github.com/nltk/n…

peterboost updated 2 months ago
1
explodinggradients/ragas #1247

Facing error for evaluate() for Langchain instance LLM and E…

[ ] I checked the [documentation](https://docs.ragas.io/) and related resources and couldn't find an answer to my question. **Facing an error when using Langchain-wrapped Hugging Face models** I am …

kartik-angadi updated 1 month ago
1
milvus-io/milvus #36751

[Bug]: Creating a collection still succeeds even when using …

### Is there an existing issue for this? - [X] I have searched the existing issues ### Environment ```markdown - Milvus version:zhengbuqian-doc-in-restful-d174d05-20241010 - Deployment mode(standa…

zhuwenxing updated 5 days ago
4
huggingface/peft #2161

Prompt Tuning Crash with Llama-3.2 in torch.embedding

### System Info peft==0.13.2 accelerate==1.0.1 torch==2.4.0 peft_config ```python peft_config = PromptTuningConfig( task_type=TaskType.CAUSAL_LM, prompt_tuning_init=PromptTuningI…

hrsmanian updated 6 days ago
4
opensearch-project/k-NN #2113

[FEATURE] inner_hits in nested neural query should return al…

### What is the bug? I am using text_chunking and text_embedding processor to ingest documents into an index. The [text_chunking search example](https://opensearch.org/docs/latest/search-plugins/text…

yuye-aws updated 1 month ago
13
NaturalNode/natural #294

Question: Punkt Sentence Tokenizer/Segmentation

Read the blog post here: https://dzone.com/articles/using-natural-nlp-module Wondering if `Punkt sentence segmentation` has been added? I don't see it in the README. If not, I can take a crack at it. Or …

pthieu updated 6 years ago
1
KoljaB/RealtimeTTS #106

Question about tokenizer

Is there any way to adjust the tokenizer parameters that control how the tokenizer(?) divides sentences? May I ask how sentence-splitting is done when the program is configured to (being~by) feed generator it…

FivespeedDoc updated 3 months ago
11
segment-any-text/wtpsplit #130

Questions concerning configuring train_lora.py for custom co…

Hi! I followed the instructions for fine-tuning my corpus and (I think) managed to do so successfully after days of debugging. I have A LOT of implementation questions and the following is half-guide,…

eshau updated 6 days ago
8

