sentence-tokenizer Search Results

1000+ results for sentence-tokenizer


robinarthur/5pk #6

Go further with mining

Once NLTK is installed and you have a Python console running, we can start by creating a paragraph of text: >>> para = "Hello World. It's good to see you. Thanks for buying this book." Now we wa…

robinarthur updated 7 years ago
1
coqui-ai/TTS #3992

Finetune XTTS for new languages

Hello everyone, below is my code for fine-tuning XTTS for a new language. It works well in my case with over 100 hours of audio. https://github.com/nguyenhoanganh2002/XTTSv2-Finetuning-for-New-Lang…

anhnh2002 updated 4 days ago
18
yeyupiaoling/Whisper-Finetune #88

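After fine-tuning, audio language detection is no longer accurate.

Problem description: in the returned result, dataset/test.wav is recognized as English content, and the returned detection result shows 'language': 'english'. Details: python lang-detect.py --audio_path=dataset/test.wav --model_path=models/whisper-large-v2-finetune/ Loading chec…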

charleybin updated 21 hours ago
9
WanzhengZhu/GRUEN #4

batched inference for grammatical score

I noticed that the lm_score code processes a single sentence at a time. This is pretty slow if you're processing a large amount of data. I wrote a batched version, though it's a bit ugly. This increas…

Jack000 updated 2 years ago
3
agermanidis/videodigest #3

"NLTK tokenizers are missing."

https://github.com/agermanidis/videodigest I installed it on 64-bit Windows 10 Pro, but it did not work. I also installed it in an Ubuntu VirtualBox VM and am getting this error. yl@yl-VirtualBox:~$ videodigest -i /me…

raj6996 updated 4 years ago
1
fastnlp/CPT #82

run_pretrain_bart.sh returns IndexError

Here is the stacktrace of `run_pretrain_bart.sh` error: ``` [rank0]: IndexError: Caught IndexError in DataLoader worker process 0. [rank0]: Original Traceback (most recent call last): [rank0]: F…

shivanraptor updated 1 month ago
13
nltk/nltk #2543

NLTK Sentence tokenizer does not tokenize properly if there …

For example, I have the sentence: 'The first approach, single-molecule simulation, taken by the StochSim simulator, tracks individual molecules and their state (e.g., what other molecules they are bound to) so t…

ZohaibRamzan updated 3 years ago
5
gazgiz/zigzag-news-insight #6

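How to debug predictable errors when splitting articles into sentences

Predictable errors that arise when splitting articles into sentences can be debugged with the following procedure: 1. Selecting and applying a sentence-splitting algorithm. Algorithm selection: in Python, libraries such as nltk or spaCy can be used to split sentences. Because each of these libraries recognizes sentences in a different way, before use, each libr…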

lotusflwrr updated 5 months ago
2
RediSearch/RediSearch #1747

Tokenization vs Guillemet

While doing some testing, I noticed that the tokenizer treats the guillemet punctuation marks `«`, `»` differently from the more common `"`, `'`. Look at this string: «a sentence between guillemet». Your…

Lucabenj updated 2 years ago
1
MaartenGr/KeyBERT #172

Is KeyBERT going to support LLaMA?

Hi, I received an error once I changed the model to `decapoda-research/llama-7b-hf`. Does this error come from sentence-transformer? ValueError: Asking to pad but the tokenizer does not have a pa…

thtang updated 1 year ago
1
