-
__With the help of our awesome sponsors, I'm happy to announce that the '[Carolina Reaper](https://squidfunk.github.io/mkdocs-material/insiders/#10000-carolina-reaper)' funding goal has been reached, …
-
This is the same issue that I mentioned to
Unlike the standard analyzer, the nori analyzer removes the decimal point.
The nori tokenizer removes the "." character by default.
In this case, it is difficult to inde…
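Assuming the snippet above refers to Elasticsearch's `nori_tokenizer`, a minimal sketch of index settings that keeps punctuation might look like the following. The tokenizer exposes a `discard_punctuation` option that defaults to `true`; the tokenizer and analyzer names below are made up for illustration:

```python
# Hypothetical Elasticsearch index settings (names are illustrative).
# nori_tokenizer discards punctuation by default, which is why "1.5"
# loses its "."; setting discard_punctuation to false keeps it.
settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "nori_keep_punct": {
                    "type": "nori_tokenizer",
                    "discard_punctuation": "false",
                }
            },
            "analyzer": {
                "korean_with_decimals": {
                    "type": "custom",
                    "tokenizer": "nori_keep_punct",
                }
            },
        }
    }
}
```

Whether keeping punctuation is acceptable depends on the rest of the analysis chain, since it also preserves other punctuation tokens.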
-
There is a dictionary similar to IPADIC but for Korean called mecab-ko-dic:
It is available under an Apache license here:
https://bitbucket.org/eunjeon/mecab-ko-dic
This dictionary was built with MeC…
-
[GPTQ](https://arxiv.org/abs/2210.17323) is currently the SOTA one-shot quantization method for LLMs.
GPTQ supports remarkably low 3-bit and 4-bit weight quantization, and it can be applied to LLaMa.
…
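To make "4-bit weight quantization" concrete, here is a sketch of naive per-row round-to-nearest quantization in NumPy. This is not the GPTQ algorithm itself (GPTQ additionally uses approximate second-order information to compensate rounding error column by column); it is the baseline that GPTQ improves on:

```python
import numpy as np

def quantize_rtn_4bit(w):
    """Per-row symmetric 4-bit round-to-nearest quantization.

    NOT GPTQ: just the naive baseline, shown to illustrate what storing
    weights in 4 bits means mechanically. 4-bit signed range is [-8, 7].
    """
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # per-row scale
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from 4-bit codes.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 16)).astype(np.float32)
q, s = quantize_rtn_4bit(w)
w_hat = dequantize(q, s)
```

The rounding error of this baseline is bounded by half a quantization step per weight; GPTQ's contribution is redistributing that error so the layer's *output* error stays small even at 3-4 bits.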
-
The error is
`RuntimeError: split_with_sizes expects split_sizes to sum exactly to 1564 (input tensor's size at dimension 0), but got split_sizes=[21, 9, 18, 24, 27, 18, 36, 16, 38, 14, 24, 39, 7, 6,…
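The invariant behind this error is that the split sizes must sum exactly to the tensor's length along the split dimension. A plain-Python sketch of the same contract (list-based for illustration, not the torch API itself):

```python
def split_with_sizes(xs, sizes):
    """Mimic torch.split_with_sizes' contract on a plain list:
    sizes must sum exactly to len(xs), otherwise raise."""
    if sum(sizes) != len(xs):
        raise ValueError(
            f"split_with_sizes expects split_sizes to sum exactly to "
            f"{len(xs)}, but got split_sizes summing to {sum(sizes)}"
        )
    out, i = [], 0
    for n in sizes:
        out.append(xs[i:i + n])
        i += n
    return out

parts = split_with_sizes(list(range(6)), [2, 1, 3])
```

So the error above means the list of per-group sizes passed to the split no longer matches the tensor actually being split, which typically points at the two being computed from inconsistent inputs.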
-
Since #9594 the Korean tokenizer groups characters of unknown words if they belong to the same script or an inherited one. This is ok for inputs like Мoscow (with a Cyrillic М and the rest in Latin) b…
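A rough sketch of how a mixed-script token such as Мoscow can be detected in Python, using the first word of each character's Unicode name as an approximate script label (Lucene's actual implementation uses proper Unicode script values, not names):

```python
import unicodedata

def scripts(token):
    """Approximate per-character script detection.

    unicodedata exposes no direct script property, so this uses the
    leading word of each character's Unicode name (e.g. "CYRILLIC",
    "LATIN") as a stand-in; a sketch only, not Lucene's logic.
    """
    out = set()
    for ch in token:
        name = unicodedata.name(ch, "")
        if name:
            out.add(name.split()[0])
    return out

# "Мoscow" written with a Cyrillic capital Em followed by Latin letters:
mixed = "\u041coscow"
```

Here `scripts(mixed)` reports both Cyrillic and Latin, which is exactly the kind of token the grouping change has to decide how to handle.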
-
### System Info
ubuntu 18.04
python 3.6, 3.9
transformers 1.18.0
### Who can help?
@patrickvonplaten, @anton-l
### Information
- [X] The official example scripts
- [ ] My own modifie…
-
For Nori, the Korean analyzer, there is a Korean dictionary named mecab-ko-dic, which is available under an Apache license here:
The dictionary hasn't been updated in Nori although it has some upd…
-
There is a rare case, which we (Amazon Product Search) found in our tests, that causes an AssertionError in the backtrace step of JapaneseTokenizer.
If there is a text span of length 1024 (determined b…
-
Korean analyzer (nori) javadoc needs example schema settings.
I'll create a patch.
---
Migrated from [LUCENE-8453](https://issues.apache.org/jira/browse/LUCENE-8453) by Tomoko Uchida (@mocobeta), …