korean-tokenizer Search Results

392 results for korean-tokenizer

allenai/allennlp #5582

Loading a HuggingFace model into AllenNLP gives different pr…

## Checklist - [x] I have verified that the issue exists against the `main` branch of AllenNLP. - [x] I have read the relevant section in the [contribution guide](https://github.com/allenai/al…

santiagxf updated 2 years ago
12
navervision/KELIP #1

KELIP guided diffusion

Thanks for the KELIP! If KELIP can be used with the diffusion model (like CLIP) in Korean, it will be very interesting. I tried [CLIP Guided Diffusion](https://colab.research.google.com/drive/12a_W…

epicure updated 2 years ago
5
huggingface/transformers #15153

It is better to add a function to train additional tokens for t…

https://github.com/huggingface/transformers/blob/96881729ce83cfc8e5fa04c903ee4296ad17cfbb/src/transformers/models/bert/tokenization_bert.py#L117 Lately, I use bert to train a NER model for Chinese…

zhangbo2008 updated 2 years ago
11
getzola/zola #1930

Cannot build zola 0.16.0 from the source code with the `inde…

# Bug Report ## Environment Zola version: 0.16.0 `rustc` version: 1.58.0 Cargo version: 1.58.0 ## Expected Behavior Zola 0.16.0 compiles with the `indexing-ja` feature ## Current Behavi…

toku-sa-n updated 2 years ago
4
facebookresearch/ParlAI #4585

blenderbot2.0 searching engine related question

When I read the paper and saw the model structure image, I understood that DPR and internet search can be done together. But while looking through the code, this question came to me. "Is it possib…

daje0601 updated 2 years ago
12
apache/lucene #2703

contrib intelligent Analyzer for Chinese [LUCENE-1629]

I wrote an Analyzer for Apache Lucene for analyzing sentences in the Chinese language. It's called "imdict-chinese-analyzer"; the project on Google Code is here: http://code.google.com/p/imdict-chinese-ana…

asfimport updated 2 years ago
64
JohnSnowLabs/spark-nlp #2846

LaBSE Sentence Embeddings model output vectors are not equal…

I'm using embeddings from the example https://nlp.johnsnowlabs.com/2020/09/23/labse.html and the output vectors, although close, are not equal to the original vectors https://tfhub.dev/google/LaBSE/1 Why? How o…

Fikavec updated 2 years ago
14
mikemccand/stargazers-migration-test #963

KoreanTokenizer should split unknown words on digits [LUCENE…

Since https://issues.apache.org/jira/browse/LUCENE-8548 the Korean tokenizer groups characters of unknown words if they belong to the same script or an inherited one. This is ok for inputs like Мoscow…

mikemccand updated 2 years ago
18
aalto-speech/morfessor #23

Is the tokenizer.model deterministic?

Hi, I'm developing a tokenizer based on Korean. Since my project is to develop a language model using SRILM's `ngram`, the role of tokenizer is very important. I couldn't experiment because of the l…

somniumism updated 3 years ago
1
mikemccand/stargazers-migration-test #630

How Nori Tokenizer can deal with Longest-Matching [LUCENE-86…

I think the Nori tokenizer has one issue. I don't understand why "Longest-Matching" is NOT working for the Nori tokenizer via config mode (config mode: Here is an example for explaining what is longe…

mikemccand updated 2 years ago
5
