-
I am writing to ask for your help with a problem I am having with the tokenizer. I have been trying to solve it for a while now, but without success.
However, I am having trouble with: Trac…
-
-
While fine-tuning Llama 3 with llama.cpp on my Mac, I encountered this error. I'm a beginner and don't know what caused it; I'd appreciate help from someone more experienced.
The model used is: …
-
Hi. I have a phoneme-based Zipformer model.
Before this [PR](https://github.com/k2-fsa/sherpa-onnx/pull/828), I was able to apply hotwords encoding for phoneme sequences, e.g. `ɪ z/dʒ ʌ s t/b ɛ s t…
-
Background: using the same GGUF model with the same parameters and inputs, and with `--top-k 1` (greedy decoding);
llamafile-0.8.6 vs. llama.cpp b2249
When generating the first token, the distribution of lo…
-
#### Problem description
A gensim model was trained under Python 2.7 on a **Chinese** dataset.
However, we are now using Python 3.6, and some keys in `.vocab.keys()` come back as broken strings, as in the title.
…
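Without seeing the rest of the post, the usual cause is that Python 2 pickles store vocabulary keys as raw bytes, which Python 3 then decodes with the wrong codec. Below is a minimal sketch of the common repair, assuming the keys are UTF-8 bytes that were mis-decoded as Latin-1 (`fix_key` is a hypothetical helper, not part of gensim):

```python
def fix_key(key: str) -> str:
    """Re-encode a mojibake key back to its original bytes, then decode as UTF-8.

    Assumes the string was produced by decoding UTF-8 bytes as Latin-1,
    which is lossless and therefore reversible.
    """
    try:
        return key.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Key was not mojibake (or used a different codec); leave it unchanged.
        return key

# Example: the UTF-8 bytes for "中文" mis-decoded as Latin-1
broken = "中文".encode("utf-8").decode("latin-1")
print(fix_key(broken))  # → 中文
```

If that assumption holds, the whole vocabulary can be remapped in one pass, e.g. `{fix_key(k): v for k, v in model.wv.vocab.items()}` on pre-4.0 gensim; if the source dataset used GBK/GB2312 instead, substitute that codec in the `decode` call.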
-
What plugin do I need to download to handle .xlsx files?
-
Loading data...
Vocab size: 4762
491it [00:00, 98460.66it/s]
40it [00:00, ?it/s]
42it [00:00, 42113.50it/s]
Traceback (most recent call last):
File "C:\Users\dell\Desktop\Chinese-Text-Classifi…
-
I deployed a model, but the Chinese output is garbled. What could be the cause?
For example, here is my model.json:
```js…
-
I want to use the Megatron framework for Chinese NLP pre-training tasks. I currently have Chinese corpus resources and a vocab.txt file. However, most frameworks seem to expect vocab.json and mer…
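For what it's worth, vocab.txt (one token per line, WordPiece-style) and vocab.json plus merges.txt (GPT-2-style BPE) describe different tokenizer families, so a merges file cannot be reconstructed from vocab.txt alone. If only the token-to-id mapping is needed, a hedged sketch of the conversion (`vocab_txt_to_json` is a hypothetical helper, not a Megatron API):

```python
import json

def vocab_txt_to_json(txt_path: str, json_path: str) -> dict:
    """Convert a BERT-style vocab.txt (line number = token id) into a
    vocab.json mapping token -> id.

    Note: this yields only the vocabulary; it does NOT produce the
    merges.txt that a BPE tokenizer additionally requires.
    """
    with open(txt_path, encoding="utf-8") as f:
        # enumerate preserves line numbers as ids, skipping blank lines
        vocab = {line.rstrip("\n"): i for i, line in enumerate(f) if line.strip()}
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese tokens readable in the output
        json.dump(vocab, f, ensure_ascii=False, indent=2)
    return vocab
```

That said, if I recall correctly, Megatron-LM's BERT WordPiece tokenizer can consume a vocab.txt directly (via its `--vocab-file` argument), in which case no conversion is needed at all; it is worth checking the tokenizer options of your Megatron version first.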