-
I've exported `openai/clip-vit-base-patch32` from HuggingFace into a single-op ONNX model that uses `CLIPTokenizer`. When comparing the behaviour to the original HF tokenizer, I'm seeing an issue with…
-
Your implementation for masking Chinese words is as follows:
```
for index in index_set:
covered_indexes.add(index)
masked_token = None
# 80% of the time, replace with [MASK]
if rng.random() < 0.8:
mas…
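The truncated loop above appears to follow the standard BERT masking recipe (80% replace with `[MASK]`, 10% replace with a random vocabulary token, 10% keep the original). A minimal self-contained sketch of that per-position rule, for reference; the function name and toy vocabulary are mine, not from your code:

```python
import random

MASK_TOKEN = "[MASK]"

def mask_token(original_token, vocab_words, rng):
    """Standard BERT masking for one selected position:
    80% -> [MASK], 10% -> random vocab token, 10% -> unchanged."""
    r = rng.random()
    if r < 0.8:
        return MASK_TOKEN       # 80% of the time, replace with [MASK]
    elif r < 0.9:
        return rng.choice(vocab_words)  # 10%: random replacement
    else:
        return original_token   # 10%: keep the original token

rng = random.Random(12345)
vocab = ["我", "爱", "北京", "天安门"]
masked = [mask_token(t, vocab, rng) for t in ["我", "爱", "北京"]]
```

Each output token is either `[MASK]`, a vocabulary token, or the original, so the distribution can be checked empirically over many draws.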
-
### Software Environment
```Markdown
Windows 10 x64, Anaconda Spyder 5.2.2
- paddlepaddle-gpu: 2.3.2
- paddlenlp: 2.0.1
```
### Duplicate Issues
- [x] I have searched the existing issues
### Bug Description
```Markdown
As soon as I enter debug mode, it throws a Ty…
-
Today I fixed the `python -m bitsandbytes` problem, and it was immediately followed by a new error:
PS F:\新建文件夹> python .\Llama2-Chinese\examples\chat_gradio.py --model_name_or_path .\Llama2-Chinese-7b-Chat\
bin C:\Users\46045\AppData\Local\Programs\Pyt…
-
I'm working on a site for people to log their Chinese progress (videos watched, articles read, etc.). Rather than create my own plugin, I thought it would make more sense to build on top of yours (which I …
-
Traceback (most recent call last):
File "E:\yan\chong\daimaDemo\Flat-Lattice-Transformer-master\Flat-Lattice-Transformer-master\V0\flat_main.py", line 290, in
datasets, vocabs, embeddings = e…
-
I modified three places:
1. model\chinese-bert_chinese_wwm_pytorch\config.json: changed the value of vocab_size to 30522
2. code\sqlnet\model\sqlbert.py, around line 141: added three lines: sel_col_mask = sel_col_mask - 254; where_col_mask = where_col_m…
-
Hi, I have some questions about pre-training:
1. I want to train my own model from scratch and produce the `vocab.txt` from characters. There are some low-frequency words; should low-frequenc…
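On question 1, a common practice is to set a minimum frequency threshold when building `vocab.txt` and let anything below it fall back to `[UNK]` at tokenization time. A hedged sketch of that cutoff, assuming a character-level vocabulary; the function name, special-token list, and threshold are illustrative, not from any particular codebase:

```python
from collections import Counter

def build_char_vocab(corpus_lines, min_freq=5,
                     specials=("[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]")):
    """Count characters across the corpus and keep only those seen at
    least min_freq times; rarer characters map to [UNK] later."""
    counts = Counter(ch for line in corpus_lines for ch in line.strip())
    kept = [ch for ch, c in counts.most_common() if c >= min_freq]
    return list(specials) + kept

corpus = ["今天天气很好", "今天我很好"] * 3
vocab = build_char_vocab(corpus, min_freq=3)
```

Raising `min_freq` shrinks the vocabulary (and the embedding matrix) at the cost of mapping more rare characters to `[UNK]`, so the threshold is usually tuned against corpus coverage.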
-
The Baidu Netdisk link for the archived pretrained model you shared in
[emo_is_all_you_need](https://github.com/growvv/emo_is_all_you_need)
(hfl_chinese_roberta_wwm_ext, with a modified vocab.txt) has expired. Could you please upload it again? I have been studying your code recently; thank you very much!
-
Hi there!
I need to remove specific tokens (certain Chinese tokens) from the Qwen2Tokenizer, and I am not quite sure how to do so. I have tried various methods, shown below, but to no avail.
## …
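I can't see the attempts past the truncation, but one general pitfall when deleting tokens from any tokenizer: naively removing vocabulary entries leaves holes in the id space, while the model's embedding matrix expects dense, contiguous ids. A toy sketch of the remapping that removal requires, assuming a simple `vocab.json`-style token-to-id mapping (this is not the actual Qwen2 vocabulary format, which is byte-level BPE with merge rules, and the matching embedding rows would also need to be re-gathered):

```python
def prune_vocab(token_to_id, tokens_to_remove):
    """Drop unwanted tokens and reassign contiguous ids.
    Deleting entries without remapping leaves gaps in the id space,
    which breaks embedding-matrix indexing."""
    kept = [t for t, _ in sorted(token_to_id.items(), key=lambda kv: kv[1])
            if t not in tokens_to_remove]
    return {t: i for i, t in enumerate(kept)}

vocab = {"hello": 0, "世界": 1, "world": 2, "你好": 3}
pruned = prune_vocab(vocab, {"世界", "你好"})
# ids stay contiguous: {"hello": 0, "world": 1}
```

For a BPE tokenizer, merge rules that produce a removed token would also have to be pruned, which is why shrinking a trained tokenizer in place is rarely supported directly.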