tokenizer Search Results

1000+ results for tokenizer

noamgat/lm-format-enforcer #117

Question about Source Code

Hello, I would first like to thank you for open-sourcing such a well-designed and high-quality code base. I am reading the source code, and I have a question about this part (integrations.transformers.py…

Acatsama0871 updated 3 days ago
1
kkebo/zyphy #61

Tokenizer is slow

## Summary The current implementation of `Tokenizer` is slower than html5ever. I believe that it can become as fast as html5ever. ## The current benchmark results From #59 Running on the M…

kkebo updated 3 weeks ago
13
jumon/whisper-punctuator #9

AttributeError: 'Tokenizer' object has no attribute 'tokeniz…

I was trying to use this module, but I am getting this error: `AttributeError: 'Tokenizer' object has no attribute 'tokenizer'` ![image](https://github.com/jumon/whisper-punctuator/assets/737480…

Gokulraam2257 updated 2 months ago
1
huggingface/transformers #29159

[tokenizer] Inconsistent behavior in slow tokenizer and fast…

### System Info - `transformers` version: 4.35.2 - Platform: Linux-5.4.0-163-generic-x86_64-with-glibc2.10 - Python version: 3.8.18 - Huggingface_hub version: 0.19.4 - Safetensors version: 0.4.…

Ki-Seki updated 2 months ago
5
bmaltais/kohya_ss #2486

Can't load tokenizer

Can't load tokenizer for 'openai/clip-vit-large-patch14'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, …

YikaPanic updated 1 month ago
1
roife/emt #1

How to customize the tokenizer default behavior?

Excellent work! It seems that NLTokenizer splits the text but ignores the sign character. Example: for emt-lib-path, the output is emt, -lib, -path, but the expected output is emt, -, lib, -, path. Is there any method to c…

sincebyte updated 4 weeks ago
1
microsoft/onnxruntime-extensions #724

ValueError: Unsupported processor/tokenizer: Qwen2Tokenizer

Hi, I'm trying to export my tokenizer, and I followed this short guide: [Guide](https://onnxruntime.ai/docs/extensions/) Now, using: tokenizer = AutoTokenizer.from_pretrained(onnx_path, use_fast=Fa…

Wonder1905 updated 1 month ago
1
helpmefindaname/transformer-smaller-training-vocab #14

New Tokenizer for mdeberta-v3-base

Thank you for your repo! Would it be possible to add a tokenizer for this model? https://huggingface.co/microsoft/mdeberta-v3-base Thanks in advance :)

zynos updated 1 week ago
1
InternLM/xtuner #788

[BUG] Bug report about qwen2

In datasets/utils.py there is a bug concerning qwen2: **QWen2Tokenizer** should be changed to **Qwen2Tokenizer** `def get_bos_eos_token_ids(tokenizer): if tokenizer.__class__.__name__ in [ 'QWenTokenizer', 'QWen2Tok…

macheng6 updated 1 week ago
2
ggerganov/llama.cpp #7667

Bug: Phi-2 model tokenizer not recognized

### What happened? Despite [phi-2](https://huggingface.co/microsoft/phi-2) being listed [here](https://github.com/ggerganov/llama.cpp/blob/2e32f874e675f7bc5307cb7b4470ddbe090bab8f/README.md?plain=1#L…

saeid93 updated 1 day ago
2
