-
I'm using jieba to tokenize my Chinese documents, as suggested here in the issues and in the documentation. The documentation also says that if I use a vectorizer, I cannot use a candid…
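For context, here is a minimal sketch of my setup, assuming the documented pattern of passing a jieba-based tokenizer to scikit-learn's `CountVectorizer` (the helper name `tokenize_zh` is mine):
```python
import jieba
from sklearn.feature_extraction.text import CountVectorizer

def tokenize_zh(text):
    # jieba.lcut segments a Chinese string into a list of tokens
    return jieba.lcut(text)

vectorizer = CountVectorizer(tokenizer=tokenize_zh)
```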
-
### System Info / 系統信息
```
2024-08-27 04:33:13,423 xinference.core.supervisor 124341 INFO Xinference supervisor 0.0.0.0:18896 started
2024-08-27 04:33:13,500 xinference.core.worker 124341 INFO S…
```
-
Hi! Is this a mistake? There should be 17 instead of 5 at the end.
![Screenshot 2022-07-08 at 17 41 45](https://user-images.githubusercontent.com/33065236/178014793-4c788364-c338-43e1-abb7-d33c7c4e5c…
-
A brief analysis of the default Tokenizer shows:
```
# decode the encoded input/target and compare against the raw strings
print(tokenizer.decode(encoding_val['input_ids'][0]))
print(input_val[0])
print(output_val[0])
print(tokenizer.decode(target_encoding_val['input_ids'][0]))
```
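For anyone who wants to reproduce this round trip in isolation, a self-contained sketch (the checkpoint name `t5-small` and the sample sentence are placeholders of mine, assuming a Hugging Face tokenizer):
```python
from transformers import AutoTokenizer

# placeholder checkpoint; any Hugging Face model with a tokenizer works here
tokenizer = AutoTokenizer.from_pretrained("t5-small")

encoding_val = tokenizer(["an example input sentence"], return_tensors="pt")
# decoding should reproduce the input text plus any special tokens (e.g. </s>)
print(tokenizer.decode(encoding_val["input_ids"][0]))
```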
-
Moved from https://github.com/balanced/balanced-php/issues/84
balanced.js is the preferred method of tokenization. Consider warning marketplaces that use the direct API tokenization method.
-
## Expected Behavior
Compound words (e.g. pick-me-up, hand-me-down, know-it-all, etc.) should be tokenized as single tokens.
## Actual Behavior
Hyphens are treated as separators, and the componen…
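A quick regex illustration of the two behaviors (not the project's actual tokenizer, just a sketch contrasting the default word pattern with one that keeps hyphenated compounds intact):
```python
import re

text = "a pick-me-up, a hand-me-down, and a know-it-all"

# default word pattern: hyphens act as separators (the actual behavior)
print(re.findall(r"\w+", text))
# ['a', 'pick', 'me', 'up', 'a', 'hand', 'me', 'down', 'and', 'a', 'know', 'it', 'all']

# word pattern that keeps hyphenated compounds as single tokens (the expected behavior)
print(re.findall(r"\w+(?:-\w+)*", text))
# ['a', 'pick-me-up', 'a', 'hand-me-down', 'and', 'a', 'know-it-all']
```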
-
```
xtuner chat /root/autodl-tmp/add --prompt-template default
Traceback (most recent call last):
  File "/root/ChatGLM3/xtuner/xtuner/tools/chat.py", line 491, in <module>
    main()
  File "/root/ChatGLM3/xtuner/x…
```
-
I am getting a maximum recursion depth error after running the following command:
```
python qlora.py --model_name_or_path decapoda-research/llama-7b-hf
```
And this is the error I got:
```
File "/home/at…
```
-
### Question
I got a loss of 0 when training on the Qwen2 backend:
```
{'loss': 0.0, 'learning_rate': 0.00015267175572519084, 'epoch': 0.0} …
```
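One thing worth checking (an assumption on my part, not confirmed by the log above): if the prompt template masks every target token with -100, no position contributes to the cross-entropy loss, and the reported loss can come out as 0. A minimal sketch of that sanity check, using a made-up `labels` tensor:
```python
import torch

# hypothetical fully masked sample: every position carries the ignore index -100
labels = torch.tensor([[-100, -100, -100, -100, -100]])

# count the positions that actually contribute to the loss
supervised = (labels != -100).sum().item()
print(supervised)  # 0 -> nothing for the loss to be computed from
```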
-
Hello,
Thank you for your hard work on this project. The tool is incredibly useful, and I appreciate your dedication.
I'd like to propose adding tokenization/lexers for pattern matching along si…