-
### Question
The loss drops to 0 when training on the Qwen2 backend:
{'loss': 0.0, 'learning_rate': 0.00015267175572519084, 'epoch': 0.0} …
-
Is this still a tokenization bug? I want to use this model for code. Thanks!
-
```
Some parameters are on the meta device because they were offloaded to the cpu and disk.
Traceback (most recent call last):
  File "C:\Users\15729\Downloads\Qwen2-Boundless-main\Qwen2-Boundless-…
```
-
### Preliminary Remark
The observations presented here are also relevant for the _polmineR_ repository.
### Some Background
The _Bundestag Protokolle_ often employ spacing to enhance readability …
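This spaced-out emphasis (Sperrsatz, e.g. "A b g e o r d n e t e") defeats naive whitespace tokenization. A minimal pre-processing sketch, assuming the goal is simply to merge runs of three or more single letters back into one word (the function name and the threshold are illustrative, not part of polmineR):

```python
import re

def collapse_spaced_emphasis(text: str) -> str:
    """Merge runs of single letters separated by spaces (Sperrsatz)
    back into one word, e.g. 'A b g e o r d n e t e' -> 'Abgeordnete'.
    Heuristic: only collapse runs of three or more single letters."""
    pattern = re.compile(r"\b(?:\w ){2,}\w\b")
    return pattern.sub(lambda m: m.group(0).replace(" ", ""), text)

print(collapse_spaced_emphasis("Beifall bei der S P D"))
```

A real normalizer would need to guard against sequences of genuine one-letter words, but the regex above never fires on fewer than three consecutive single letters.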
-
Hello,
Thank you for your hard work on this project. The tool is incredibly useful, and I appreciate your dedication.
I'd like to propose adding tokenizers/lexers for pattern matching along si…
-
Port the CLIP tokenizer, which leverages byte-level BPE. This tokenizer enables scenarios like StableDiffusion.
May be dependent on https://github.com/dotnet/machinelearning/issues/6992.
Reference:
h…
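For context, byte-level BPE (as in the GPT-2 family of tokenizers that CLIP builds on) first remaps every UTF-8 byte to a printable Unicode character so the merge rules can operate on visible symbols. A minimal Python sketch of that remapping step only (not the proposed .NET port, and not the merge loop itself):

```python
def bytes_to_unicode():
    """GPT-2/CLIP-style byte -> printable-unicode mapping, so every
    UTF-8 byte has a visible, BPE-safe character representation."""
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("\u00a1"), ord("\u00ac") + 1))
          + list(range(ord("\u00ae"), ord("\u00ff") + 1)))
    cs = bs[:]
    n = 0
    # Remaining bytes (control chars, space, ...) get fresh code points.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))

byte_encoder = bytes_to_unicode()
word = "".join(byte_encoder[b] for b in "café".encode("utf-8"))
print(word)  # the two bytes of 'é' become two printable characters
```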
-
## User Story
As a speaker of a minority language in the Philippines that uses a `-` as a letter, I want to be able to customize the tokenization of tC so that many of the words in my language are …
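One way to read this request: the word pattern should be configurable so that `-` counts as a letter. A hedged Python sketch of such a regex-based tokenizer (the `WORD` pattern and the example word are illustrative, not tC's actual API):

```python
import re

# Hypothetical word pattern in which '-' counts as a letter, so
# hyphen-internal words stay one token instead of being split.
WORD = re.compile(r"[A-Za-z\-']+|\d+|[^\w\s]")

def tokenize(text: str) -> list[str]:
    """Tokenize, treating '-' as part of a word."""
    return WORD.findall(text)

print(tokenize("mag-aaral na bata"))
```

With a default tokenizer, `mag-aaral` would typically be split into three tokens (`mag`, `-`, `aaral`); the pattern above keeps it whole.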
-
Originally from @SNvMK in https://github.com/microsoft/vscode/issues/120734
So, in Python 3.10, there is match/case syntax. Currently, the keywords are just white words (for Monokai). I'd like it if you added high…
-
**Describe the bug**
Error when tokenizing training data:
```
QASRLTask
[train]: /scratch/bowman/IRT_Experiments/jiant-2/experiments/tasks/data/qasrl/train.jsonl.gz
[val]: /scratch/bowman/IRT…
-
Annotations of contractions (mainly *au*, *aux*, *du* and *des*) are not consistent among French treebanks.
Whereas *au* and *aux* are easy to manage as multiword tokens ([Tokenization and Word Seg…
bguil updated 4 years ago
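For context, this is roughly how a contraction such as *au* (= *à* + *le*) is represented as a multiword token in CoNLL-U (columns simplified, annotations illustrative):

```
1-2	au	_	_
1	à	à	ADP
2	le	le	DET
```

The `1-2` range line carries the surface form, while the syntactic words *à* and *le* get their own lines; *du* and *des* would follow the same scheme with *de*.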