learned-tokenization Search Results

279 results for learned-tokenization


ggerganov/llama.cpp #2310

Tokenization is not equal to Meta's tokenization.

I'm comparing the tokenization between the original Meta repo and llama.cpp with LLaMA (I also had the same issue with LLaMA v2). For example, tokenizing the prompt "Hello world" and " Hello world" gives the…

viniciusarruda updated 1 year ago
24
tc39/proposal-record-tuple #10

Collect developer feedback about the ergonomics of `#{ }`/`#…

**Notice by @rricard:** This issue is now only open to discuss the following alternatives: `#{ }`/`#[ ]`, `@{ }`/`@[ ]`, ~~`{| |}`/`[| |]`~~ or `{| }`/`[| ]` --- _Original issue text:_ Most p…

littledan updated 7 months ago
480
SKTBrain/KoBERT #106

Please help: development environment problem [BUG]

## 🐛 Bug No module named 'kobert' No module named 'glounnlp' colab The code is not running after the update. Since the Colab update, the code no longer runs. It previously worked without any problems. ## To Reproduce The bug …

cwoonb updated 1 year ago
4
flairNLP/flair #2912

Spacing injected by Flair into Named Entities skewing report…

When doing sequence tagging for Named Entities, Flair is injecting spaces around punctuation inside the Span itself (which I suspect is due to the tokenization being applied). I previously report…

None-Such updated 1 year ago
1
invoke-ai/InvokeAI #1887

Textual Inversion Instructions for Mac

### Is there an existing issue for this? - [X] I have searched the existing issues ### Contact Details twitter @joshdance ### What should this feature add? Reading the docs here - https://invoke-…

joshdance updated 1 year ago
19
OpenNMT/Tokenizer #313

A strange segmentation occurs with a Thai example.

Is this the expected behavior ? ```python import pyonmttok th_string = "คุณอาจจะทำอย ่ างนั ้ นไปซักพัก จนคุณเริ ่ มจะรู ้ สึกถึงมันจริงๆ" tokenizer = pyonmttok.Tokenizer("aggressive", joiner_anno…

l-k-11235 updated 1 year ago
6
microsoft/vscode #162320

Truncated lines are not obvious in the text editor

Testing microsoft/vscode-jupyter#11463, I (re)learned today that we truncate long lines. My setup: 1) Open insiders.vscode.dev 1) Install Jupyter and Python extensions 1) Create a new untitled j…

kieferrm updated 1 year ago
1
huggingface/tokenizers #247

How to add some new special tokens to a pretrained tokenizer…

Hi guys. I want to add some new special tokens like `[XXX]` to a pretrained `ByteLevelBPETokenizer`, but I can't find how to do this in python.

ky941122 updated 1 year ago
27
joelansbro/nlp-methods #1

Draft NLP Methods

I am creating a repo with some draft NLP methods that will assist me with the further work of the text processing later on down the road. The ones I will cover for sure are: * Tokenization * Basic Pre…

joelansbro updated 2 years ago
1
zhongkaifu/Seq2SeqSharp #67

Issues to get started with "Seq2SeqClassificationConsole"

I am having trouble getting started with setting up a demo for the Seq2SeqClassificationConsole app. I assume that, as a newbie, I am not setting up the training data correctly. Can you please point me the way to setup …

TodayAI updated 1 year ago
33
