-
Hi,
Thanks a lot for this wonderful project! I tried to distill a model2vec model from `sentence-transformers/paraphrase-multilingual-mpnet-base-v2` by providing it with a vocabulary list for my do…
-
I recently [retried](https://github.com/conda-forge/sentencepiece-feedstock/pull/59) converting a project (https://github.com/google/sentencepiece/) that uses protobuf to build shared libraries by def…
-
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 23.73 out of 50.99 RAM for saving.
100%|██████████| 32/32 [00:19…
4 if True: model.push_to_hub_gguf("mINE", tokenizer, quant…
-
Does it support the vocabulary segmentation trained by SentencePiece?
-
The spm.vocab file only contains mappings from "token" to "token_id".
-
I made a venv, pip installed airllm and then bitsandbytes within that venv, and then copy-pasted the example Python code into `testme.py`. It bailed with the output below:
```
$ python testme.py
…
```
-
%%capture
!pip install unsloth "xformers==0.0.28.post2"
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://gi…
-
Should `MLC_ENABLE_SENTENCEPIECE_TOKENIZER` be on by default in `CMakeLists.txt`? I had to turn it on in order to successfully run `./build_and_run.sh` to build the example target. Otherwise, I get…
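Assuming `MLC_ENABLE_SENTENCEPIECE_TOKENIZER` is a regular CMake option, it can also be enabled at configure time without editing `CMakeLists.txt`:

```shell
# Assumes MLC_ENABLE_SENTENCEPIECE_TOKENIZER is a standard CMake option;
# enable it on the configure line instead of editing CMakeLists.txt.
cmake -B build -DMLC_ENABLE_SENTENCEPIECE_TOKENIZER=ON
cmake --build build
```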
-
Hi, I trained a sentencepiece tokenizer with prefix match. After converting it to an HF tokenizer, the tokenization result is not consistent with the slow tokenizer.
In sentencepiece, we can choose whether to u…
-
I have no clue about this error.
Any suggestions?
I installed sentencepiece with transformers.
![image](https://github.com/arcee-ai/DALM/assets/64459173/c9ac892d-68ab-4ba4-840e-5af4ddd9c05e)