-
Hi,
Thanks a lot for this wonderful project! I tried to distill a model2vec model from `sentence-transformers/paraphrase-multilingual-mpnet-base-v2` by providing it with a vocabulary list for my do…
-
I recently [retried](https://github.com/conda-forge/sentencepiece-feedstock/pull/59) converting a project (https://github.com/google/sentencepiece/) that uses protobuf to build shared libraries by def…
-
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 23.73 out of 50.99 RAM for saving.
100%|██████████| 32/32 [00:19…
4 if True: model.push_to_hub_gguf("mINE", tokenizer, quant…
-
Does it support the vocabulary segmentation trained by SentencePiece?
-
The spm.vocab file only contains mappings from "token" to "token_id".
-
I made a venv, pip installed airllm and then bitsandbytes within that venv, and then copy-pasted the example Python code into `testme.py`. It bailed with the output below:
```
$ python testme.py
…
```
-
%%capture
!pip install unsloth "xformers==0.0.28.post2"
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://gi…
-
Should `MLC_ENABLE_SENTENCEPIECE_TOKENIZER` be on by default in `CMakeLists.txt`? I had to turn it on in order to successfully run `./build_and_run.sh` to build the example target. Otherwise, I get…
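Assuming `MLC_ENABLE_SENTENCEPIECE_TOKENIZER` is a regular CMake option, it can also be enabled at configure time without editing `CMakeLists.txt`:

```shell
# Assumes MLC_ENABLE_SENTENCEPIECE_TOKENIZER is a standard CMake option;
# enable it on the configure line instead of editing CMakeLists.txt.
cmake -B build -DMLC_ENABLE_SENTENCEPIECE_TOKENIZER=ON
cmake --build build
```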
-
Hi, I trained a sentencepiece tokenizer with prefix match. After converting it to an HF tokenizer, the tokenization result is not consistent with the slow tokenizer.
In sentencepiece, we can choose whether to u…
-
I have no clue about this error.
Any suggestions?
I installed sentencepiece with transformers.
![image](https://github.com/arcee-ai/DALM/assets/64459173/c9ac892d-68ab-4ba4-840e-5af4ddd9c05e)