-
The README.md (and the corresponding landing page of the documentation) contains non-functional code. Specifically the following:
### Loading a pretrained tokenizer from the Hub
```rust
use token…
-
Is the image tokenizer for the secondary image already created, or should we create it in the finetune file?
```
config["model"]["observation_tokenizers"]["top"] = ModuleSpec.create(
Ima…
HM102 updated
1 month ago
-
### System Info
- `transformers` version: 4.35.2
- Platform: Linux-5.4.0-163-generic-x86_64-with-glibc2.10
- Python version: 3.8.18
- Huggingface_hub version: 0.19.4
- Safetensors version: 0.4.…
-
### Your current environment
docker pull vllm/vllm-openai:v0.6.2
### Model Input Dumps
docker run --runtime nvidia --gpus all \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "H…
-
**Describe the bug**
Trying to use ollama fails with `Error: No tokenizer found for model`. Changing the model gives the same error.
**To Reproduce**
```
[openai]
api_base = "ht…
-
Hi,
When I add an out-of-vocab character to a tokenizer, I am only able to get the new token ID when I encode it as a whole word, but not as a subword. Is there a parameter that I need to add to the …
-
Hi, I'm processing the huge bread-midi-dataset.
Training the tokenizer throws a `KeyError` in `midi_tokenizer.py`,
line 1712: `ids = [self.vocab[token] for token in tokens]`.
Maybe line 1709 needs a f…
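
For illustration, here is a minimal sketch of how a list comprehension like the one quoted above raises a `KeyError`, and one way to surface the offending tokens instead. The vocabulary, the token names, and the `tokens_to_ids` helper are hypothetical, made up for this example. This is not the library's code or its fix.

```python
# Toy vocabulary and token list (hypothetical; the real vocabulary lives in the tokenizer).
vocab = {"Pitch_60": 0, "Velocity_100": 1, "Duration_1.0": 2}
tokens = ["Pitch_60", "Velocity_100", "Pitch_61"]  # "Pitch_61" is not in vocab

# The quoted pattern fails on the first token that is missing from the vocabulary.
try:
    ids = [vocab[token] for token in tokens]
except KeyError as err:
    print(f"KeyError for token: {err}")


def tokens_to_ids(tokens, vocab):
    """Return (ids, missing): ids of known tokens and the list of unknown tokens."""
    ids, missing = [], []
    for token in tokens:
        if token in vocab:
            ids.append(vocab[token])
        else:
            missing.append(token)
    return ids, missing


ids, missing = tokens_to_ids(tokens, vocab)
print(ids, missing)
```

Collecting the missing tokens this way only helps with diagnosis. Whether they should be skipped or added to the vocabulary depends on the tokenizer configuration, which the report does not show.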
-
I changed the `llm_model_path` to 'yentinglin/Llama-3-Taiwan-8B-Instruct'. Then the bug happened. It seems that the Llama-3-Taiwan-8B-Instruct tokenizer.json does not contain "". GFD is based on "byte…
-
When running
`python lhrs_webui.py -c Config/multi_modal_eval.yaml --server-port 8000 --server-name 127.0.0.1 --share`
I see this error:
```
/home/.pyenv/versions/lhrs/lib/python3.10/site-pa…
-
Create diagrams that explain what is being done in each architecture and each tokenization strategy.