-
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Output from the Hugging Face Transformers APIs in a local environment
```
You are an oracle who knows the answer…
-
### System Info
```shell
System: IBM Power10
Kernel: 5.14.0-362.13.1.el9_3.ppc64le
OS: RHEL 9.3
Framework versions:
optimum==1.16.2
transformers==4.36.2
torch==2.0.1
onnx==1.13.1
onnxruntime…
-
Great job! I ran it locally on macOS; the speed is very fast and the output is quite good. I want to try it on an Android phone, but it seems that is not supported for now.
-
**Describe the bug**
Error while serving a toy model using EasyDel with the following exception.
**To Reproduce**
Context
* Using Google TPU VM
* Followed the instructions suggested here: ht…
-
Why are there no instructions for using open-source embedding models? The notebook does not work, because you cannot create and fill docs/data unless you are paying for the OpenAI API.
-
The current code in DataCollatorForCompletionOnlyLM assumes that the first detected occurrence of `instruction_template` comes before the first detected occurrence of `response_template`. This is reasona…
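To make the assumption concrete, here is a minimal pure-Python sketch of the completion-only masking idea. This is an illustration only, not TRL's actual implementation; `find_all` and `mask_labels` are hypothetical helpers, and templates are given as token-id sublists:

```python
IGNORE_INDEX = -100  # tokens with this label are excluded from the loss

def find_all(seq, sub):
    """Return the start index of every occurrence of `sub` in `seq`."""
    return [i for i in range(len(seq) - len(sub) + 1) if seq[i:i + len(sub)] == sub]

def mask_labels(input_ids, instruction_template, response_template):
    """Keep labels only for tokens after each response template,
    up to the next instruction template (or end of sequence)."""
    labels = list(input_ids)
    inst_starts = find_all(input_ids, instruction_template)
    resp_ends = [s + len(response_template)
                 for s in find_all(input_ids, response_template)]
    keep = set()
    for start in resp_ends:
        # the completion span runs until the next instruction begins
        nxt = min([i for i in inst_starts if i > start], default=len(input_ids))
        keep.update(range(start, nxt))
    return [tok if i in keep else IGNORE_INDEX for i, tok in enumerate(labels)]
```

The pairing step (`i > start`) is where an ordering assumption like the one described above matters: a response occurrence detected before any instruction occurrence changes which spans get paired.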
-
`ggml.c:4772` -> `b->type == GGML_TYPE_I32`
Using TinyLlama-v1.0-Q5. The same model works on Android and PC with no issues.
-
**Integrate the TinyLlama chat models with transformers.js**
TinyLlama is a tiny yet strong text generation/chat model.
It is useful for many applications:
- Assisting speculative decoding of larg…
-
I am trying to fine-tune Llama 2 7B with QLoRA on 2 GPUs. From what I've read, SFTTrainer should support multiple GPUs just fine, but when I run this I see one GPU with high utilization and one with al…
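For what it's worth, a common pattern for QLoRA with data parallelism (as opposed to sharding one model across the GPUs) is to launch one process per GPU and place each process's full model replica on its own device via `device_map`. A minimal sketch of that mapping, where the helper name is mine and the `{"": index}` form maps the root module to a single device:

```python
def ddp_device_map(local_process_index: int) -> dict:
    """Map the whole model ("" = root module) onto one GPU per data-parallel rank.

    Sketch only: in practice the index would come from the launcher
    (e.g. accelerate's PartialState().local_process_index) and the dict
    would be passed as device_map= when loading the quantized model.
    """
    return {"": local_process_index}

# Rank 0 keeps its replica on device 0, rank 1 on device 1, and so on.
print(ddp_device_map(0))  # → {'': 0}
```

When each rank owns its own replica like this, both GPUs should show similar utilization; one busy GPU and one idle GPU usually suggests the model was split across devices (e.g. `device_map="auto"`) rather than replicated.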
-
I'm trying to set up activation checkpointing for training a larger model. I've adjusted the code to:
```
# tinyllama.py
strategy = FSDPStrategy(
auto_wrap_policy={Block},
…