-
Hi,
I followed the instructions [here](https://github.com/nod-ai/SHARK-Turbine/tree/main/models/turbine_models/custom_models) to compile the Llama model into a .vmfb.
I specified the quantization as 4-bit…
-
This snippet will cause memory usage to rise indefinitely:
```python
from transformers import AutoTokenizer
import gc
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v…
```
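The excerpt is cut off above; as a hedged guess at the kind of loop that shows the behaviour (the loop, the iteration count, and the full model id `TinyLlama/TinyLlama-1.1B-Chat-v1.0` are assumptions, not the original snippet):

```python
from transformers import AutoTokenizer
import gc

# Hypothetical repro sketch: repeatedly re-load the tokenizer and force a
# garbage-collection pass; if resident memory still climbs on every
# iteration, something inside the library is retaining references.
for _ in range(100):
    tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
    del tokenizer
    gc.collect()
```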
-
Maybe some of the TinyLlama, Phi, or Qwen2 small models?
-
[Jlama](https://github.com/tjake/Jlama) is a fast, modern Java library for running many LLMs.
Jlama is built on Java 21 and utilizes the [Panama Vector API](https://openjdk.org/jeps/448) for fast infe…
-
https://lightning.ai/khaliq88/vision-model/studios/prepare-the-tinyllama-1t-token-dataset/terminal?fullScreen=true
-
### Description of the bug:
Hi, I have converted TinyLlama to the TFLite format, but when I open it in `https://netron.app/` it shows a custom op. Can I find out how this op is used in TensorFlow?
When I don't expand it…
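Not part of the original report, but one way to get a per-operator breakdown of a converted model (custom ops included) from Python is TensorFlow's TFLite analyzer; the file name below is a placeholder:

```python
import tensorflow as tf

# Print an operator-by-operator report of the converted model; custom ops
# appear in this listing alongside the builtin TFLite ops.
# "tinyllama.tflite" is a placeholder for the actual converted file.
tf.lite.experimental.Analyzer.analyze(model_path="tinyllama.tflite")

# Loading the model with the interpreter is a quick sanity check: a custom op
# without a registered kernel will fail when tensors are allocated.
interpreter = tf.lite.Interpreter(model_path="tinyllama.tflite")
interpreter.allocate_tensors()
```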
-
## Describe the bug
The README page has an example of running community_tasks:
```bash
lighteval accelerate \
--model_args "pretrained=HuggingFaceH4/zephyr-7b-beta" \
--use_chat_…
```
-
I was just wondering whether the methods used in OnnxStream would further benefit a tiny language model like [TinyLlama](https://github.com/jzhang38/TinyLlama)? Just wanted to know how far resource usage…
-
**Describe the bug**
When loading TinyLlama or Llama-3-8B with dtype=int4, the model structure looks like:
```
LlamaForCausalLM(
(model): LlamaModel(
(embed_tokens): Embedding(128256, 4096)
…
```
-
Hi Jiawei,
I was trying GaLore on TinyLlama-1B using the codebase https://github.com/jzhang38/TinyLlama on 4× A800-80GB GPUs. I encountered the following error:
```
[rank1]: optimizer.step()
…
```
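For reference (not taken from the original report), the usage pattern documented in the GaLore README puts the 2-D weight matrices in a dedicated parameter group for `GaLoreAdamW`; the toy model, rank, and scale values below are illustrative assumptions rather than the issue's actual settings:

```python
import torch
from galore_torch import GaLoreAdamW

# Toy model standing in for TinyLlama; a real run would pass the
# LlamaForCausalLM parameters instead.
model = torch.nn.Linear(256, 256)

# GaLore is enabled per parameter group; 'rank', 'update_proj_gap', 'scale',
# and 'proj_type' follow the pattern in the GaLore README, with illustrative
# values (the settings used in the issue are not shown in the excerpt).
param_groups = [
    {"params": [model.weight], "rank": 128, "update_proj_gap": 200,
     "scale": 0.25, "proj_type": "std"},
    {"params": [model.bias]},  # non-GaLore parameters fall back to plain AdamW
]
optimizer = GaLoreAdamW(param_groups, lr=1e-4)

# One dummy step, mirroring the optimizer.step() call in the traceback above.
loss = model(torch.randn(4, 256)).sum()
loss.backward()
optimizer.step()
```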