-
The command I run:
'''
python llama3.py --pruning_ratio 0.25 \
--device cuda --eval_device cuda \
--base_model home/Meta-Llama-3-8B \
--block_wi…
-
### Feature Description
Since the `CompleteAccessor` stores the metadata, and `Access::info` returns the `Arc`, we can move logic from `fn metadata(&self) -> Arc` to `impl Layer for CompleteLayer` to …
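The idea can be sketched in plain Rust. Note this is a minimal illustration, not the crate's real `Accessor`/`Layer` traits: the type and field names below are hypothetical stand-ins. The point is that once the metadata `Arc` is stored on the accessor, `metadata()` only needs to clone the handle (a refcount bump), never rebuild the metadata:

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real metadata type.
struct Metadata {
    name: String,
}

// Hypothetical stand-in for the real CompleteAccessor: it stores the
// metadata once, wrapped in an Arc, when it is constructed.
struct CompleteAccessor {
    meta: Arc<Metadata>,
}

impl CompleteAccessor {
    fn new(name: &str) -> Self {
        Self {
            meta: Arc::new(Metadata {
                name: name.to_string(),
            }),
        }
    }

    // Returning a clone of the stored Arc is cheap: no metadata is
    // recomputed, only the reference count is incremented.
    fn metadata(&self) -> Arc<Metadata> {
        self.meta.clone()
    }
}

fn main() {
    let acc = CompleteAccessor::new("fs");
    let a = acc.metadata();
    let b = acc.metadata();
    // Both handles point at the same allocation, proving nothing was rebuilt.
    assert!(Arc::ptr_eq(&a, &b));
    println!("ok: {}", a.name);
}
```

With the `Arc` owned by the accessor, the remaining question is only where the clone happens, which is why the logic can live in the layer implementation instead.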
-
### What happened?
I wanted to use the Kompute version to run on my GPU (Radeon RX 570 4G), but whenever I use the `-ngl` argument to offload to the GPU, `llama-cli` silently exits before loading the model…
-
Hi Andrei,
Noticed your answer in meta-sunxi (I haven't been around much this year).
Seems like there are a number of meta-chip layers out there now.
1. Yours (the one referenced from openembed…
-
Hi,
Would it be possible to move the connman configuration out of [connman_%.bbappend](https://github.com/TechNexion/meta-tn-imx-bsp/blob/mickledore_6.1.55-2.2.0-stable/recipes-connectivity/connman…
-
### Observed behavior
Hello! Docker image version: 2.10.18-alpine3.20
Client:
2.3.2
NATS JS runs in RAFT and has 3 services (not replicas) in Docker Swarm, with an enabled gateway to anothe…
-
### What happened?
The server crashes when changing the LoRA scale while using CUDA. To reproduce it:
- Start the server with a model and a LoRA and load layers to CUDA.
- Then, prompt the mode…
-
This can be used as inspiration: https://github.com/r4fek/django-cassandra-engine/pull/66
-
llama_model_loader: loaded meta data with 32 key-value pairs and 219 tensors from /data/huggingface/hub/models--city96--t5-v1_1-xxl-encoder-gguf/snapshots/005a6ea51a7d0b84d677b3e633bb52a8c85a83d9/./t5…
-
Currently, when scanning the project, the extension prevents recipes that have been skipped by BitBake from being added to the recipes explorer.
As an example, if I want to add the package `virt-manager` fr…