-
**Submitting author:** @hauselin (Hause Lin)
**Repository:** https://github.com/hauselin/ollama-r
**Branch with paper.md** (empty if default branch): joss
**Version:** v1.2.0.9000
**Editor:** @crverno…
-
- Paper name: Automatic Instruction Evolving for Large Language Models
- ArXiv Link: https://arxiv.org/pdf/2406.00770
To close this issue, open a PR with a paper report using the provided [report…
-
e.g., with reference to the DeepSpeed code in the TOFU codebase
What type of training? (full fine-tuning vs. PEFT, etc.)
-
# Revolutionize Animation: Build a Digital Human with Large Language Models
A Step-by-Step Guide to Creating Your Next AI-Powered Avatar
[https://monadical.com/posts/build-a-digital-human-with-large…
-
When I run a model like `codellama:34b-code-q6_K`, it does seem to spin up my GPUs, but then I end up with unusable output. I'm running the latest ollama and extension versions on Ubuntu 22.04.
> 2024-03-12 01…
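A quick way to tell whether the garbled output comes from the model itself rather than from the extension is to query the model directly. This is only a sketch: it assumes the `ollama` Python client is installed and the model has already been pulled; the prompt is a made-up placeholder.

```python
# Minimal sketch: query the model directly, bypassing the extension.
# Assumes the `ollama` Python client is installed and the model is pulled locally.
import ollama

result = ollama.generate(
    model="codellama:34b-code-q6_K",
    prompt="def fibonacci(n):",  # placeholder prompt for illustration
)
print(result["response"])  # check whether the raw completion is already unusable
```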
-
**Why**
Llama 3.1 405B was just released, and one of the only ways to get access to it is through AzureAI hosting it.
**Description**
Support AzureAI as a cloud service for non-OpenAI models.
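As a rough illustration of what such support could look like, here is a sketch of calling a serverless Azure AI deployment with the `azure-ai-inference` Python SDK; the endpoint URL, key, and prompt are placeholders, and how this would be wired into the project is not specified by the request.

```python
# Sketch only: calling an Azure AI-hosted chat model (e.g. a Llama 3.1 405B
# deployment) via the azure-ai-inference SDK. Endpoint and key are placeholders.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-deployment>.inference.ai.azure.com",  # placeholder
    credential=AzureKeyCredential("<api-key>"),                   # placeholder
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Hello from Llama 3.1 405B on Azure."),
    ],
)
print(response.choices[0].message.content)
```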
**Req…
-
**Is your feature request related to a problem? Please describe.**
I’m facing an issue when deploying large models in Kubernetes, especially when the pod’s ephemeral storage is limited. Triton Infere…
-
### Your current environment
The output of `python collect_env.py`
```text
When I was using vllm to launch the Qwen2-VL model service, I configured the parameter --enable-prefix-caching. An err…
```
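For context, a minimal reproduction of this configuration through the vLLM Python API might look like the sketch below; the checkpoint name and prompt are placeholders, and `enable_prefix_caching=True` corresponds to the `--enable-prefix-caching` flag mentioned above.

```python
# Sketch of the reported configuration via the vLLM Python API.
# Model name and prompt are placeholders; assumes the Qwen2-VL weights are available.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    enable_prefix_caching=True,  # same setting as the --enable-prefix-caching flag
)

outputs = llm.generate(["Describe the image."], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```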
-
### System information
1.16.2
### What is the problem that this feature solves?
Allows for extracting sub-models from a large model (>2GB). When using this function (both with the loaded mode…
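The function in question appears to be `onnx.utils.extract_model`; a minimal sketch of its path-based usage follows, with file paths and tensor names as made-up placeholders.

```python
# Sketch of path-based sub-model extraction with onnx.utils.extract_model.
# File paths and tensor names are placeholders for illustration.
import onnx

onnx.utils.extract_model(
    input_path="large_model.onnx",    # source model, possibly >2GB with external data
    output_path="sub_model.onnx",     # extracted sub-graph
    input_names=["encoder_input"],    # boundary inputs of the sub-graph
    output_names=["encoder_output"],  # boundary outputs of the sub-graph
)
```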