-
**LocalAI version:** v1.25.0-40-g5661740 (56617409903bde702699a736530053eb4146aec8)
**Environment, CPU architecture, OS, and Version:** macOS, Apple M1 Max Pro
**Describe the bug**
`llama-2-chat-m…
-
Efficient Streaming Language Models with Attention Sinks [paper](https://arxiv.org/abs/2309.17453)
This repo has already implemented it:
[attention_sinks](https://github.com/tomaarsen/attention_si…
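As a rough illustration of the idea (not the linked repo's actual API), the attention-sink KV-cache policy can be sketched as keeping the first few "sink" tokens plus a sliding window of the most recent tokens, evicting everything in between. The function and parameter names below are illustrative only:

```python
def evict_kv_cache(cache, sink_size=4, window_size=1020):
    """Sketch of attention-sink eviction: keep the first `sink_size`
    cache entries (the attention sinks) plus the most recent
    `window_size` entries, dropping everything in between.

    `cache` stands in for a per-layer list of KV entries; this is a
    conceptual sketch, not the attention_sinks repo's implementation.
    """
    if len(cache) <= sink_size + window_size:
        return list(cache)  # nothing to evict yet
    return list(cache[:sink_size]) + list(cache[-window_size:])
```

With `sink_size=2` and `window_size=3`, a cache of ten entries `[0..9]` would be reduced to `[0, 1, 7, 8, 9]`.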
-
Hello
Please find attached the supporting files and an issue description, summarised below.
Best regards
Attilio Brighenti
S.A.T.E. Systems and Advanced Technologies Engineering S.r.l.
Ve…
-
https://iancoleman.io/bip39/
You will need to study how wallets are created in the above implementation.
To create a wallet, you first need to create a profile, which in essence is a mn…
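As a minimal sketch of what the linked BIP39 tool does under the hood (assuming standard BIP39; this is not that site's code), a mnemonic is derived by appending a SHA-256 checksum to the entropy and splitting the result into 11-bit groups. The sketch below returns wordlist *indices* rather than words, since the 2048-word list isn't reproduced here:

```python
import hashlib

def mnemonic_indices(entropy: bytes) -> list:
    """BIP39 sketch: append a checksum of (entropy_bits / 32) bits taken
    from SHA-256(entropy), then split the combined bit string into
    11-bit groups, each an index into the 2048-word list."""
    ent_bits = len(entropy) * 8
    cs_bits = ent_bits // 32
    checksum = hashlib.sha256(entropy).digest()
    # Combined bit string: entropy followed by the checksum bits.
    bits = (int.from_bytes(entropy, "big") << cs_bits) | (checksum[0] >> (8 - cs_bits))
    total = ent_bits + cs_bits
    return [(bits >> (total - 11 * (i + 1))) & 0x7FF for i in range(total // 11)]
```

For 16 bytes (128 bits) of entropy this yields 12 indices; mapping them through the BIP39 English wordlist gives the familiar 12-word mnemonic.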
-
```
Does not work with android 4.0 takes up 1/4 the screen only
```
Original issue reported on code.google.com by `jamez243` on 2 Dec 2011 at 2:50
-
-
### What is the issue?
When I load a large model that doesn't fit in VRAM, Ollama crashes:
```
➜  ~ ollama run dbrx:132b-instruct-q8_0
Error: llama runner process has terminated: signal: segmentation …
```
-
### Have you searched for similar requests?
Yes
### Is your feature request related to a problem? If so, please describe.
Many multilingual LLMs unintentionally mix languages when you ask it …
-
**Date**: TBD
**Time**: TBD
**Location**: TBD
**Vidyo for remote participants** (just click on the link to proceed): [CERN_OpenScience](http://vidyoportal.cern.ch/flex.html?roomdirect.html&key=Y71x6cf…