-
### What behavior of the library made you think about the improvement?
I need to install torch, transformers, accelerate, etc. even if I want to use outlines only with the llamacpp backend.
Are these d…
-
### What happened?
If creating a llama model in Python code, you can specify `n_gpu_layers=-1` so that all layers are offloaded to the GPU (see the example below). When starting the llama.cpp server using the doc…
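For reference, a minimal sketch of the Python-side usage described above, using llama-cpp-python. The model path is a placeholder (an assumption, not from the issue), and the load is guarded so the sketch degrades gracefully when no GGUF file is present:

```python
# Hedged sketch: offloading all layers to the GPU with llama-cpp-python.
# n_gpu_layers=-1 means "offload every layer"; MODEL_PATH is a placeholder.
import os

MODEL_PATH = "model.gguf"  # hypothetical local GGUF file

if os.path.exists(MODEL_PATH):
    from llama_cpp import Llama  # pip install llama-cpp-python

    # -1 offloads all layers to the GPU (requires a GPU-enabled build)
    llm = Llama(model_path=MODEL_PATH, n_gpu_layers=-1)
    out = llm("Hello", max_tokens=8)
    print(out["choices"][0]["text"])
else:
    print("model.gguf not found; skipping load")
```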
-
**The bug**
When using `models.LlamaCpp`, the selected tokenizer is always gpt2 (this can be seen in the output when the `verbose=True` arg is set). I have pasted the dumped KV metadata keys:
```
llama_mod…
```
-
Hi,
I am unable to import LlamaCpp in IPEX.
CODE: `from ipex_llm.langchain.llms import LlamaCpp`
ERROR:
Cell In[5], line 1
----> 1 …
-
### Start Date
_No response_
### Implementation PR
_No response_
### Reference Issues
_No response_
### Summary
Hi, I am trying to load this using Llama.CPP HTTP s…
-
### Describe the problem you're trying to solve
Proof of Concept (PoC) a generic inference container that uses Triton as the inference engine and can download and utilize a ModelKit as efficiently as …
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
Normally, if one were starting a llama.cpp server, one would specify the chat template a…
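For context, this is roughly how a chat template is passed when launching the server directly. A hedged sketch: the model path and template name are placeholders, and the flag assumes a recent llama.cpp build:

```shell
# llama.cpp's server accepts a named built-in chat template at startup
# (e.g. chatml, llama2, llama3); model path here is a placeholder.
./llama-server -m ./model.gguf --chat-template chatml
```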
-
## Goal
- We should have a more semantic naming format for Cortex engines
- e.g. `llamacpp-engine` instead of `cortex.llamacpp`
## Tasklist
- Discussion: https://github.com/janhq/cortex.cpp/discussio…
-
**Pages**
All of the pages
**Success Criteria**
Updated and simplified
## Tasklist
- [ ] cortex.cpp README
- [x] https://github.com/janhq/cortex.so/issues/82
- [x] https://github.com/janhq/…
-
**The bug**
A string containing certain Unicode characters causes an exception.
Likely because `歪` is a multi-token character for this tokenizer.
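As a quick illustration of why this character is awkward for byte-level tokenizers (plain Python, no llama.cpp involved):

```python
# '歪' (U+6B6A) occupies three bytes in UTF-8, so a byte-level BPE tokenizer
# can split it across multiple tokens rather than mapping it to a single one.
ch = "歪"
encoded = ch.encode("utf-8")
print(len(encoded))  # 3
print(encoded)       # b'\xe6\xad\xaa'
```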
```
llama3.engine.tokenizer('歪'.encode('utf8')…
```