-
I only modified t6 instead of t4, t4 t5 both work well for this model,but if we set the thread=6,will always trigger the problem on my XIAOMI14Pro(SM8650 8Gen3)
please check it for resolve
thanks~
…
-
### Describe the bug
Function __post_carryover_processing(chat_info: Dict[str, Any]) of chat.py in agentchat folder throw the above exception when running Google Gemini.
The cause of the problem w…
-
i saw this error :
```
value is not a valid list (type=type_error.list))
Evaluating: 33%|█████████████████████████████▎ | 1/3 [01:37 0
v…
-
GPU: 2 ARC CARD
running following example,
[inference-ipex-llm](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Pipeline-Parallel-Inference)
**for mistral and codell…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I don't use huggingface because of proxy issue. So, I tried to make local embeddin…
-
### System Info
python=3.11.7
### 🐛 Describe the bug
import pandas as pd
from pandasai import SmartDataframe
from langchain_community.llms import Ollama
# Sample DataFrame
data = {
'Mont…
-
I'm trying to make the model generate emojis using this command:
```
./run.sh $(./autotag local_llm) python3 -m local_llm.chat --api=mlc --model=NousResearch/Llama-2-7b-chat-hf --prompt="Repeat th…
-
The package complains about "torch" not being installed when it is most definitely installed.
(.env) chris@localhost:~$ pip install flash_attn
Collecting flash_attn
Using cached fla…
-
### System Info
ubuntu 20.04
tensorrt 10.0.1
tensorrt-cu12 10.0.1
tensorrt-cu12-bindings 10.0.1
tensorrt-cu12-libs 10.0.1
tensorrt-llm 0.10.…
-
no_gt retrieval metrics needs large amount of LLM processing.
So, use local LLM model to compute it.
+ ragas context precision need so much LLM calls. So, try to use tonic validate instead.