-
GPU: 2 Arc cards
Running the following example:
[inference-ipex-llm](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Pipeline-Parallel-Inference)
**for mistral and codell…
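For context, a rough sketch of what the linked example does when splitting a model across the two Arc cards. The `pipeline_parallel_stages` kwarg, model id, and `xpu` device strings below are recalled from the example's README rather than taken from this report, so verify them against the repo for your ipex-llm version:

```python
# Rough sketch of the linked Pipeline-Parallel-Inference example; kwargs may
# differ across ipex-llm versions, so treat this as an approximation.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder model id
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_4bit=True,            # low-bit weights so the model fits on Arc
    optimize_model=True,
    use_cache=True,
    pipeline_parallel_stages=2,   # split the layers across the 2 Arc cards
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

with torch.inference_mode():
    input_ids = tokenizer("What is AI?", return_tensors="pt").input_ids.to("xpu:0")
    output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```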
-
### 🚀 The feature, motivation and pitch
```
warnings.warn(
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, …
```
-
Hi, I tried the example code to see if the scraper works, but it always returns a validation error for the attribute `top`, which is supposed to be an array.
Here is my example code, a bit tweaked to u…
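For anyone debugging this, the error described has the shape of a pydantic list-field failure, where a scalar is supplied for a field declared as an array. A minimal, self-contained illustration; the `Result` model and its `top` field are stand-ins for whatever schema the scraper validates against, not the library's actual model:

```python
from typing import List
from pydantic import BaseModel, ValidationError

# Stand-in model: `top` is declared as an array, mirroring the kind of
# schema the scraper validates against internally.
class Result(BaseModel):
    top: List[str]

try:
    Result(top="a single value")  # scalar where an array is expected
except ValidationError as e:
    print(e)  # pydantic v2 reports "Input should be a valid list"
```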
-
### 🐛 Describe the bug
Hi,
We use `torch.compile` to run GPTJ3.6B model training on our GPU platforms, but we got some dynamo errors and the process aborted. The error happens when runnin…
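For triage, the failing entry point is the same as compiling a training step with TorchDynamo. A minimal, self-contained sketch of that pattern with a tiny stand-in module (not the actual GPT-J training script):

```python
import torch
import torch.nn as nn

# Tiny stand-in module; the report above hits the same torch.compile
# (dynamo + inductor) entry point, just with a GPT-J-style model.
model = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64))
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
compiled = torch.compile(model)  # dynamo errors surface on the first call

x = torch.randn(8, 64)
loss = compiled(x).pow(2).mean()
loss.backward()
opt.step()
opt.zero_grad()
```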
-
### System Info
I am experimenting with TRT-LLM and `flan-t5` models. My goal is simple: build engines with different configurations and tensor-parallelism settings, then review performance. I have a DGX syst…
-
Can the ollama URL be configured to point to a remote box?
Or can an SSH tunnel be used to make the remote ollama appear local?
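Both options work. Here is a minimal sketch using the official `ollama` Python client, assuming the remote box runs `ollama serve` on the default port 11434; hostnames are placeholders:

```python
from ollama import Client

# Option 1: point the client straight at the remote box.
client = Client(host="http://remote-box:11434")

# Option 2: tunnel first (`ssh -L 11434:localhost:11434 user@remote-box`),
# after which the remote server appears local at http://localhost:11434.
resp = client.chat(
    model="llama3",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp["message"]["content"])
```

Many integrations also honor the `OLLAMA_HOST` environment variable, so pointing it at the remote URL often works without any code changes.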
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
How do I connect to the Neptune database through llama_index from my local machine?
**Bel…
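Not the asker's code, but a minimal sketch of the usual pattern, assuming the `llama-index-graph-stores-neptune` integration package and its `NeptuneDatabaseGraphStore` class (names and constructor parameters as I recall them from the integration docs; verify against your llama_index version). Note that Neptune is only reachable inside its VPC, so a local machine usually needs a VPN, bastion host, or SSH tunnel first:

```python
from llama_index.core import (
    KnowledgeGraphIndex,
    SimpleDirectoryReader,
    StorageContext,
)
# Assumed integration package: pip install llama-index-graph-stores-neptune
from llama_index.graph_stores.neptune import NeptuneDatabaseGraphStore

# Placeholder endpoint; your machine must be able to reach the cluster's VPC.
graph_store = NeptuneDatabaseGraphStore(
    host="my-cluster.cluster-xxxxxxxx.us-east-1.neptune.amazonaws.com",
    port=8182,
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)

documents = SimpleDirectoryReader("./data").load_data()
index = KnowledgeGraphIndex.from_documents(
    documents, storage_context=storage_context
)
```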
-
### Describe the issue
Asking which version of pyautogen will support `register_for_llm` later, because I'm using the local model chatGLM, which needs openai. The fragment in question appears to be the docs' currency-calculator tool:
`def exchange_rate(base_currency, quote_currency) -> float:`
`    if base_currency == quote_currency:`
…
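For reference, a minimal sketch of the decorator pair as it works in pyautogen 0.2.x (which targets `openai>=1`); the `exchange_rate` body is a cut-down version of the currency-calculator example this fragment appears to come from, and the endpoint config is a placeholder:

```python
from autogen import AssistantAgent, UserProxyAgent

# Placeholder endpoint config; point it at your own model server.
llm_config = {
    "config_list": [
        {"model": "chatglm3", "base_url": "http://localhost:8000/v1", "api_key": "NULL"}
    ]
}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", human_input_mode="NEVER")

# register_for_llm advertises the tool schema to the model;
# register_for_execution lets user_proxy actually run the call.
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Currency exchange calculator.")
def exchange_rate(base_currency: str, quote_currency: str) -> float:
    if base_currency == quote_currency:
        return 1.0
    raise ValueError(f"Unknown currency pair {base_currency}/{quote_currency}")
```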
-
### Describe the bug
When providing an assistant ID for `GPTAssistantAgent`, the code path at line 117 always has a `None` value for the variables `instructions` and `specified_tools`. This is because the…
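For reproduction, a minimal usage sketch of the path described; the assistant ID and config values are placeholders, and whether `instructions` survives retrieval of an existing assistant is exactly what this report questions:

```python
from autogen.agentchat.contrib.gpt_assistant_agent import GPTAssistantAgent

assistant = GPTAssistantAgent(
    name="my_assistant",
    # Per this report, this ends up None when an existing assistant_id is
    # supplied, because of the code path at line 117.
    instructions="You are a helpful assistant.",
    llm_config={
        "config_list": [{"model": "gpt-4", "api_key": "sk-placeholder"}],
        "assistant_id": "asst_XXXXXXXX",  # placeholder existing assistant
    },
)
```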
-
A `ToolCall` is not generated from the response of the llama 3.1 model served by LM Studio when using the langchain framework and connecting through `ChatOpenAI`.
The same tool call works fine with ollama for the same …
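For comparison, a minimal sketch of the setup described: `ChatOpenAI` pointed at LM Studio's OpenAI-compatible server (default local URL; the API key is a placeholder LM Studio ignores, and the model name must match what is loaded). `tool_calls` comes back empty unless the server returns OpenAI-style `tool_calls` in its response, which is where LM Studio and ollama can behave differently:

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"Sunny in {city}"

llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's local server
    api_key="lm-studio",                  # placeholder; LM Studio ignores it
    model="llama-3.1-8b-instruct",        # must match the loaded model
)
llm_with_tools = llm.bind_tools([get_weather])

msg = llm_with_tools.invoke("What's the weather in Paris?")
print(msg.tool_calls)  # [] when the server answered with plain text
```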