-
## 🐛 Bug
When running Phi-3.5-mini-instruct, Mistral-Nemo-Base-2407, and Qwen2.5-7B-Instruct with NeMo + ThunderFX and constant folding enabled, we get the following error:
> File ".1546", line 7, in forwar…
-
GPU: 2 Intel Arc cards
Running the following example:
[inference-ipex-llm](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Pipeline-Parallel-Inference)
**for mistral and codell…
-
> > Specify the local folder you have the model in instead of a HF model ID. If you have all the necessary files and the model is using a supported architecture, then it will work.
> > …
-
The no_gt retrieval metrics need a large amount of LLM processing,
so use a local LLM model to compute them.
+ ragas context precision needs too many LLM calls, so try using tonic validate instead.
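To make the cost concrete, here is a minimal sketch of what a context-precision-style metric computes (rank-weighted precision over retrieved contexts, in the spirit of ragas' context precision): every retrieved context needs one relevance judgment, which is why the LLM call count grows so fast. The `judge` callable is a hypothetical stand-in for a local LLM; the stub below is for illustration only.

```python
from typing import Callable, List

def context_precision(
    question: str,
    contexts: List[str],
    judge: Callable[[str, str], bool],
) -> float:
    """Rank-weighted precision of retrieved contexts: one judge (LLM)
    call per context, earlier relevant contexts weighted higher."""
    score, relevant_so_far = 0.0, 0
    for rank, ctx in enumerate(contexts, start=1):
        if judge(question, ctx):             # one LLM call per context
            relevant_so_far += 1
            score += relevant_so_far / rank  # precision@rank
    return score / relevant_so_far if relevant_so_far else 0.0

# Stub judge for illustration; swap in a local-LLM relevance check.
judge = lambda q, c: q.split()[0].lower() in c.lower()
print(context_precision("Paris capital?", ["Paris is in France", "Berlin"], judge))
```

With N contexts per question this is N judge calls per sample, which is the processing cost the note above is trying to avoid by using a local model.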
-
### Question Validation
- [X] I have searched both the documentation and Discord for an answer.
### Question
How do I connect to the Neptune database through llama_index on my local machine?
**Bel…
-
For #4 (Milestone: 1)
Contribute DevOps Roadmap data in the format of [frontend.json](https://github.com/Open-Source-Chandigarh/sadakAI/blob/main/finetune_data/frontend_data.json); the file should be…
-
Can the Ollama URL be configured to point to a remote box?
Or try using an SSH tunnel to make the remote Ollama appear to be local.
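Both approaches can be sketched as shell config; the hostname, user, and port are placeholders (11434 is Ollama's default port):

```shell
# Option 1: point the Ollama client at a remote box via OLLAMA_HOST
# (hostname is a placeholder).
export OLLAMA_HOST=http://remote-box:11434

# Option 2: SSH tunnel so the remote Ollama appears local on port 11434.
ssh -N -L 11434:localhost:11434 user@remote-box
# Then use http://localhost:11434 as usual.
```

The tunnel option needs no client-side configuration changes, since everything keeps talking to localhost.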
-
Loosely defined:
- Tools: the bridge from LLM output to an action, like reads/writes to a database. Any function can be a tool.
- Agent: Executes a workflow given a task, restricted by a fixed set…
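The two definitions above can be sketched in a few lines; the `tool_name:arg` output format and the helper names are hypothetical, chosen only to show a function acting as a tool and an agent restricted to a fixed tool set.

```python
from typing import Callable, Dict

# Any function can be a tool: the bridge from LLM output to an action.
def read_db(key: str) -> str:
    store = {"greeting": "hello"}  # stand-in for a real database
    return store.get(key, "<missing>")

# The agent's fixed set of tools.
TOOLS: Dict[str, Callable[[str], str]] = {"read_db": read_db}

def agent_step(llm_output: str) -> str:
    """One hypothetical agent step: parse 'tool_name:arg' from the
    model's output and dispatch only within the fixed tool set."""
    name, _, arg = llm_output.partition(":")
    if name not in TOOLS:  # restricted: unknown tools are rejected
        return f"unknown tool: {name}"
    return TOOLS[name](arg)

print(agent_step("read_db:greeting"))  # → hello
```

The restriction to a fixed dispatch table is what keeps the agent's action space bounded, regardless of what the LLM emits.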
-
### System Info
I am experimenting with TRT LLM and `flan-t5` models. My simple goal is to build engines with different configurations and tensor parallelism, then review performance. Have a DGX syst…
-
A ToolCall is not generated from the response of the llama 3.1 model served by LM Studio when using the LangChain framework connected through ChatOpenAI.
The same tool call works fine with Ollama for the same …
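One way to narrow this down is to inspect the raw assistant message rather than the framework's parsed output: if the `tool_calls` field is empty, the model emitted plain text instead of a structured tool call. A minimal sketch, assuming the message follows the OpenAI chat-completions shape (the `get_weather` example is made up):

```python
import json
from typing import Any, Dict, List

def extract_tool_calls(message: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return structured tool calls from an OpenAI-style chat message.
    An empty result means the server returned no tool_calls, i.e. the
    model answered in plain text instead of calling a tool."""
    calls = []
    for tc in message.get("tool_calls") or []:
        fn = tc.get("function", {})
        calls.append({
            "name": fn.get("name"),
            # arguments arrive as a JSON string in this format
            "args": json.loads(fn.get("arguments") or "{}"),
        })
    return calls

# Example assistant message shaped like an OpenAI chat-completions reply.
msg = {"role": "assistant", "content": None,
       "tool_calls": [{"id": "1", "type": "function",
                       "function": {"name": "get_weather",
                                    "arguments": '{"city": "Paris"}'}}]}
print(extract_tool_calls(msg))
```

Comparing this raw field between the LM Studio and Ollama responses shows whether the difference is in the model/server output or in the framework's parsing.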