-
Hi,
I am working on [5_mins_rag_no_gpu](https://github.com/NVIDIA/GenerativeAIExamples/tree/main/examples/5_mins_rag_no_gpu).
I am facing this error: AttributeError: 'NVIDIAEmbeddings' object has no attr…
-
Thanks for this project, it looks really promising.
I just started using it, and here's what I found, using this repo as an example:
```
> gt 'data_file_path' --context 0 --max-results 3
─────────────…
```
-
Hi team, I would like to know how to load and dump a sharded embedding collection via `state_dict`. Basically:
1. How many files should I save? Should each rank have an exclusive sharding file or o…
-
cubiq recently released a new version of the ComfyUI ipadapter. He walked through the various updates in https://www.youtube.com/watch?v=_JzDcgKgghY. I noticed some features are very handy and should be port…
-
**Description**
Hi Team,
I tried to configure my ensemble model with reshape: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html#resha…
-
Instead of integrating Together, OpenRouter, Eden, etc. separately, why not just integrate APIpie to access them all from one place? It also provides:
- The most affordable, reliable, and fastest AI av…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### OS
Linux
### GPU
amd
### VRAM
8GB
### What version did you experience this issue on?
3…
-
If the "chat_model" Setting (or "embedding_model", "summarization_model", etc.) doesn't exist at runtime, services may get `nil` back when they call `Setting.chat_model`. In this case, the app will die …
-
Hi All,
Thank you for your amazing work.
We have an encoder-decoder model that we want to run using TensorRT-LLM. We made an architectural modification by pooling the encoder's output dim using stacked MLP…
-
This is a challenging issue that I've been working on. First, here is my entire script:
SCRIPT
```
import shutil
import yaml
import gc
from langchain_community.docstore.document import Do…