-
We are using Triton Inference Server and are currently facing throughput bottlenecks with LLM inference. I saw in a public video that NVIDIA has optimized LLM serving by supporting `In…
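For readers hitting the same throughput bottleneck, a common first lever in Triton is dynamic batching. A minimal sketch of a model's `config.pbtxt`, where the batch sizes and queue delay are illustrative placeholders rather than recommendations:

```
max_batch_size: 8
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```

Whether this helps depends on request arrival patterns; for LLM decoding specifically, batching at the request level behaves differently than for fixed-shape models.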
-
Hello authors,
Thanks for sharing this fantastic work. I would like to ask where this dataset came from; could you share a link or the data?
"/lustre/fsw/portfolios/nvr/projects/nvr_elm_llm/dataset/video…
-
**Goal**: To collaborate with more RL researchers on ScrimBrain
**Problem**: @wkwan writes like a zoomer
**Solution**: Generate a scientific paper for ScrimBrain using LLMs and multimodal model…
wkwan updated
3 months ago
-
I'm trying to understand this in the context of other works in the ecosystem. For example, I'm interested in video. For the video encoder, there are the LoRA-tuned and the fully fine-tuned variants; can I use the em…
-
### Short description of current behavior
mindsdb.integrations.libs.ml_exec_base.MLEngineException: [litellm/litellm_handler_messages]: BadRequestError: OpenAIException - Error code: 400 - {'error': …
-
```
got prompt
WARNING: AnyNode.IS_CHANGED() got an unexpected keyword argument 'prompt'
RUN-1 01-ai_Yi-1.5-9B-Chat-16K-8_0bpw_exl2 Take the input and multiply by 5 1 2
Finding Nodes in Work…
```
-
Why does the Whisper model need 17 GB of video memory when faster-whisper only needs 4 GB? Also, I haven't found a way to quantize Whisper to an integer format. Is that not supported yet? This video memory occupie…
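As a rough sanity check on those numbers (assuming ~1.55B parameters for the large Whisper checkpoint, its published size), weight memory alone scales linearly with precision, which is why an int8 runtime such as faster-whisper fits in far less VRAM:

```python
# Back-of-the-envelope weight memory for a ~1.55B-parameter Whisper large model.
# The 17 GB seen in practice also includes activations, caches, and framework
# overhead; this only counts the raw weights.
PARAMS = 1.55e9

def weight_gb(bytes_per_param: float) -> float:
    """Memory (GB) occupied by the weights alone at a given precision."""
    return PARAMS * bytes_per_param / 1e9

print(f"fp32: {weight_gb(4):.1f} GB")  # ~6.2 GB
print(f"fp16: {weight_gb(2):.1f} GB")  # ~3.1 GB
print(f"int8: {weight_gb(1):.1f} GB")  # ~1.6 GB
```

The gap between ~6 GB of fp32 weights and 17 GB observed suggests most of the extra memory is runtime overhead, not the model itself.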
-
**Describe the bug**
I am training a video-LLM model, where I encode long videos with a varying number of forward passes to avoid OOM issues. I would like to use ZeRO-3, but using a part of the model a…
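Independent of the ZeRO-3 question, the chunked-encoding pattern described above can be sketched framework-agnostically; the chunk size and encoder below are hypothetical placeholders, not the reporter's actual code:

```python
from typing import Callable, List, Sequence

def encode_in_chunks(
    frames: Sequence,
    encoder: Callable[[Sequence], List[float]],
    chunk_size: int,
) -> List[List[float]]:
    """Run the encoder over fixed-size chunks of frames so each forward pass
    stays within memory; longer videos simply require more passes."""
    outputs = []
    for start in range(0, len(frames), chunk_size):
        outputs.append(encoder(frames[start:start + chunk_size]))
    return outputs

# Toy usage: a stand-in "encoder" that averages its chunk.
feats = encode_in_chunks(list(range(10)), lambda c: [sum(c) / len(c)], 4)
print(feats)  # [[1.5], [5.5], [8.5]]
```

The variable number of forward passes this produces is exactly what can interact badly with ZeRO-3's expectation that all ranks execute the same number of collective operations.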
-
### Software
Desktop Application
### Operating System / Platform
Windows
### Your Pieces OS Version
Pieces OS 10.1.10
### Early Access Program
- [ ] Yes, this is related to an Early Access Prog…
-
Hello everyone!
Has anyone tried to run it on a local setup?
I tried replacing the `config_list` with a local one:
# create the configuration of your environment
config_list = [
{
…
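For anyone attempting the same thing, a hedged sketch of a `config_list` entry pointing at a local OpenAI-compatible server; the endpoint URL, model name, and key are placeholders to adjust for your own setup:

```python
# Illustrative config_list for a local OpenAI-compatible endpoint
# (e.g. a llama.cpp or vLLM server); all values are placeholders.
config_list = [
    {
        "model": "local-model",                  # whatever name your server exposes
        "base_url": "http://localhost:8000/v1",  # local OpenAI-compatible API
        "api_key": "not-needed",                 # many local servers ignore the key
    }
]
```

The key point is that the local server must speak the OpenAI chat-completions protocol; otherwise swapping the config list alone will not be enough.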