-
Currently, in this R version of `bayesbench`, inference engines are defined like this:
```R
stan_vb
```
-
Hi,
Are you planning to make textgrad LLM calls asynchronous?
I tried to start adding asynchronous methods to make at least the evaluation calls and inference (everything that is forward) asynchrono…
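A minimal sketch of the pattern I have in mind, assuming an async wrapper around a single LLM call (the names here are hypothetical, not textgrad's actual API):
```python
import asyncio

async def acall_llm(prompt: str) -> str:
    # Hypothetical async wrapper around one LLM call; in practice this would
    # await an async client (e.g. an OpenAI-style async SDK).
    await asyncio.sleep(0.1)  # stand-in for network latency
    return f"response to: {prompt}"

async def evaluate_batch(prompts: list[str]) -> list[str]:
    # Evaluation/forward calls are independent of each other,
    # so they can be issued concurrently instead of one by one.
    return await asyncio.gather(*(acall_llm(p) for p in prompts))

if __name__ == "__main__":
    print(asyncio.run(evaluate_batch(["a", "b", "c"])))
```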
-
There are a number of issues with the current TRT acceleration path in MONAI:
- For some networks it's only practical/possible to trace/export certain sub-module, like image_encoder. Current solution r…
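For illustration, a minimal sketch of exporting only a sub-module to ONNX, which is the kind of partial export described above (the toy network and shapes below are my own placeholders, not MONAI code):
```python
import torch
import torch.nn as nn

class ToyNet(nn.Module):
    """Stand-in for a larger network whose image_encoder is the only part we export."""
    def __init__(self):
        super().__init__()
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1)
        )
        self.head = nn.Linear(16, 10)

    def forward(self, x):
        return self.head(self.image_encoder(x).flatten(1))

model = ToyNet().eval()
example_input = torch.randn(1, 3, 224, 224)

# Export just the sub-module; a TRT engine can then be built from this ONNX file.
torch.onnx.export(
    model.image_encoder,
    example_input,
    "image_encoder.onnx",
    input_names=["image"],
    output_names=["features"],
    dynamic_axes={"image": {0: "batch"}},
)
```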
-
### System Info
NVIDIA GeForce 4090 GPU
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported …
-
### Jan version
0.5.4 AppImage
### Describe the Bug
I can't start any local models on my machine after the latest update. The previous version worked fine with various models.
### Steps to Reprodu…
-
Hello,
I am running some latency benchmarks using TensorRT-LLM on a Mistral 7B Instruct v0.3 model. My hope was that at small batch sizes the overall inference latency would not be impacted as much,…
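As a rough sketch of the kind of measurement (the `generate` call below is a placeholder standing in for the actual TensorRT-LLM engine, not its real API):
```python
import time

def generate(prompts):
    """Placeholder for the real engine call; toy latency model for illustration only."""
    time.sleep(0.05 + 0.005 * len(prompts))
    return ["output"] * len(prompts)

for batch_size in (1, 2, 4, 8):
    prompts = ["Explain KV caching."] * batch_size
    start = time.perf_counter()
    generate(prompts)
    elapsed = time.perf_counter() - start
    # If batching is efficient, total latency at small batch sizes should stay
    # close to the batch-of-1 latency.
    print(f"batch={batch_size:2d}  total={elapsed * 1000:6.1f} ms  "
          f"per-request={elapsed * 1000 / batch_size:6.1f} ms")
```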
-
**Is your feature request related to a problem? Please describe.**
From a tooling standpoint, we need the ability to discover all running LLM endpoints, so we can pick one and use it as an AI assista…
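A minimal sketch of the kind of discovery we have in mind, assuming OpenAI-compatible servers that expose a `/v1/models` listing (the ports below are just common local defaults, not an exhaustive registry):
```python
import requests

CANDIDATE_BASE_URLS = [
    "http://localhost:8000/v1",   # e.g. a vLLM server on its default port
    "http://localhost:11434/v1",  # e.g. Ollama's OpenAI-compatible endpoint
    "http://localhost:1234/v1",   # e.g. LM Studio's default local server
]

def discover_endpoints():
    """Probe candidate base URLs and return those answering the /models listing."""
    found = []
    for base in CANDIDATE_BASE_URLS:
        try:
            resp = requests.get(f"{base}/models", timeout=1)
            if resp.ok:
                models = [m.get("id") for m in resp.json().get("data", [])]
                found.append((base, models))
        except requests.RequestException:
            continue
    return found

if __name__ == "__main__":
    for base, models in discover_endpoints():
        print(base, models)
```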
-
I'm not quite familiar with the Transformer model. There are more steps involved than in other models because of the Encoder and Decoder; for example, the last encoder block's output needs to be used as the input for the nex…
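For what it's worth, a minimal PyTorch sketch of that wiring (my own illustration, not from any particular repo): the last encoder block's output, `memory`, is fed into the cross-attention of every decoder layer.
```python
import torch
import torch.nn as nn

d_model = 512
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True), num_layers=6
)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True), num_layers=6
)

src = torch.randn(2, 10, d_model)  # (batch, source length, d_model), already embedded
tgt = torch.randn(2, 7, d_model)   # (batch, target length, d_model), already embedded

memory = encoder(src)       # output of the final encoder block
out = decoder(tgt, memory)  # each decoder layer cross-attends to the same memory
print(out.shape)            # torch.Size([2, 7, 512])
```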
-
### System Info
py3.10
infinity-emb 0.0.55
Running with the optimum engine fails:
```
INFO 2024-09-13 15:17:02,874 datasets INFO: PyTorch version 2.4.0 available. …
```
-
## Use case
Following up on #6805
It would be useful to be able to create custom suggestions in the platform, rather than asking the OpenCTI developers to include new ones on a case by case basi…