-
### System information
Currently, all new ONNX ops must be defined through the onnx/defs/ C++ API. If people want to reuse this mechanism to define their own domain and custom ops, it is not convenient. They code co…
-
### Search before asking
- [X] I have searched the HUB [issues](https://github.com/ultralytics/hub/issues) and found no similar bug report.
### HUB Component
Training
### Bug
![image]…
-
## Description
What is the solution for concurrent AI model inference in DJL? Can multiple threads access the same model at the same time? Is Nvidia Triton supported?
Will this change the current API? How…
-
Hello! First of all, great job with this inference engine! Thanks a lot for your work!
Here's my issue: I have run vllm with both a Mistral instruct model and its AWQ quantized version. I've quant…
-
I have a Tier 2 subscription for Anthropic with these limits:

| Model | Requests per Minute | Tokens per Minute | Tokens per Day |
| --- | --- | --- | --- |
| Claude 3.5 Sonnet 2024-10-22 | 1,000 | …
-
Collection could be carried out in parallel to inference.
Collection Queue:
1 - collect the comments by subreddit (can be multithreaded by subreddit) -> inference queue
2 - carry out edge disco…
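The parallel design above (collector threads feeding an inference queue while inference runs concurrently) can be sketched with Python's standard library. This is only a sketch: the subreddit names and the `collect_comments` helper are hypothetical placeholders, and `.upper()` stands in for real model inference.

```python
import queue
import threading

inference_queue = queue.Queue()
results = []
results_lock = threading.Lock()

def collect_comments(subreddit):
    # Hypothetical collector; in practice this would call the Reddit API.
    return [f"{subreddit}-comment-{i}" for i in range(3)]

def collector(subreddit):
    # Stage 1: collect comments for one subreddit and hand them off.
    # One thread per subreddit gives the "multithreaded by subreddit" fan-out.
    for comment in collect_comments(subreddit):
        inference_queue.put(comment)

def inference_worker():
    # Stage 2: consume comments as they arrive, in parallel with collection.
    while True:
        comment = inference_queue.get()
        if comment is None:  # sentinel: collection is finished
            inference_queue.task_done()
            break
        with results_lock:
            results.append(comment.upper())  # placeholder for model inference
        inference_queue.task_done()

subreddits = ["python", "machinelearning"]  # hypothetical subreddit list
collectors = [threading.Thread(target=collector, args=(s,)) for s in subreddits]

worker = threading.Thread(target=inference_worker)
worker.start()
for t in collectors:
    t.start()
for t in collectors:
    t.join()
inference_queue.put(None)  # all collectors done; tell the worker to stop
worker.join()
```

Because the worker pulls from the queue while collectors are still producing, inference overlaps collection instead of waiting for it to finish.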
-
### Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits of both this extension and the webui
### Have you updated WebUI and this exte…
-
### Self Checks
- [X] This template is only for bug reports. For questions, please visit [Discussions](https://github.com/fishaudio/fish-speech/discussions).
- [X] I have thoroughly reviewed the proj…
-
It has two endpoints, `/v1/completions` and `/v1/chat/completions`. The former doesn't work even with a custom channel; how should I add a channel?
```
curl --request POST \
--url https://api.fireworks.ai/inference/v1/completions \
-H 'Accept: application/json' \
-H…
-