-
**Describe the bug**
`RecursionError: maximum recursion depth exceeded while getting the str of an object`
**Expected behavior**
I want to convert a LLaMA model into ONNX and then benchmark i…
-
While finalizing the [CM-MLPerf BERT inference benchmark tutorial for SCC'23](https://github.com/mlcommons/ck/blob/master/docs/tutorials/scc23-mlperf-inference-bert.md), here are a few missing things that we c…
-
**Describe the bug**
When running the steps from the Ultralytics YOLOv8 tutorial: https://github.com/neuralmagic/sparseml/blob/main/integrations/ultralytics-yolov8/tutorials/sparse-transfer-learning.…
-
Hi.
The paper describes 8-bit quantization combined with pruning, which is fantastic.
My question: has any research been done on 4-bit quantization? Since GPU memory is notoriously expensive, 4…
-
## UI v2.0 Improvements
- [x] Proxy SSO Login - Add OpenID, Azure AD https://github.com/BerriAI/litellm/issues/1658
- [x] [Feature-UI]: UI Static Web App on Proxy
- [ ] [Feature-UI]: Show users t…
-
Tracking list of new models / endpoints / providers we plan on adding this week.
Comment any new models/providers/endpoints you want us to add below 👇
- [ ] Predibase lorax support - https://git…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
## Budget Management
- [x] Add testing for `proxy_server.track_cost_callback` - writing to SpendLogs Table https://github.com/BerriAI/litellm/issues/1584
- [x] Allow Users to Set Budgets per user…
-
The [MLCommons taskforce on automation and reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md) is helping the community, vendors and submitters check if it is possible to r…
-
**Describe the bug**
Error converting Mistral to ONNX
**Expected behavior**
```
!pip install virtualenv
!virtualenv myenv
!source /content/myenv/bin/activate
!git clone https://github.com/n…