-
### Issue Description
I'm experiencing issues with downloading the **Llama3-8B-Lexi-Uncensored:fp16** model. The error message indicates a 503 Server Error: Service Temporarily Unavailable when attem…
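A 503 here usually points at transient load on the server side rather than a client problem, which is why retrying eventually succeeds. A generic retry-with-backoff sketch (the `pull_model` callable and the delay values are illustrative assumptions, not part of any specific CLI):

```python
import time

def retry_with_backoff(fn, retries=5, base_delay=1.0, retriable=(503,)):
    """Call fn(); on a retriable HTTP status, wait and try again.

    fn should raise an exception carrying a `status` attribute on failure,
    or return normally on success. Delays double on each attempt.
    """
    for attempt in range(retries):
        try:
            return fn()
        except Exception as exc:
            status = getattr(exc, "status", None)
            # Re-raise immediately on non-retriable errors or the last attempt.
            if status not in retriable or attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage: retry_with_backoff(pull_model)
```

This only papers over the symptom; if the 503 persists across many attempts with long delays, the service itself is down.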
-
### Check for existing issues
- [X] Completed
### Describe the bug / provide steps to reproduce it
Using OpenAI models with the assistant. Retrying sometimes works.
### Environment
Zed: v0.155.2 …
-
Discussion for this in #373 and #284.
The export script in sharktank was built specifically for llama 3.1 models and has some rough edges. In addition, it requires users to chain together CLI c…
-
I am using a docker container built using `docker run --gpus all --name paddlex -v $PWD:/paddle --shm-size=8g --network=host -it registry.baidubce.com/paddlex/paddlex:paddlex3.0.0b1-paddlepaddle3.0.0b…
-
**Description**
I have noticed a huge difference in memory usage for the runtime buffers and decoder between llama3 and llama3.1.
**Triton Information**
What version of Triton are you usin…
-
**Scenario:**
1. I have pushed my Power BI project to an Azure DevOps repository. The repo contains:
- `.Reports` folder
- `.SemanticModel` folder, which includes the `model.bim` file with SQL …
-
_See [mesheryctl Command Tracker](https://bit.ly/3dqXy1q) for current status of commands._
### Current Behavior
Inconsistent mesheryctl behavior. The second response is the valid response. Meshe…
-
### Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md)
### Where…
-
**Describe the bug**
The OpenAI API endpoint is "/v1/chat/completions", but the OVMS endpoint is "/v3/chat/completions".
Most existing applications do not allow the user to modify the prefix “**V1**” to "**…
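Until the endpoint prefixes align, one common workaround is to put a small path-rewriting reverse proxy in front of OVMS so clients can keep calling `/v1/...`. The rewrite itself is just a prefix swap; a minimal sketch (the helper name is made up, not an OVMS API):

```python
def rewrite_prefix(path, old="/v1/", new="/v3/"):
    """Map OpenAI-style /v1/ request paths to the OVMS /v3/ equivalent."""
    if path.startswith(old):
        return new + path[len(old):]
    return path  # leave non-matching paths (health checks, etc.) untouched

# rewrite_prefix("/v1/chat/completions") -> "/v3/chat/completions"
```

The same one-line rule can be expressed in nginx (`rewrite ^/v1/(.*)$ /v3/$1;`) or any proxy that supports path rewriting.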
-
/kind feature
**Describe the solution you'd like**
Currently, it is not possible to specify the path at which the downloaded model should be available in the model server container. The downloaded model…
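As a stopgap until such an option exists, the downloaded model can be exposed at the path the server expects by an init step that symlinks it into place before the server starts. A minimal sketch, with hypothetical source and target paths:

```python
import os

def expose_model_at(download_dir, target_path):
    """Symlink the downloaded model directory to the path the server expects.

    Assumes the init step runs in the same container/volume as the server
    and that download_dir already exists; paths are illustrative.
    """
    os.makedirs(os.path.dirname(target_path), exist_ok=True)
    if not os.path.exists(target_path):
        os.symlink(download_dir, target_path)

# Hypothetical usage in an init container:
# expose_model_at("/mnt/models/_download", "/opt/server/models/my-model")
```

A symlink avoids copying large weights; if the server refuses to follow symlinks, `shutil.copytree` is the heavier fallback.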