-
### Answers checklist.
- [X] I have read the documentation [ESP-IDF Programming Guide](https://docs.espressif.com/projects/esp-idf/en/latest/) and the issue is not addressed there.
- [X] I have up…
-
```py
from unsloth import FastLanguageModel
from unsloth import is_bfloat16_supported
import torch
from unsloth.chat_templates import get_chat_template
from trl import SFTTrainer
from transform…
```
-
Title: Refactor Backend Folder Structure for Enhanced Maintainability and Scalability
Description: The current backend folder structure can be optimized to improve code maintainability, scalability…
-
I'd like to explore the best approach for managing multi-client connections in both single and multi-GPU environments.
Often, GPUs are underutilized by a single client, especially when smaller mode…
-
I'm currently using this model for inference, and I would like to know how to generate inference results in batch mode. Specifically, I'm trying to avoid processing inputs one by one and instead proce…
-
When I ran the quantization code for llama3-70b-instruct, it succeeded, but when I loaded the quantized model with vLLM, I got a warning: `awq quantization is not fully optimized yet. The speed can be slower …
-
Hi, currently the Pattern Rewriter/Matcher does not match contrib ops.
E.g.:
def match(op,x,w,b):
x = op.Conv(x,w,b)
msft = onnxscript.values.Opset("com.microsoft", 1)
x…
-
This is a "living issue". Editing is appreciated.
### Context:
- Most prominent benchmark for embedding models: https://huggingface.co/spaces/mteb/leaderboard
- We can choose to index the pdf dat…
-
**Describe the bug**
After running the command "python stable_diffusion.py --provider cuda --optimize --model_id stabilityai/stable-diffusion-2-1" in Olive/examples/stable_diffusion/ directory.
floa…