-
### System Info
Running the Docker image, version 2.4.0, with EETQ quantization.
Model: microsoft/Phi-3.5-mini-instruct
```
{"model_id":"microsoft/Phi-3.5-mini-instruct","model_sha":"af0dfb8029e8a7454…
```
-
Management of original fonts, generation of @font-face versions (EOT, WOFF, TTF, SVG) and the corresponding CSS syntax (like Font Squirrel), and compression
-
Since we were running the model locally, we provided the path to the Llama model that we downloaded, but how can we actually do this when deploying it on Hugging Face?
Can anyone please help?…
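A minimal sketch of the two usual options, assuming the `transformers`/`huggingface_hub` stack; the repo and file names below are placeholders rather than the asker's actual model:

```python
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Option 1: load directly by Hub repo id; weights are fetched and cached automatically.
# "meta-llama/Llama-2-7b-chat-hf" is a placeholder; use your own repo id.
model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Option 2: if the code expects a local file (e.g. a GGUF build of the model),
# download it from the Hub at startup and pass the returned path instead.
local_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGUF",   # placeholder repo
    filename="llama-2-7b-chat.Q4_K_M.gguf",    # placeholder file
)
```

Because the weights are fetched at startup and cached, they do not need to be bundled with the deployment itself.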
-
Hi team,
I would like to use the LogitsPostProcessor in the [C++ Executor API](https://github.com/NVIDIA/TensorRT-LLM/blob/main/cpp/include/tensorrt_llm/executor/executor.h) to control the generatio…
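As a language-agnostic sketch of what such a callback does (this is not the TensorRT-LLM C++ signature), a logits post-processor edits the next-token logits between the forward pass and sampling, for example to ban specific token ids:

```python
import math

def ban_tokens_post_processor(logits, banned_token_ids):
    """Generic sketch: set banned token logits to -inf so they can never be sampled.
    The real Executor API passes a tensor plus request/beam context; this only
    shows the core idea of editing logits before the sampler runs."""
    for token_id in banned_token_ids:
        logits[token_id] = -math.inf
    return logits

# Toy usage with plain Python floats standing in for a logits tensor.
logits = [0.1, 2.3, -0.5, 1.7]
print(ban_tokens_post_processor(logits, banned_token_ids={1, 3}))
# -> [0.1, -inf, -0.5, -inf]
```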
-
That'd be cool, wouldn't it? You know it would be.
-
- [ ] I have checked the [documentation](https://docs.ragas.io/) and related resources and couldn't resolve my bug.
**Describe the bug**
Tried test set generation with the Together API and Hugging Face…
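For reference, a minimal sketch of one way to wire a Together (OpenAI-compatible) endpoint and Hugging Face embeddings into the generator through the LangChain wrappers; the class names follow a recent ragas API and the model names are placeholders, so the exact calls may differ from the setup that hit this bug:

```python
import os

from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_openai import ChatOpenAI
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.llms import LangchainLLMWrapper
from ragas.testset import TestsetGenerator

# Together exposes an OpenAI-compatible endpoint; the model name is a placeholder.
generator_llm = LangchainLLMWrapper(ChatOpenAI(
    base_url="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
    model="meta-llama/Llama-3-70b-chat-hf",
))
generator_embeddings = LangchainEmbeddingsWrapper(
    HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
)

# Tiny stand-in corpus; real runs would load documents with a LangChain loader.
documents = [Document(page_content="Ragas can synthesize evaluation test sets from documents.")]

generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)
testset = generator.generate_with_langchain_docs(documents, testset_size=5)
print(testset.to_pandas())
```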
-
### Feature description
I assumed the folder option worked this way, but it seems to apply the listed faces to a single image.
Is it possible to add a checkbox under the folder option, so that w…
-
I am exploring the development of a Retrieval-Augmented Generation (RAG) application for Android and am considering using local language models from Hugging Face’s TFLite model collection. I am looking for guid…
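A minimal, dependency-free sketch of the retrieve-then-generate plumbing such an app needs, written as a Python prototype only to make the moving parts concrete; the toy embedding function is a stand-in for whatever on-device TFLite embedder and LLM the Android app would eventually call:

```python
import math

# Tiny in-memory corpus standing in for the app's local documents.
DOCS = [
    "RAG combines document retrieval with text generation.",
    "TFLite runs machine-learning models on mobile devices.",
    "Android apps can ship quantized language models on-device.",
]

def embed(text):
    # Placeholder embedding (bag of letters); a real app would call a TFLite embedder.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query, k=2):
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query):
    # The assembled prompt (context + question) is what gets fed to the local LLM.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What does RAG do?"))
```

On Android, the same two steps would map to an on-device embedding model plus a local LLM runtime, with the prompt assembly essentially unchanged.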
-
Some background tasks, like thumbnail generation and face detection, use a lot of CPU time and, by extension, a lot of battery. Fotema should be able to suspend these tasks if the user's device is in lo…
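As a rough, Linux-specific illustration of the idea (not Fotema's actual implementation), a scheduler could check the power-supply state before starting heavy work:

```python
from pathlib import Path

def on_battery_power():
    """Best-effort check: returns True if a mains supply exists and reports offline.
    Reads the Linux sysfs power-supply entries; a desktop app would more likely
    use UPower over D-Bus, this only sketches the decision point."""
    for supply in Path("/sys/class/power_supply").glob("*"):
        type_file = supply / "type"
        online_file = supply / "online"
        if type_file.exists() and type_file.read_text().strip() == "Mains":
            return online_file.read_text().strip() == "0"
    return False  # no mains adapter found; assume it is safe to run

def maybe_run_background_tasks(run_thumbnails, run_face_detection):
    # Defer CPU-heavy work while on battery instead of running it immediately.
    if on_battery_power():
        print("On battery: deferring thumbnail generation and face detection.")
        return
    run_thumbnails()
    run_face_detection()
```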
-
### Prerequisites
- [X] Write a descriptive title.
- [X] Make sure you are able to repro it on the latest version.
- [X] Search the existing issues.
### Steps to reproduce
While using `OpenS…