-
Hi,
First of all, congratulations on the nice job you've done with this package😺
Secondly, I was wondering if you would be willing to accept an extension of linear quantization to support sign…
-
Do you have any additional suggestions on how to optimize performance apart from what you wrote in the article?
-
### Describe the issue
We are interested in your LongLLMLingua results on LongBench. We referred to these two parts of your code [https://github.com/microsoft/LLMLingua/blob/main/experiments/llmli…
-
### Describe the bug
I used the code in the README and also in the notebook.
Check the code below.
### Steps to reproduce
```python
from langchain_community.document_loaders import TextLo…
-
I am trying to save a quantized ternary model to a `.tflite` file, but larq doesn't seem to store the weights in reduced-precision datatypes, so the file size is not reduced.
However, after c…
-
Hello everyone, I saved an index to a local folder on my laptop so that I could load it again later. However, I am struggling to load it because it cannot find `docstore`, `index_store…
-
# Trending repositories for C#
1. [**AvaloniaUI / Avalonia**](https://github.com/AvaloniaUI/Avalonia)
__Develop Desktop, Embedded, Mobile and WebAssembly apps with C# and XAML. Th…
-
### Feature request
PagedAttention has become a mainstream optimization technique for LLM-based generation tasks. It is supported by many server engines, e.g., [vllm](https://github.co…
-
🚀 *Dynamic Personas support* 🌟
Requirements for the personas:
- Phase 1: Still client-side, but some support for server-side
- Dynamic persona support, remove hardcoding
- Stored in a state sto…
-
### OpenVINO Version
2024.4.0-16579-c3152d32c9c-releases/2024/4
### Operating System
Windows System
### Hardware Architecture
x86 (64 bits)
### Target Platform
Host Name: …