-
If you have plans to develop this project further, I would like to suggest a 4.6-bit scheme.
https://www.mdpi.com/2227-7390/12/5/651
I think this is an interesting scheme that fits very well on a…
-
The following is an example of a group quantized matmul found in Vicuna (pulled from https://github.com/nod-ai/SHARK/issues/1630, closely related to the i4 IR attached in #12859).
```
#map = affine_…
-
### Your current environment
```text
The output of `python collect_env.py`
```
root@9b33a89c3857:/workspace/vllm-0.4.2# python collect_env.py
Collecting environment information...
PyTorch versi…
-
How much GPU memory is needed to run the 13B model?
-
Hey guys
Nice work on the SEG-Y loader! Our team uses its own library to interact with SEG-Y data, so I decided to give MDIO a try and compare the results of multiple approaches and libra…
-
## Problem
We don't publish aarch64 Linux binaries, so right now we still install ao=0.1
```
(myvenv) marksaroufim@rpi5:~/Dev/ao $ pip install torchao
Looking in indexes: https://pypi.org/simpl…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
### Your current environment
```text
Collecting environment information...
[WARNING] Failed to create Level Zero tracer: 2013265921
WARNING 11-07 06:41:13 _logger.py:68] Failed to import from vllm…
-
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
-
All requests end with 'finish_reason': 'length' when the max_tokens=-1 parameter is set.
What could be the problem?
**Model**:
https://huggingface.co/IlyaGusev/saiga_mistral_7b_gguf/resolve/main/…