-
**What is the URL, file, or UI containing the proposed doc change?**
https://github.com/vllm-project/llm-compressor/tree/main/examples/quantization_w8a8_int8
**What is the current content or situation …
-
- [ ] Metadata extraction - HM
- [ ] Answering introductory questions - GK
- [x] Answering Open Chat questions - moved to genai-apps/aggrag#36
-
(glm-130b) ➜ GLM-130B git:(main) ✗ bash scripts/evaluate.sh tasks/bloom/glue_cola.yaml
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment …
-
Running `python generate.py --model_type chatglm --size 7` starts up fine; the corresponding chatglm lora_weights path inside it is hard-coded.
After entering an instruction, it fails with the following error:
Traceback (most recent call last):
File "/root/llm/Alpaca-CoT-main/generate.py", …
-
I think there are a few issues being conflated here and it would be helpful to disentangle them:
We support:
- launching with `accelerate launch`, which is only meant to support Data-parallel …
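For reference, a minimal data-parallel training sketch of the kind `accelerate launch` is meant to run could look like this; the model, optimizer, and dataset are placeholder stand-ins, not anything from this repo. Each process gets its own model replica and a shard of the dataloader.

```python
# Minimal data-parallel sketch for `accelerate launch` (placeholder model/data).
import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

accelerator = Accelerator()

model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
dataset = TensorDataset(torch.randn(128, 16), torch.randint(0, 2, (128,)))
loader = DataLoader(dataset, batch_size=8)

# prepare() wraps the model for DDP and shards the dataloader per process.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for x, y in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    accelerator.backward(loss)  # handles gradient sync across replicas
    optimizer.step()
```

Launched with something like `accelerate launch --num_processes 2 train.py`, each process runs this same script on its own GPU.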
-
[X] I checked the [documentation](https://docs.ragas.io/) and related resources and couldn't find an answer to my question.
https://docs.ragas.io/en/stable/concepts/metrics/critique.html
**Your …
-
Hi everyone,
I have a question about LLM attribution (how much each input word contributes), as in the picture. This is the perturbation-based attribution method. The basic idea is to replace the words one at a time, for example
`I love you` …
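To make that concrete, here is a rough sketch of the perturbation scheme described above; `score_fn` and `mask_token` are hypothetical stand-ins for whatever model scoring you use, not a specific library API.

```python
# Perturbation-based attribution sketch: mask one word at a time and treat the
# drop in the model's score as that word's contribution to the prediction.
from typing import Callable, List, Tuple

def perturbation_attribution(
    text: str,
    score_fn: Callable[[str], float],  # hypothetical: score of the original prediction
    mask_token: str = "[MASK]",
) -> List[Tuple[str, float]]:
    words = text.split()
    base_score = score_fn(text)
    scores = []
    for i, word in enumerate(words):
        perturbed = " ".join(words[:i] + [mask_token] + words[i + 1:])
        # A larger score drop means the masked word mattered more.
        scores.append((word, base_score - score_fn(perturbed)))
    return scores

# Toy scorer: only the word "love" matters, so it gets all the attribution.
print(perturbation_attribution("I love you", score_fn=lambda t: 1.0 if "love" in t else 0.0))
```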
-
I have been finding that llm-foundry/eval takes much longer than lm-evaluation-harness. After digging into the code, I found that padding tokens are appended up to the tokenizer's maximum length…
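For illustration only (this is not llm-foundry's actual eval code), the difference between padding every batch to the tokenizer's maximum length and padding only to the longest sequence in the batch looks roughly like this with a Hugging Face tokenizer; gpt2 is just an example model.

```python
# Illustration of the padding difference described above (not llm-foundry code).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
texts = ["short prompt", "a somewhat longer prompt in the same batch"]

# Pads every sequence to tokenizer.model_max_length (1024 for gpt2):
# almost all positions are pad tokens the model still has to process.
fixed = tokenizer(texts, padding="max_length", truncation=True, return_tensors="pt")

# Pads only to the longest sequence in the batch: far fewer wasted positions.
dynamic = tokenizer(texts, padding="longest", return_tensors="pt")

print(fixed["input_ids"].shape, dynamic["input_ids"].shape)
```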
-
- [ ] [MoAI/README.md at master · ByungKwanLee/MoAI](https://github.com/ByungKwanLee/MoAI/blob/master/README.md?plain=1)
# MoAI/README.md at master · ByungKwanLee/MoAI
## Description
![MoAI: Mixture…
-
### System Info
- GPU: L40S
- TensorRT-LLM: 0.11.0.dev2024060400
- CUDA: cuda_12.4.r12.4/compiler.34097967_0
- Driver: 535.129.03
- OS: DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS" (Docker)
-
### Who…