-
### Question
I modified eval.py from https://gist.github.com/haotian-liu/db6eddc2a984b4cbcc8a7f26fd523187, but I get this error.
How can I solve it? Thank you.
-
Take a look at the commit history related to implementing `n_gqa` and undo/remove everything related to it.
Checklist
- [X] `api/src/serge/utils/migrate.py` ✅ Commit [`ee41b29`](https:/…
-
Hello,
Have you checked what happens when `n_heads != n_kv_heads`? How does this affect the RoPE rotation and the MHA, which now becomes GQA?
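For context, a minimal NumPy sketch of how GQA behaves when `n_heads != n_kv_heads` (illustrative only, not taken from the repo in question): each group of query heads shares one KV head, typically by repeating the KV heads before the attention product, while RoPE is applied per head beforehand and is unaffected by the repeat.

```
import numpy as np

def repeat_kv(x, n_rep):
    # (n_kv_heads, seq, head_dim) -> (n_kv_heads * n_rep, seq, head_dim)
    return np.repeat(x, n_rep, axis=0)

n_heads, n_kv_heads, seq, head_dim = 8, 2, 4, 16
q = np.random.randn(n_heads, seq, head_dim)
k = np.random.randn(n_kv_heads, seq, head_dim)
v = np.random.randn(n_kv_heads, seq, head_dim)

# each group of n_heads // n_kv_heads query heads shares one KV head
k_rep = repeat_kv(k, n_heads // n_kv_heads)
v_rep = repeat_kv(v, n_heads // n_kv_heads)

scores = q @ k_rep.transpose(0, 2, 1) / np.sqrt(head_dim)
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)
out = weights @ v_rep  # same output shape as plain MHA
```

With `n_kv_heads == n_heads` this degenerates to standard MHA, which is why the case `n_heads != n_kv_heads` is the one worth testing.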
-
Hi Rajat, Suprosanna and Volker,
I'm trying to use your code for RTN scene graph generation on GQA. Specifically, I'm looking at the gqa_1.4 branch, but I didn't see a `requirement.txt` there.
A…
-
When I try to run inference using **APE-Ti**, I get the following error:
```
Traceback (most recent call last):
File "/home/jupyter/TIL-2024/vlm/train/APE/demo/demo_lazy.py", line 134, in
d…
-
Thanks for the repo
On trying to evaluate the model using `python main.py --expName "gqaExperiment" --finalTest --testedNum 1000 --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt`, it a…
-
```
[rank0]: Traceback (most recent call last):
[rank0]: File "Pai-Megatron-Patch-0925/toolkits/model_checkpoints_convertor/qwen/hf2mcore_qwen2_dense_and_moe_gqa.py", line 924, in
[rank0]: m…
-
### Describe the issue
I'm trying to quantize a model to int4, but this file only provides weight-only quantization. Can I quantize both the weights and the activations to int4?
https://github.com/micros…
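For reference, a minimal sketch of what weight-and-activation int4 quantization involves, independent of the file linked above (illustrative names and per-tensor scales; real kernels usually use per-channel or per-group scales):

```
import numpy as np

def quantize_int4(x):
    # symmetric per-tensor quantization to the int4 range [-8, 7]
    scale = np.abs(x).max() / 7.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

w = np.random.randn(64, 64).astype(np.float32)
a = np.random.randn(8, 64).astype(np.float32)

qw, sw = quantize_int4(w)   # weight quantization (weight-only stops here)
qa, sa = quantize_int4(a)   # activation quantization (the part being asked about)

# integer matmul, rescaled back to float with the two scales
y = (qa.astype(np.int32) @ qw.T.astype(np.int32)) * (sa * sw)
```

The difference from weight-only quantization is that the activations must be quantized at runtime, so the matmul itself can run on integers rather than dequantizing the weights back to float.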
-
I am trying to execute the following script:
```
from llama_cpp import Llama
llm = Llama(model_path="~/llama-2-7b.ggmlv3.q8_0.bin", n_gqa=8)
output = llm("Q: Name the planets in the solar sy…
```
-
No GQA implementation is found, so the model cannot scale to 70B for composerLLAMA.
Maybe we need to design GQA and introduce `head_z` for `wq` and `head_z_kv` for `wk` and `wv`?
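A sketch of what that suggestion could look like (illustrative only, assuming a standard GQA weight layout where `wq` stacks `n_heads` head blocks and `wk`/`wv` stack `n_kv_heads` blocks): `head_z` masks the query heads, and a separate `head_z_kv` masks the shared KV heads, so pruning a KV head consistently removes its whole group.

```
import numpy as np

n_heads, n_kv_heads, head_dim, d_model = 8, 2, 16, 128

# hypothetical pruning masks: one entry per query head, one per KV head
head_z = np.ones(n_heads)        # masks wq
head_z_kv = np.ones(n_kv_heads)  # masks wk and wv jointly

head_z[[5, 7]] = 0.0  # prune two query heads
head_z_kv[1] = 0.0    # prune one KV head (shared by a whole query group)

wq = np.random.randn(n_heads * head_dim, d_model)
wk = np.random.randn(n_kv_heads * head_dim, d_model)
wv = np.random.randn(n_kv_heads * head_dim, d_model)

# apply the masks by zeroing the rows belonging to pruned heads
wq_masked = wq * np.repeat(head_z, head_dim)[:, None]
wk_masked = wk * np.repeat(head_z_kv, head_dim)[:, None]
wv_masked = wv * np.repeat(head_z_kv, head_dim)[:, None]
```

The key design point is that `wk` and `wv` share one mask: with GQA, zeroing a KV head only makes sense if both its key and value projections go together.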