-
reference: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/blogs/quantization-in-TRT-LLM.md#performance
![image](https://github.com/user-attachments/assets/1bb20225-3eb2-4641-b5ba-f027…
-
6b.eleuther.ai mystic model is down for GPT-J-6B.
-
**Describe the bug**
Initializing an InferenceEngine for GPT-J fails with the following error:
```
RuntimeError: view size is not compatible with input tensor's size and stride (at
least one d…
-
Running `scripts/inference.py` throws the following error:
```
return forward_call(*args, **kwargs)
File "/home/gongai/projects/wgong/PromptFix/./stable_diffusion/ldm/modules/attention.py", li…
-
We found that [evaluation.py](https://github.com/mlcommons/inference/blob/master/language/gpt-j/evaluation.py) is not deterministic.
I narrowed it down to a small, fast reproducer using 100 examples …
-
The script [https://github.com/mlcommons/inference/tree/master/language/gpt-j](https://github.com/mlcommons/inference/tree/master/language/gpt-j) refers to [https://github.com/badhri-intel/inference](…
-
Feature request
Nous Research and EleutherAI have released the YaRN model, which comes in two versions with context sizes of 64k and 128k. This model utilizes RoFormer-style embeddings, distinguishin…
-
![~((61ETO2})P`HK 3}M0$YO](https://github.com/RVC-Boss/GPT-SoVITS/assets/1711794/8b3d76e9-5960-4970-a955-f6bbdf02ec12)
![JUA`_U3E4O HTFKXZ)_5C2F](https://github.com/RVC-Boss/GPT-SoVITS/assets/17117…
-
Hello
Are there any plans to add a `no_repeat_ngram_size` parameter for generation with GPT-J?
https://github.com/NVIDIA/FasterTransformer/issues/348#tasklist-block-a127ad7d-3bce-468d-8bd2-5756e5ab9a…
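For context, here is a minimal sketch of what a `no_repeat_ngram_size` constraint usually means during decoding, following the semantics of the parameter of the same name in Hugging Face `generate`. It is not FasterTransformer code, and the helper name `banned_next_tokens` is hypothetical. Given the tokens generated so far, it returns the token ids that would complete an n-gram that has already appeared; a sampler would then mask those ids before picking the next token.

```python
def banned_next_tokens(tokens, no_repeat_ngram_size):
    """Return token ids that would repeat an n-gram already present in `tokens`.

    Hypothetical helper for illustration only; not part of any library.
    """
    n = no_repeat_ngram_size
    # Nothing can be banned if the constraint is off or the history is too short.
    if n <= 0 or len(tokens) < n - 1:
        return set()
    # The last n-1 tokens are the prefix that the next token would extend.
    prefix = tuple(tokens[len(tokens) - (n - 1):]) if n > 1 else ()
    banned = set()
    # Scan every complete n-gram in the history; if its first n-1 tokens match
    # the current prefix, its last token must not be generated again.
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


if __name__ == "__main__":
    history = [5, 6, 7, 5, 6]
    print(banned_next_tokens(history, 3))
```

With `no_repeat_ngram_size=3` and the history `[5, 6, 7, 5, 6]`, the trigram `(5, 6, 7)` has already occurred, so token `7` is banned as the next token.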
-
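It looks similar to the solution and I can't see anything wrong with it, but I get an index error, and I still don't understand it after reading the error description. GPT isn't working for this either. Please help.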
```python
import numpy as np

def MxMmatrix(M, r):
    return np.random.randn(M, r) @ np.random.randn(r, M)
r_step = range(2,1…