PygmalionAI/aphrodite-engine
Large-scale LLM inference engine
https://aphrodite.pygmalion.chat
GNU Affero General Public License v3.0
1.14k stars, 127 forks
issues
Newest
feat: multi-step scheduling
#831
AlpinDale
closed
13 hours ago
0
fix: unbound tokenizer error
#830
AlpinDale
closed
16 hours ago
0
feat: add metrics for prefix cache hit rate
#829
AlpinDale
closed
2 days ago
0
feat: add cuda sampling kernels for top_k and top_p
#828
AlpinDale
closed
2 days ago
0
feat: Add DRY (Do not Repeat Yourself) sampling
#827
selalipop
opened
2 days ago
10
fix: sampler test with new transformers version
#826
AlpinDale
closed
2 days ago
0
feat: implement top-nsigma sampling method
#825
AlpinDale
closed
2 days ago
7
SPMD optimizations
#824
AlpinDale
closed
2 days ago
0
feat: support chunked prefill with LoRA
#823
AlpinDale
closed
4 days ago
0
feat: add chat method for LLM class
#822
AlpinDale
closed
4 days ago
0
fix: tokenization api test
#821
AlpinDale
closed
4 days ago
0
[Tracker]: Passing all unit tests
#820
AlpinDale
opened
4 days ago
0
build(deps): bump cross-spawn from 7.0.3 to 7.0.5 in /docs
#819
dependabot[bot]
opened
4 days ago
0
fix: --max-seq-len-to-capture arg
#818
AlpinDale
closed
5 days ago
0
Some fixes
#817
Naomiusearch
opened
6 days ago
0
[Bug]: Argument --max-seq_len-to-capture not recognized
#816
Nero10578
closed
5 days ago
1
[Installation]: Cannot find CUDA_TOOLKIT_ROOT_DIR while trying to build for ROCm
#815
RuntimeRacer
opened
1 week ago
1
fix: temperature issues
#814
50h100a
closed
1 week ago
0
Mask dynatemp using min/max, rather than exp
#813
50h100a
closed
1 week ago
0
[Usage]: Aphrodite Engine: KV Cache Context Length Issue with Quantized Models
#812
murtaza-nasir
closed
1 week ago
1
feat: add Tencent Hunyuan model support
#811
AlpinDale
opened
1 week ago
0
[Bug]: v0.6.3(.post1?) regression
#810
dirkson
opened
1 week ago
0
[Bug]: 0.6.3.post1 regression: RuntimeError during mem profiling on Mistral Large AWQ with `-q awq_marlin`
#809
khanonnie
opened
2 weeks ago
2
feat: update to serviceinfo v0.2
#808
AlpinDale
closed
2 weeks ago
0
feat: add serviceinfo endpoint
#807
AlpinDale
closed
2 weeks ago
0
[Misc]: log input and output
#806
Eve-146T
opened
2 weeks ago
0
frontend: add an `ai-plugin.json` route
#805
AlpinDale
closed
2 weeks ago
1
[Bug]: .\gguf_to_torch.py broken along with direct load GGUF
#804
sorasoras
opened
2 weeks ago
2
frontend: enable kobold api by default
#803
AlpinDale
closed
2 weeks ago
0
[Bug]: The documentation page is down and empty
#802
puppetm4st3r
opened
2 weeks ago
5
ci: bump to 0.6.3.post1
#801
AlpinDale
closed
2 weeks ago
0
fix: compilation of gptq_marlin_gemm object
#800
AlpinDale
closed
2 weeks ago
0
ci: bump version to 0.6.3
#799
AlpinDale
closed
2 weeks ago
0
feat: add TP support for bitsandbytes
#798
AlpinDale
opened
2 weeks ago
0
fix: kobold lite embedded UI on windows
#797
AlpinDale
closed
2 weeks ago
0
build(deps): bump rollup from 4.21.0 to 4.24.3 in /docs
#796
dependabot[bot]
closed
2 weeks ago
0
feat: add HQQ quantization support
#795
AlpinDale
closed
2 weeks ago
0
fix: windows wheel url
#794
AlpinDale
closed
2 weeks ago
0
[Usage]: Distributed Inference Without Docker.
#793
Abdulhanan535
opened
3 weeks ago
3
[New Method]: VPTQ, Vector Post-Training Quantization
#792
YangWang92
opened
3 weeks ago
2
[Installation]: Unable to make openvino / CPU install from source work: "Failed to import from aphrodite._C with No module named 'aphrodite._C'"
#791
bolaft
opened
3 weeks ago
0
feat: windows support
#790
AlpinDale
closed
2 weeks ago
8
[Bug]: unable to load 14B Qwen2.5 GGUF with newest version (0.6.2.post1)
#789
NeoChen1024
opened
4 weeks ago
1
[Bug]: strange repetition issue
#788
ehartford
opened
1 month ago
6
frontend: minor logging improvements
#787
AlpinDale
closed
2 weeks ago
0
[Bug]: Several errors when deploying GGUF models
#786
musoles
opened
1 month ago
0
Stream models rather than load them completely into RAM.
#785
50h100a
closed
1 month ago
2
[Installation]: FYI: they fixed the stupid conda pytorch-cuda=12.4 / cuda 12.4.1 strict dependency issue
#784
BlairSadewitz
opened
1 month ago
0
[Bug]: Impossible dependency requirement with GGUF
#783
musoles
opened
1 month ago
0
[Bug]: Metrics incorrect when having zero throughput
#782
mrseeker
opened
1 month ago
0